- Music and Audio Processing
- Speech Recognition and Synthesis
- Speech and Audio Processing
- Genetic Mapping and Diversity in Plants and Animals
- Bioinformatics and Genomic Networks
- Single-cell and spatial transcriptomics
- Genetic Associations and Epidemiology
- Olfactory and Sensory Function Studies
- Genomics and Chromatin Dynamics
- Gene expression and cancer classification
- CRISPR and Genetic Engineering
- ECG Monitoring and Analysis
- Genetics and Plant Breeding
- Phonocardiography and Auscultation Techniques
- Cell Image Analysis Techniques
- Diverse Approaches in Healthcare and Education Studies
- Technology and Data Analysis
- Genetic and phenotypic traits in livestock
- Educational Systems and Policies
- RNA and protein synthesis mechanisms
- RNA Research and Splicing
- Data Visualization and Analytics
- Nasal Surgery and Airway Studies
- Leaf Properties and Growth Measurement
- Advanced Chemical Sensor Technologies
Chung-Ang University
2020-2025
University of Wisconsin–Madison
2010-2022
Minneapolis Heart Institute Foundation
2017-2019
University of Minnesota
2015-2019
Samsung (South Korea)
2017-2019
The single cell RNA sequencing (scRNA-seq) technique began a new era by allowing the observation of gene expression at the single cell level. However, there is also a large amount of technical and biological noise. Because of the low number of RNA transcriptomes and the stochastic nature of the gene expression pattern, there is a high chance of missing nonzero entries as zero, which are called dropout events. We develop DrImpute to impute dropout events in scRNA-seq data. We show that DrImpute has significantly better performance on the separation of dropout zeros from true zeros than existing imputation...
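As a rough illustration of what imputing dropout events can mean, the sketch below replaces zero entries with the mean of the same gene across cells that share a group label. This is only one simple idea under stated assumptions, not necessarily how DrImpute works: the function name `impute_zeros_by_group`, the toy matrix, and the assumption that group labels are already known are all hypothetical.

```python
import numpy as np

def impute_zeros_by_group(expr, groups):
    """Replace zeros with the per-gene mean over cells sharing a group label.

    expr   : (genes, cells) array of non-negative expression values
    groups : (cells,) array of group labels (assumed to be known here)
    """
    expr = np.asarray(expr, dtype=float)
    groups = np.asarray(groups)
    imputed = expr.copy()
    for g in np.unique(groups):
        cols = np.where(groups == g)[0]
        block = expr[:, cols]
        # Per-gene mean within the group; zero entries count toward the mean.
        gene_means = block.mean(axis=1, keepdims=True)
        # Only zero entries are overwritten; observed nonzero values are kept.
        imputed[:, cols] = np.where(block == 0, gene_means, block)
    return imputed

# Toy data: 2 genes x 4 cells, the first three cells form one group.
expr = np.array([[4.0, 0.0, 4.0, 0.0],
                 [0.0, 0.0, 0.0, 6.0]])
groups = np.array([0, 0, 0, 1])
imputed = impute_zeros_by_group(expr, groups)
```

In this toy case the zero for gene 0 in the second cell is filled from its group neighbours, while gene 1 stays at zero in the first group because no cell in that group expresses it, which is the kind of entry one would want to treat as a true zero.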
Automated image acquisition, a custom analysis algorithm, and a distributed computing resource were used to add time as a third dimension to a quantitative trait locus (QTL) map for plant root gravitropism, a model growth response to an environmental cue. Digital images of Arabidopsis thaliana seedling roots from two independently reared sets of 162 recombinant inbred lines (RILs) and one set of 92 near isogenic lines (NILs) derived from a Cape Verde Islands (Cvi) × Landsberg erecta (Ler) cross were collected automatically...
The recent advent of CRISPR and other molecular tools enabled the reconstruction of cell lineages based on induced DNA mutations and promises to solve the ones of more complex organisms. To date, no lineage reconstruction algorithms have been rigorously examined for their performance and robustness across dataset types and number of cells. To benchmark such methods, we decided to organize a DREAM challenge using in vitro experimental intMEMOIR recordings and in silico data for a C. elegans lineage tree of about 1,000 cells and a Mus musculus tree of 10,000 cells. Some 22...
Background: Gene- and pathway-based analyses offer a useful alternative and complement to the usual single SNP-based analysis for GWAS. On the other hand, most existing gene- and pathway-based tests are not highly adaptive, and/or require the availability of individual-level genotype and phenotype data. It would be desirable to have highly adaptive tests applicable to summary statistics for single SNPs. This has become increasingly important given the popularity of large-scale meta-analyses of multiple GWASs and the practical availability of summary statistics from either a single GWAS or a meta-analyzed GWAS. Results:...
To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they...
The breakthrough in high-throughput measurement of the cis-regulatory activity of millions of randomly generated promoters provides an unprecedented opportunity to systematically decode the cis-regulatory logic that determines expression values. We developed an end-to-end transformer encoder architecture named Proformer to predict expression values from DNA sequences. Proformer used a Macaron-like Transformer encoder architecture, where two half-step feed forward (FFN) layers were placed at the beginning and the end of each encoder block, and a separable 1D convolution layer...
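The Macaron-like layout described above, with a half-step FFN before and after the attention sub-layer, can be sketched in plain NumPy. This is a minimal illustration under assumptions, not Proformer itself: it uses single-head attention, pre-layer normalization without learned scale or bias, hypothetical parameter names, and it leaves out the separable 1D convolution because the abstract is cut off before describing where it sits.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    # Normalize each position over the feature axis (no learned scale/bias).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ffn(x, w_in, w_out):
    # Position-wise feed forward layer with a ReLU in the middle.
    return np.maximum(layer_norm(x) @ w_in, 0.0) @ w_out

def self_attention(x, wq, wk, wv):
    # Single-head scaled dot-product self-attention over the sequence.
    h = layer_norm(x)
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def macaron_block(x, p):
    # Half-step FFN, then attention, then a second half-step FFN,
    # each wrapped in a residual connection.
    x = x + 0.5 * ffn(x, p["f1_in"], p["f1_out"])
    x = x + self_attention(x, p["wq"], p["wk"], p["wv"])
    x = x + 0.5 * ffn(x, p["f2_in"], p["f2_out"])
    return x

d_model, d_hidden, seq_len = 8, 16, 5  # illustrative sizes only
shapes = {
    "f1_in": (d_model, d_hidden), "f1_out": (d_hidden, d_model),
    "wq": (d_model, d_model), "wk": (d_model, d_model), "wv": (d_model, d_model),
    "f2_in": (d_model, d_hidden), "f2_out": (d_hidden, d_model),
}
params = {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}
zero_params = {name: np.zeros(shape) for name, shape in shapes.items()}

x = rng.normal(size=(seq_len, d_model))  # stand-in for embedded DNA positions
y = macaron_block(x, params)
y_identity = macaron_block(x, zero_params)
```

The factor 0.5 is what makes each FFN a half step: the two together contribute roughly one full feed forward update per block. With all weights set to zero every sub-layer outputs zero, so the residual connections pass the input through unchanged.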
Chromatin looping allows enhancer-bound regulatory factors to influence transcription. Large domains, referred to as topologically associated domains, participate in genome organization. However, the mechanisms underlying interactions within these domains, which control gene expression, are not fully understood. Here we report that activation of embryonic myogenesis is associated with the establishment of long-range chromatin interactions centered on Pax3-bound loci. Using mass spectrometry and genomic studies, we identify the ubiquitously...
Big data have revolutionized the way data are processed and used across all fields. In the past, research was primarily conducted with a focus on hypothesis confirmation using sample data. However, in the era of big data, this has shifted to gaining insights from the collected data. Visualizing vast amounts of data to derive insights is crucial. For instance, leveraging big data for visualization can help identify and predict characteristics and patterns related to various infectious diseases. When data are presented in a visual format, patterns within the data become clear, making it...
Most statistical methods for quantitative trait loci (QTL) mapping focus on a single phenotype. However, multiple phenotypes are commonly measured, and recent technological advances have greatly simplified the automated acquisition of numerous phenotypes, including function-valued phenotypes, such as growth measured over time. While methods exist for QTL mapping with function-valued phenotypes, they are generally computationally intensive and focus on single-QTL models. We propose two simple, fast methods that maintain high power and precision and are amenable to extensions...
We previously proposed a simple regression-based method to map quantitative trait loci underlying function-valued phenotypes. In order to better handle the case of noisy phenotype measurements and accommodate the correlation structure among time points, we propose an alternative approach that maintains much of the simplicity and speed of the regression-based method. We overcome noisy measurements by replacing the observed data with a smooth approximation. We then apply functional principal component analysis, reducing the smoothed data to a small number of principal components...
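The two steps named above, smoothing followed by a principal component reduction, might look roughly like the following. This is a sketch under assumptions rather than the authors' procedure: a cubic polynomial fit stands in for the smooth approximation, ordinary PCA on the smoothed curves stands in for functional PCA, and the simulated growth curves and all function names are hypothetical.

```python
import numpy as np

def smooth_curves(times, Y, degree=3):
    """Least-squares polynomial smoothing of each row of Y (individuals x timepoints)."""
    basis = np.vander(times, degree + 1)                 # (timepoints, degree + 1)
    coef, *_ = np.linalg.lstsq(basis, Y.T, rcond=None)   # (degree + 1, individuals)
    return (basis @ coef).T

def principal_components(Y_smooth, n_components=2):
    """PCA via SVD of the column-centered curves."""
    centered = Y_smooth - Y_smooth.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]      # per-individual summaries
    explained = s**2 / np.sum(s**2)                      # variance fraction per component
    return scores, Vt[:n_components], explained[:n_components]

# Simulated noisy phenotype curves: 50 individuals measured at 20 time points.
rng = np.random.default_rng(1)
times = np.linspace(0.0, 1.0, 20)
slope = rng.normal(1.0, 0.3, size=(50, 1))
curve = rng.normal(0.5, 0.2, size=(50, 1))
Y = slope * times + curve * times**2 + rng.normal(0.0, 0.05, size=(50, 20))

Y_smooth = smooth_curves(times, Y)
scores, components, explained = principal_components(Y_smooth)
```

Each individual's curve is summarized by a short row of `scores`, so later analysis can work with a handful of numbers per individual instead of all 20 time points.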
DCLEAR is an R package used for single cell lineage reconstruction. The advances of CRISPR-based gene editing technologies have enabled the prediction of cell lineage trees based on observed edited barcodes from each cell. However, the performance of existing reconstruction methods was not assessed until recently. In response to this problem, the Allen Institute hosted the Cell Lineage Reconstruction DREAM Challenge in 2020 to crowdsource relevant knowledge from across the world. Our team won sub-challenges 2 and 3 in the challenge...
This paper focuses on the transition of automatic speaker verification systems from time delay neural networks (TDNN) to ResNet-based networks. TDNN-based systems use a statistics pooling layer to aggregate temporal information, which is suitable for two-dimensional tensors. Even though ResNet-based models produce three-dimensional tensors, they continue to incorporate the statistics pooling layer. However, the reduction in spatial dimensions in ResNet due to convolution operations, including along the temporal axis, raises concerns about the loss of temporal information and its compatibility with...
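For readers unfamiliar with statistics pooling, a minimal NumPy sketch follows. It computes the mean and standard deviation over the time axis and concatenates them into one fixed-length vector. The three-dimensional variant flattens channels and frequency before pooling, which is a common convention assumed here for illustration, not something the abstract specifies; function names and tensor sizes are hypothetical.

```python
import numpy as np

def statistics_pooling(frames):
    """Pool a (features, time) tensor into a fixed-length vector.

    Returns the per-feature mean and standard deviation over time,
    concatenated into a vector of length 2 * features.
    """
    mean = frames.mean(axis=-1)
    std = frames.std(axis=-1)
    return np.concatenate([mean, std])

def statistics_pooling_3d(feature_map):
    """Pool a (channels, frequency, time) tensor by flattening the first two axes.

    Assumption: channels and frequency bins are merged into one feature axis
    so the two-dimensional pooling above can be reused unchanged.
    """
    c, f, t = feature_map.shape
    return statistics_pooling(feature_map.reshape(c * f, t))

# TDNN-style output: 3 features over 4 frames.
tdnn_out = np.arange(12, dtype=float).reshape(3, 4)
pooled_2d = statistics_pooling(tdnn_out)

# ResNet-style output: 2 channels, 3 frequency bins, 4 frames.
resnet_out = np.arange(24, dtype=float).reshape(2, 3, 4)
pooled_3d = statistics_pooling_3d(resnet_out)
```

The pooled length depends only on the number of features, not on the number of frames, which is why the layer can summarize utterances of any duration. It also means that whatever temporal resolution the network has already discarded before pooling cannot be recovered by it.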
The transcriptional mechanisms driving lineage specification during development are still largely unknown, as the interplay of multiple transcription factors makes it difficult to dissect these molecular events. Using a cell-based differentiation platform to probe transcription function, we investigated the role of the key paraxial mesoderm and skeletal myogenic commitment factors: mesogenin 1 (Msgn1), T-box 6 (Tbx6), forkhead box C1 (Foxc1), paired box 3 (Pax3), Paraxis, mesenchyme homeobox 1 (Meox1), sine oculis-related...
Acoustic scene classification (ASC) categorizes an audio file based on the environment in which it has been recorded. This has long been studied in the detection and classification of acoustic scenes and events (DCASE). This paper presents the solution to Task 1 of the DCASE 2020 challenge submitted by the Chung-Ang University team. Task 1 addressed two challenges that ASC faces in real-world applications. One is that audio recorded using different recording devices should be classified in general, and the other is that the model used should have low complexity. We proposed models to overcome...
Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to...
Researchers are employing deep learning (DL) in many fields, and the scope of its application is expanding. However, because understanding the rationale and validity of DL decisions is difficult, a DL model is occasionally called a black-box model. Here, we focus on a DL-based explainable time-series prediction model. We propose a model based on long short-term memory (LSTM) followed by a convolutional neural network (CNN) with a residual connection, referred to as LSTM-resCNN. In comparison to one-dimensional CNN, bidirectional LSTM,...
The goal of the "2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge" (ASVspoof) was to make it easier to create systems that could identify voice spoofing attacks with high levels of accuracy. However, model complexity and latency requirements were not emphasized in the competition, despite the fact that they are stringent requirements for implementation in the real world. The majority of the top-performing solutions from the competition used an ensemble technique that merged numerous sophisticated deep learning models to maximize...
Conventional spoofing detection systems have heavily relied on the use of handcrafted features derived from speech data. However, a notable shift has recently emerged towards the direct utilization of raw speech waveforms, as demonstrated by methods like SincNet filters. This shift underscores the demand for more sophisticated audio sample features. Moreover, the success of deep learning models, particularly those utilizing the large pretrained wav2vec 2.0 as a featurization front-end, highlights the importance of refined feature...
A systematic evaluation of how model architectures and training strategies impact genomics model performance is needed. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. All top-performing models used neural networks but diverged in architectures and training strategies. To dissect...