- Single-cell and spatial transcriptomics
- Epigenetics and DNA Methylation
- Genomics and Chromatin Dynamics
- Gene Regulatory Network Analysis
- Renal and related cancers
- Birth, Development, and Health
- Genomics and Rare Diseases
- RNA Research and Splicing
- Gene expression and cancer classification
- Genetic and Kidney Cyst Diseases
- T-cell and B-cell Immunology
- Genetics, Bioinformatics, and Biomedical Research
- Immune Cell Function and Interaction
- Genomic variations and chromosomal abnormalities
- Cancer-related molecular mechanisms research
- Genetic Associations and Epidemiology
- Evolution and Genetic Dynamics
- Renal Diseases and Glomerulopathies
- Cell Image Analysis Techniques
- Genomics and Phylogenetic Studies
University of California, Berkeley
2022-2024
Berkeley College
2024
University of California, San Francisco
2023
Columbia University
2016-2018
Memorial Sloan Kettering Cancer Center
2018
Genomic deep learning models can predict genome-wide epigenetic features and gene expression levels directly from DNA sequence. While current models perform well at predicting gene expression levels across genes in different cell types from the reference genome, their ability to explain expression variation between individuals due to cis-regulatory genetic variants remains largely unexplored. Here, we evaluate four state-of-the-art models on paired personal genome and transcriptome data and find limited performance when explaining variation in expression across individuals. In...
Background: A number of deep learning models have been developed to predict epigenetic features such as chromatin accessibility from DNA sequence. Model evaluations commonly report performance genome-wide; however, cis-regulatory elements (CREs), which play critical roles in gene regulation, make up only a small fraction of the genome. Furthermore, cell type-specific CREs contain a large proportion of complex disease heritability. Results: We evaluate genomic deep learning models in chromatin accessibility regions with varying degrees of cell type...
Single-cell RNA-sequencing is revolutionizing biological discovery. However, scRNA-seq technologies suffer from many sources of significant technical noise, the most prominent being undersampling of mRNA molecules, often termed 'dropout'. Dropout can severely obscure important gene-gene relationships and impedes the possibility of learning gene regulatory networks at single-cell resolution. To address this, we developed MAGIC (Markov Affinity-based Graph Imputation of Cells), a computational approach...
Kidney disease is highly heritable; however, the causal genetic variants, the cell types in which these variants function, and the molecular mechanisms underlying kidney disease remain largely unknown. To identify genetic loci affecting kidney function, we performed a GWAS using multiple kidney function biomarkers and identified 462 loci. To begin to investigate how these loci affect kidney function, we generated single-cell chromatin accessibility (scATAC-seq) maps of the human kidney and identified candidate...
Multiplexed single-cell sequencing (mux-seq) using single-nucleotide polymorphisms (SNPs) has emerged as an efficient approach to perform expression quantitative trait loci (eQTL) studies that map interactions between genetic variants and cell types, states, or experimental perturbations. Here we introduce the clue framework, a novel approach to encode mux-seq experiments that eliminates the need for reference genotypes and barcoding. The framework is made possible by the development of freemuxlet, an algorithm...
Gene expression levels can vary substantially across cells, even in a seemingly homogeneous cell population. Identifying the relationships between genetic variation and gene expression is critical for understanding the mechanisms of genome regulation. However, the genetic control of gene expression variability among cells within individuals has yet to be extensively examined. This is primarily due to statistical challenges, such as the need for sufficiently powered cohorts and adjusting for mean-variance dependence. Here, we introduce MEOTIVE...
The majority of genetic variants identified in genome-wide association studies of complex traits are non-coding, and characterizing their function remains an important challenge in human genetics. Genomic deep learning models have emerged as a promising approach to enable in silico prediction of variant effects. These include supervised sequence-to-activity models, which predict chromatin states or gene expression levels directly from DNA sequence, and self-supervised genomic language models. Here, we...
Genomic sequence-to-activity models are increasingly utilized to understand gene regulatory syntax and probe the functional consequences of variation. Current models make accurate predictions of relative activity levels across the human reference genome, but their performance is more limited for predicting the effects of genetic variants, such as explaining gene expression variation across individuals. To better understand the causes of these shortcomings, we examine the uncertainty in the predictions of genomic sequence-to-activity models using an ensemble of Basenji2 model...