- Single-cell and spatial transcriptomics
- Genomics and Chromatin Dynamics
- Gene expression and cancer classification
- Bioinformatics and Genomic Networks
- Epigenetics and DNA Methylation
- Cancer Genomics and Diagnostics
- RNA modifications and cancer
- Gene Regulatory Network Analysis
- Immune Cell Function and Interaction
- T-cell and B-cell Immunology
- Genetic Associations and Epidemiology
- RNA Research and Splicing
- Genetic Mapping and Diversity in Plants and Animals
- Interferon and immune responses
- Genetic Syndromes and Imprinting
- Cancer-related molecular mechanisms research
- CAR-T cell therapy research
- CRISPR and Genetic Engineering
- Pluripotent Stem Cells Research
- Neuroinflammation and Neurodegeneration Mechanisms
- Congenital heart defects research
- SARS-CoV-2 and COVID-19 Research
- Lung Cancer Research Studies
- Genomics and Rare Diseases
- High Altitude and Hypoxia
Clemson University
2020-2024
Center for Human Genetics
2020-2024
Stanford University
2017-2022
Greenwood Genetic Center
2021
Pennsylvania State University
2020
National Center for Mathematics and Interdisciplinary Sciences
2012-2019
Academy of Mathematics and Systems Science
2012-2019
Chinese Academy of Sciences
2012-2019
University of Chinese Academy of Sciences
2017
Significance: Chromatin plays a critical role in the regulation of gene expression. Interactions among chromatin regulators, sequence-specific transcription factors, and cis-regulatory sequence elements are the main driving forces shaping context-specific chromatin structure and gene expression. However, because of the large number of such interactions, direct data on them are often missing in most cellular contexts. The purpose of the present work is to show that, by modeling matched expression and accessibility data across diverse cellular contexts, it is possible...
Significance: Biological samples are often heterogeneous mixtures of different types of cells. Suppose we have two single-cell datasets, each providing information on a different cellular feature and generated on a different sample from this mixture. Then, the clustering of cells in the two samples should be coupled, as both clusterings are reflecting the underlying cell types in the same mixture. This “coupled clustering” problem is a new problem not covered by existing methods. In this paper, we develop an approach for its solution based on the coupling of two nonnegative matrix factorizations. The...
Abstract: Existing methods for gene regulatory network (GRN) inference rely on gene expression data alone or on lower resolution bulk data. Despite the recent integration of chromatin accessibility and RNA sequencing data, learning complex mechanisms from limited independent data points still presents a daunting challenge. Here we present LINGER (Lifelong neural network for gene regulation), a machine-learning method to infer GRNs from single-cell paired gene expression and chromatin accessibility data. LINGER incorporates atlas-scale external bulk data across diverse cellular contexts and prior...
Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular-level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin...
In both Turner syndrome (TS) and Klinefelter syndrome (KS), copy number aberrations of the X chromosome lead to various developmental symptoms. We report a comparative analysis of TS vs. KS regarding differences at the genomic network level, measured in primary samples by analyzing gene expression, DNA methylation, and chromatin conformation. X-chromosome inactivation (XCI) silences transcription from one X chromosome in female mammals, on which most genes are inactive, and some genes escape XCI. In TS, almost all differentially expressed...
Abstract: High-altitude adaptation of Tibetans represents a remarkable case of natural selection during recent human evolution. Previous genome-wide scans found many non-coding variants under selection, suggesting a pressing need to understand the functional role of non-coding regulatory elements (REs). Here, we generate time courses of paired ATAC-seq and RNA-seq data on cultured HUVECs under hypoxic and normoxic conditions. We further develop a variant interpretation methodology (vPECA) to identify active selected REs (ASREs)...
Characterizing and interpreting heterogeneous mixtures at the cellular level is a critical problem in genomics. Single-cell assays offer an opportunity to resolve cellular-level heterogeneity, e.g., scRNA-seq enables single-cell expression profiling, and scATAC-seq identifies active regulatory elements. Furthermore, while scHi-C can measure the chromatin contacts (i.e., loops) between regulatory elements and target genes in single cells, bulk HiChIP can measure such contacts at a higher resolution. In this work, we introduce DC3 (De-Convolution...
Significance: T cell exhaustion is a major barrier to cancer immunotherapy. T cell exhaustion is the state of T cell dysfunction after chronic stimulation, and recent studies indicate that exhaustion is epigenetically controlled and associated with unique chromatin profiles. This work reports the genome-wide map of active DNA regulatory elements and their connection to genes in a time course of human chimeric antigen receptor T cell exhaustion. Early events establish the later program of gene expression, are often human-specific, and exhaustion-associated enhancers can be...
Abstract: Technological development has enabled the profiling of gene expression and chromatin accessibility from the same cell. We develop scREG, a dimension reduction methodology, based on the concept of cis-regulatory potential, for single cell multiome data. This concept is further used for the construction of subpopulation-specific cis-regulatory networks. The capability of inferring useful regulatory networks is demonstrated by a two-fold increment in inference accuracy compared to a Pearson correlation-based method and a 27-fold enrichment of GWAS...
A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg), a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility data from a time course. TimeReg can be used to prioritize regulatory elements, to extract core regulatory modules at each time point, to identify key regulators driving changes of the cellular state, and to causally connect the modules across different time points. We applied the method to analyze paired chromatin accessibility and gene expression data from a retinoic acid...
Abstract: The latest advancements in high-throughput single-cell genome (scDNA) and transcriptome (scRNA) sequencing technologies have enabled cell-resolved investigation of tissue clones. However, it remains challenging to cluster and couple single cells for heterogeneous scRNA and scDNA data generated from the same specimen. In this study, we present a computational framework called CCNMF, which employs a novel Coupled-Clone Non-negative Matrix Factorization technique to jointly infer clonal structure for matched...
Genome-wide association studies (GWAS) have cataloged many significant associations between genetic variants and complex traits. However, most of these findings have unclear biological significance, because they often have small effects and occur in non-coding regions. Integration of GWAS with gene regulatory networks addresses both issues by aggregating weak genetic signals within regulatory programs. Here we develop a Bayesian framework that integrates GWAS summary statistics with regulatory networks to infer genetic enrichments and associations simultaneously. Our method...
Abstract: The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential...
Significance: Here we use the expression and accessibility data from a diverse set of cell types to learn a model for the dependence of the accessibility of a regulatory element on its DNA sequence and TF expression. Using GTEx samples with WGS data, we show that noncoding variants predicted to affect accessibility are more strongly associated with the expression of nearby genes. To interpret a personal genome, we combine context-specific information to prioritize regulatory elements in any genomic region of interest. This approach should be helpful in the study of risk loci previously identified by...
Abstract: Cranial Neural Crest Cells (CNCC) originate at the cephalic region from the forebrain, midbrain, and hindbrain, migrate into the developing craniofacial region, and subsequently differentiate into multiple cell types. The entire specification, delamination, migration, and differentiation process is highly regulated, and abnormalities during this craniofacial development cause birth defects. To better understand the molecular networks underlying CNCC, we integrate paired gene expression and chromatin accessibility data...
Despite recent developments, it is hard to profile all multi-omics single-cell data modalities on the same cell. Thus, huge amounts of single-cell genomics data with unpaired observations on different cells are generated. We propose a method named UnpairReg for the regression analysis on unpaired observations to integrate single-cell multi-omics data. On real and simulated data, UnpairReg provides an accurate estimation of cell gene expression where only chromatin accessibility data is available. The cis-regulatory network inferred from UnpairReg is highly consistent with eQTL mapping. UnpairReg improves cell type...
Abstract: Accurate inference of context-specific Gene Regulatory Networks (GRNs) from genomics data is a crucial task in computational biology. However, existing methods face limitations, such as reliance on gene expression data alone, the lower resolution of bulk data, and data scarcity for specific cellular systems. Despite recent technological advancements, including single-cell sequencing and the integration of ATAC-seq and RNA-seq data, learning such complex mechanisms from limited independent data points still presents a daunting...
Abstract: Chromatin regulators (CRs) are crucial for connecting the chromatin level and the transcriptome level by modulating chromatin structures and by establishing and maintaining epigenetic modifications. We present a systematic method to identify MOdulation of transcriptional regulation via CHromatin Activity (MOCHA) from gene expression data and demonstrate its advantage in associating CRs with their chromatin localization to understand CRs’ function. We first re-construct the CRs modulation network by integrating the correlation and conditional correlation concepts....
Abstract: When different types of functional genomics data are generated on single cells from different samples of the same heterogeneous population, the clustering of cells in the different samples should be coupled. We formulate this “coupled clustering” problem as an optimization problem, and propose the method of coupled nonnegative matrix factorizations (coupled NMF) for its solution. The method is illustrated by the integrative analysis of single cell RNA-seq and single cell ATAC-seq data. Significance Statement: Biological samples are often heterogeneous mixtures of different types of cells. Suppose we have two single cell data sets, each...
Abstract: Alcohol use disorder (AUD) induces complex transcriptional and regulatory changes across multiple brain regions, including the caudate nucleus, which remains understudied. Using paired single-nucleus RNA-seq and ATAC-seq on caudate samples from 143 human postmortem brains, 74 with AUD, we identified 17 distinct cell types. We found that a significant portion of the alcohol-induced changes in gene expression occurred through altered chromatin accessibility. Notably, novel chromatin accessibility differences in medium...
Abstract: Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular-level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin...
Abstract: Transcription factors (TFs) and transcriptional coregulators represent an emerging class of therapeutic targets in oncology. Gene regulatory networks (GRNs) can be used to evaluate pharmacological agents targeting these factors and to identify drivers of disease and drug resistance. However, GRN methods that rely solely on gene expression often fail to account for post-transcriptional modulation of TF function. We present Epiregulon, a method that constructs GRNs from single-cell ATAC-seq and RNA-seq data for accurate...
The latest advancements in high-throughput single-cell genome (scDNA) and transcriptome (scRNA) sequencing technologies have enabled cell-resolved investigation of tissue clones. However, it remains challenging to cluster and couple single cells for heterogeneous scRNA and scDNA data generated from the same specimen. In this study, we present a computational framework called CC-NMF, which employs a novel Coupled-Clone Non-negative Matrix Factorization technique to jointly infer clonal structure for matched scDNA and scRNA data....