Clustering gene expression time series data using an infinite Gaussian process mixture model

0301 basic medicine; Lung Neoplasms; Time Factors; QH301-705.5; Normal Distribution; Models, Biological; Dexamethasone; Histones; 03 medical and health sciences; Cell Line, Tumor; Cluster Analysis; Humans; Computer Simulation; Biology (General); Glucocorticoids; Oligonucleotide Array Sequence Analysis; Sequence Analysis, RNA; Gene Expression Profiling; Computational Biology; Hydrogen Bonding; Hydrogen Peroxide; Gene Expression Regulation, Neoplastic; A549 Cells; Algorithms; Research Article
DOI: 10.1371/journal.pcbi.1005896 Publication Date: 2018-01-16T18:24:07Z
ABSTRACT
Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step in analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, the Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.
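To make the model structure concrete, the sketch below simulates data from a Dirichlet process mixture of Gaussian processes: cluster labels come from a Chinese restaurant process, each cluster gets a mean trajectory drawn from a Gaussian process, and each gene is a noisy copy of its cluster's trajectory. It is a minimal illustration in pure Python and is not the DP_GP_cluster software or its API. All function names, the squared-exponential kernel, and the hyperparameter values (`alpha`, `length_scale`, `signal_var`, `noise_sd`) are assumptions chosen for the example, and the sketch covers only the generative side, not the inference used to recover clusters from data.

```python
import math
import random


def crp_assignments(n_genes, alpha, rng):
    """Sample cluster labels from a Chinese restaurant process (a DP prior)."""
    labels, counts = [], []
    for _ in range(n_genes):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(0)  # open a new cluster
        counts[k] += 1
        labels.append(k)
    return labels


def se_kernel(times, length_scale, signal_var, jitter=1e-6):
    """Squared-exponential covariance matrix (assumed kernel) with diagonal jitter."""
    n = len(times)
    return [
        [
            signal_var * math.exp(-0.5 * ((times[i] - times[j]) / length_scale) ** 2)
            + (jitter if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_gp(times, length_scale, signal_var, rng):
    """Draw one zero-mean GP trajectory at the given time points."""
    low = cholesky(se_kernel(times, length_scale, signal_var))
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]


def simulate(n_genes, times, alpha=1.0, length_scale=1.5,
             signal_var=1.0, noise_sd=0.1, seed=0):
    """Simulate expression time series from a DP mixture of GPs."""
    rng = random.Random(seed)
    labels = crp_assignments(n_genes, alpha, rng)
    n_clusters = max(labels) + 1
    means = [sample_gp(times, length_scale, signal_var, rng) for _ in range(n_clusters)]
    # Each gene is its cluster's mean trajectory plus independent Gaussian noise.
    data = [[m + rng.gauss(0.0, noise_sd) for m in means[k]] for k in labels]
    return labels, means, data


if __name__ == "__main__":
    labels, means, data = simulate(n_genes=20, times=[0, 1, 2, 4, 6, 8])
    print("clusters drawn:", max(labels) + 1)
    print("first gene:", [round(x, 2) for x in data[0]])
```

Because the number of clusters is drawn by the Chinese restaurant process rather than fixed in advance, larger values of `alpha` tend to produce more clusters for the same number of genes. This is the sense in which the cluster number is part of the model.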