De novo identification of replication-timing domains in the human genome by deep learning

Replication timing Replication Identification
DOI: 10.1093/bioinformatics/btv643 Publication Date: 2015-11-07T02:04:32Z
ABSTRACT
Motivation: The de novo identification of the initiation and termination zones (regions that replicate earlier or later than their upstream and downstream neighbours, respectively) remains a key challenge in DNA replication.
Results: Building on advances in deep learning, we developed a novel hybrid architecture combining a pre-trained, deep neural network and a hidden Markov model (DNN-HMM) for the de novo identification of replication domains using replication timing profiles. Our results demonstrate that DNN-HMM can significantly outperform strong, discriminatively trained Gaussian mixture model–HMM (GMM-HMM) systems and six other reported methods that can be applied to this challenge. We applied our DNN-HMM to identify distinct replication domain types, namely the early replication domain (ERD), the down transition zone (DTZ), the late replication domain (LRD) and the up transition zone (UTZ), using newly replicated DNA sequencing (Repli-Seq) data across 15 human cells. A subsequent integrative analysis revealed that these replication domains harbour unique genomic and epigenetic patterns, transcriptional activity and higher-order chromosomal structure. Our findings support the ‘replication-domain’ model, which states (1) that ERDs and LRDs, connected by UTZs and DTZs, are spatially compartmentalized structural and functional units of higher-order chromosomal structure, (2) that adjacent DTZ-UTZ pairs form chromatin loops and (3) that intra-interactions within ERDs and LRDs tend to be short-range and long-range, respectively. Our model reveals an important chromatin organizational principle of the human genome and represents a critical step towards understanding the mechanisms regulating replication timing.
Availability and implementation: Our DNN-HMM method and three additional algorithms can be freely accessed at https://github.com/wenjiegroup/DNN-HMM. The replication domain regions identified in this study are available in GEO under accession ID GSE53984.
Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
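To make the hybrid idea concrete, the following is a minimal sketch of how per-window state posteriors (such as a neural network might produce) can be decoded into a domain segmentation with an HMM. It is not the authors' implementation: the cyclic transition structure (ERD to DTZ to LRD to UTZ and back to ERD), the `stay` probability, the uniform priors and the toy posteriors are all illustrative assumptions, and the function names are hypothetical.

```python
import math

STATES = ["ERD", "DTZ", "LRD", "UTZ"]
NEG_INF = float("-inf")


def safe_log(x):
    """Natural log that maps 0 to -inf instead of raising."""
    return math.log(x) if x > 0 else NEG_INF


def make_cyclic_transitions(stay=0.8):
    """Assumed transition model: each state either persists or moves to
    the next state in the cycle ERD -> DTZ -> LRD -> UTZ -> ERD."""
    n = len(STATES)
    trans = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        trans[i][i] = safe_log(stay)
        trans[i][(i + 1) % n] = safe_log(1.0 - stay)
    return trans


def scaled_log_likelihoods(posteriors, priors):
    """Turn posteriors p(state | window) into scaled likelihoods by
    dividing by the state priors (priors are assumed to be positive)."""
    return [[safe_log(p) - safe_log(q) for p, q in zip(row, priors)]
            for row in posteriors]


def viterbi(log_emis, log_trans, log_init):
    """Return the most likely state sequence as a list of state names."""
    n = len(log_init)
    score = [log_init[s] + log_emis[0][s] for s in range(n)]
    backpointers = []
    for t in range(1, len(log_emis)):
        new_score, ptr = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emis[t][s])
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Toy posteriors for six consecutive genomic windows (made-up numbers,
# columns ordered as in STATES); a real system would get these from a DNN.
toy_posteriors = [
    [0.90, 0.04, 0.03, 0.03],
    [0.85, 0.05, 0.05, 0.05],
    [0.10, 0.75, 0.10, 0.05],
    [0.05, 0.10, 0.80, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.10, 0.05, 0.05, 0.80],
]
uniform = [0.25] * len(STATES)

path = viterbi(
    scaled_log_likelihoods(toy_posteriors, uniform),
    make_cyclic_transitions(stay=0.8),
    [safe_log(p) for p in uniform],
)
print(path)
```

Working in log space avoids numerical underflow on long chromosomes, and the restricted transition matrix is what lets the HMM smooth over isolated misclassified windows instead of labelling each window independently.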