NFDI4DS | UHH-SEMS - Publication Details

Comparative analysis of metazoan chromatin organization

OPENALEX - Publications

Joshua W. K. Ho, Youngsook L. Jung, Tao Liu, B. Alver, Soohyun Lee, and 73 more

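A large collection of new modENCODE and ENCODE genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human are analysed; this reveals many conserved features of chromatin organization among the three organisms, as well as notable differences in the composition and locations of repressive chromatin. This study describes numerous genome-wide chromatin data sets from Homo sapiens, Drosophila melanogaster and Caenorhabditis elegans generated by the modENCODE and ENCODE consortia. The results point to many conserved features of chromatin organization among the three organisms, while identifying notable differences in the composition and locations of repressive chromatin. Genome function is dynamically regulated...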

10.1038/nature13415 article EN cc-by-nc-sa Nature 2014-08-26

PREDICTD PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition

Timothy Durham, Maxwell W. Libbrecht, James Jeffry Howbert, Jeff Bilmes, William Stafford Noble

The Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Project seek to characterize the epigenome in diverse cell types using assays that identify, for example, genomic regions with modified histones or accessible chromatin. These efforts have produced thousands of datasets but cannot possibly measure each epigenomic factor in all cell types. To address this, we present a method, PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition (PREDICTD), to computationally impute missing...

10.1038/s41467-018-03635-9 article EN cc-by Nature Communications 2018-04-11

Joint annotation of chromatin state and chromatin conformation reveals relationships among domain types and identifies domains of cell-type-specific expression

Maxwell W. Libbrecht, Ferhat Ay, Michael M. Hoffman, David M. Gilbert, Jeffrey A. Bilmes, and 1 more

The genomic neighborhood of a gene influences its activity, a behavior that is attributable in part to domain-scale regulation. Previous studies have identified many types of regulatory domains. However, due to the difficulty of integrating genomics data sets, the relationships among these domain types are poorly understood. Semi-automated genome annotation (SAGA) algorithms facilitate human interpretation of heterogeneous collections of genomics data by simultaneously partitioning the human genome and assigning labels to the resulting genomic segments. However, existing...

10.1101/gr.184341.114 article EN cc-by-nc Genome Research 2015-02-12

Nucleotide sequence and DNaseI sensitivity are predictive of 3D chromatin architecture

Jacob Schreiber, Maxwell W. Libbrecht, Jeffrey A. Bilmes, William Stafford Noble

Recently, Hi-C has been used to probe the 3D chromatin architecture of multiple organisms and cell types. The resulting collections of pairwise contacts across the genome have connected chromatin architecture to many cellular phenomena, including replication timing and gene regulation. However, high resolution (10 kb or finer) contact maps remain scarce due to the expense and time required for collection. A computational method for predicting pairwise contacts without the need to run a Hi-C experiment would be invaluable in understanding the role that 3D chromatin architecture plays in genome biology....

10.1101/103614 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2017-01-27

Robust chromatin state annotation

Mehdi Foroozandeh Shahraki, Marjan Farahbod, Maxwell W. Libbrecht

With the goal of mapping genomic activity, international projects have recently measured epigenetic activity in hundreds of cell and tissue types. Chromatin state annotations produced by segmentation and genome annotation (SAGA) methods have emerged as the predominant way to summarize these epigenomic data sets in order to annotate the genome. These chromatin state annotations are essential for many tasks, including identifying active regulatory elements and interpreting disease-associated genetic variation. However, despite the widespread...

10.1101/gr.278343.123 article EN cc-by-nc Genome Research 2024-03-21

Distinct epigenetic features of differentiation-regulated replication origins

Owen K. Smith, RyanGuk Kim, Haiqing Fu, Melvenia M. Martin, Chii M. Lin, and 10 more

Eukaryotic genome duplication starts at discrete sequences (replication origins) that coordinate cell cycle progression, ensure genomic stability and modulate gene expression. Origins share some sequence features, but their activity also responds to changes in transcription and cellular differentiation status. To identify chromatin states and histone modifications that locally mark replication origins, we profiled origin distributions in eight human cell lines representing embryonic and differentiated cell types....

10.1186/s13072-016-0067-3 article EN cc-by Epigenetics & Chromatin 2016-05-10

A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types

Maxwell W. Libbrecht, Oscar L. Rodriguez, Zhiping Weng, Jeffrey A. Bilmes, Michael M. Hoffman, and 1 more

Semi-automated genome annotation methods such as Segway take as input a set of genome-wide measurements such as of histone modification or DNA accessibility and output an annotation of genomic activity in the target cell type. Here we present annotations of 164 human cell types using 1615 data sets. To produce these annotations, we automated the label interpretation step to produce a fully automated annotation strategy. Using these annotations, we developed a measure of the importance of each genomic position called the "conservation-associated activity score." We further combined all annotations into a single, cell type-agnostic...

10.1186/s13059-019-1784-2 article EN cc-by Genome biology 2019-08-28

CANDI: self-supervised, confidence-aware denoising imputation of genomic data

Mehdi Foroozandeh Shahraki, Abdul Rahman Diab, Maxwell W. Libbrecht

Large-scale epigenomic datasets such as histone modifications and DNA accessibility have greatly advanced our understanding of genomic function. However, these measurements often suffer from noise, batch effects and irreproducibility. Epigenome imputation has emerged as a promising solution to these challenges. These methods integrate patterns across experiments, cell types, and genomic loci to predict the results of experiments, yielding predictions that surpass observed data in quality. Thus, researchers increasingly leverage for...

10.1101/2025.01.23.634626 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-01-25

Pan-cell type continuous chromatin state annotation of all IHEC epigenomes

Habib Daneshpajouh, Ismail Moghul, Kay C. Wiese, Maxwell W. Libbrecht

Understanding the mechanistic basis of genetic disease requires annotating regulatory elements in the human genome. To this end, the International Human Epigenome Consortium (IHEC) has generated thousands of epigenomic datasets--including ChIP-seq, DNase-seq, and ATAC-seq--that measure various biochemical activities in the genome, including transcription factor binding, histone modification, and DNA accessibility. Currently, the predominant methods for integrating these data sets to annotate regulatory elements are segmentation and genome...

10.1101/2025.02.06.636950 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2025-02-08

Segway 2.0: Gaussian mixture models and minibatch training

Rachel C.W. Chan, Maxwell W. Libbrecht, Eric G. Roberts, Jeffrey A. Bilmes, William Stafford Noble, and 1 more

Segway performs semi-automated genome annotation, discovering joint patterns across multiple genomic signal datasets. We discuss a major new version of Segway and highlight its ability to model data with substantially greater accuracy. Major enhancements in Segway 2.0 include the ability to model data with a mixture of Gaussians, enabling capture of arbitrarily complex signal distributions, and minibatch training, leading to better learned parameters. Segway and its source code are freely available for download at http://segway.hoffmanlab.org. We have made available scripts...

10.1093/bioinformatics/btx603 article EN cc-by Bioinformatics 2017-09-20

SplitStrains, a tool to identify and separate mixed Mycobacterium tuberculosis infections from WGS data

Einar Gabbassov, Miguel Moreno-Molina, Iñaki Comas, Maxwell W. Libbrecht, Leonid Chindelevitch

The occurrence of multiple strains of a bacterial pathogen such as M. tuberculosis or C. difficile within a single human host, referred to as a mixed infection, has important implications for both healthcare and public health. However, methods for detecting it, and especially determining the proportion and identities of the underlying strains, from WGS (whole-genome sequencing) data, have been limited. In this paper we introduce SplitStrains, a novel method for addressing these challenges. Grounded in a rigorous statistical...

10.1099/mgen.0.000607 article EN cc-by Microbial Genomics 2021-03-24

Choosing panels of genomics assays using submodular optimization

Kai Wei, Maxwell W. Libbrecht, Jeffrey A. Bilmes, William Stafford Noble

Due to the high cost of sequencing-based genomics assays such as ChIP-seq and DNase-seq, the epigenomic characterization of a cell type is typically carried out using a small panel of assay types. Deciding a priori which assays to perform is, thus, a critical step in many epigenomic studies. We present submodular selection of assays (SSA), a method for choosing a diverse panel of genomic assays that leverages methods from submodular optimization. More generally, this application serves as a model for how submodular optimization can be applied to other discrete problems in biology.

10.1186/s13059-016-1089-7 article EN cc-by Genome biology 2016-11-15

Choosing non‐redundant representative subsets of protein sequence data sets using submodular optimization

Maxwell W. Libbrecht, Jeffrey A. Bilmes, William Stafford Noble

Selecting a non‐redundant representative subset of sequences is a common step in many bioinformatics workflows, such as the creation of training sets for sequence and structural models or the selection of “operational taxonomic units” from metagenomics data. Previous methods for this task, such as CD‐HIT, PISCES, and UCLUST, apply a heuristic threshold‐based algorithm that has no theoretical guarantees. We propose a new approach based on submodular optimization. Submodular optimization, a discrete analogue to...

10.1002/prot.25461 article EN Proteins Structure Function and Bioinformatics 2018-01-18

Learning representations of chromatin contacts using a recurrent neural network identifies genomic drivers of conformation

Kevin B. Dsouza, Alexandra Maslova, Ediem Al-Jibury, Matthias Merkenschlager, Vijay K. Bhargava, and 1 more

Despite the availability of chromatin conformation capture experiments, discerning the relationship between the 1D genome and 3D conformation remains a challenge, which limits our understanding of their effect on gene expression and disease. We propose Hi-C-LSTM, a method that produces low-dimensional latent representations that summarize intra-chromosomal Hi-C contacts via a recurrent long short-term memory neural network model. We find that these representations contain all the information needed to recreate the observed Hi-C matrix with high accuracy,...

10.1038/s41467-022-31337-w article EN cc-by Nature Communications 2022-06-28

Interferometric measurement of the resonant absorption and refractive index in rubidium gas

Kenneth G. Libbrecht, Maxwell W. Libbrecht

We present a laboratory demonstration of the Kramers-Kronig relation between resonant absorption and refractive index in rubidium gas. Our experiment uses a rubidium vapor cell in one arm of a simple Mach-Zehnder interferometer. As the laser frequency is scanned over an atomic resonance, the interferometer output is affected by variations in both the absorption and refractive index of the gas with frequency, all of which can be calculated in a straightforward manner. Changing the rubidium density and interferometer phase produces a family of different signals. The experiment was performed using commercially available...

10.1119/1.2335476 article EN American Journal of Physics 2006-11-22

Continuous chromatin state feature annotation of the human epigenome

Habib Daneshpajouh, Bowen Chen, Neda Shokraneh, Shohre Masoumi, Kay C. Wiese, and 1 more

Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These methods take as input a set of sequencing-based assays of epigenomic activity, such as ChIP-seq measurements of histone modification and transcription factor binding. They output an annotation of the genome that assigns a chromatin state label to each genomic position. Existing SAGA methods have several limitations caused by the discrete annotation framework: such annotations cannot easily represent varying strengths of genomic elements, and they...

10.1093/bioinformatics/btac283 article EN cc-by Bioinformatics 2022-04-19

DEEMD: Drug Efficacy Estimation Against SARS-CoV-2 Based on Cell Morphology With Deep Multiple Instance Learning

Mohammadsadegh Saberian, Kathleen P. Moriarty, Andrea Olmstead, Christian Hallgrimson, François Jean, and 3 more

Drug repurposing can accelerate the identification of effective compounds for clinical use against SARS-CoV-2, with the advantage of pre-existing clinical safety data and an established supply chain. RNA viruses such as SARS-CoV-2 manipulate cellular pathways and induce reorganization of subcellular structures to support their life cycle. These morphological changes can be quantified using bioimaging techniques. In this work, we developed DEEMD: a computational pipeline using deep neural network models within a multiple...

10.1109/tmi.2022.3178523 article EN IEEE Transactions on Medical Imaging 2022-05-27

INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis

Hooman Zabeti, Nick Dexter, Amir Safari, Nafiseh Sedaghat, Maxwell W. Libbrecht, and 1 more

Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine learning techniques typically have...

10.1186/s13015-021-00198-1 article EN cc-by Algorithms for Molecular Biology 2021-08-10

A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types

Maxwell W. Libbrecht, Oscar L. Rodriguez, Zhiping Weng, Jeffrey A. Bilmes, Michael M. Hoffman, and 1 more

Semi-automated genome annotation methods such as Segway enable understanding of chromatin activity. Here we present chromatin state annotations of 164 human cell types using 1,615 genomics data sets. To produce these annotations, we developed a fully-automated annotation strategy in which we train separate unsupervised annotation models on each cell type and use a machine learning classifier to automate the state interpretation step. Using these annotations, we developed a measure of the importance of each genomic position called the “conservation-associated activity score,” which we use to aggregate...

10.1101/086025 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2016-11-07

Continuous chromatin state feature annotation of the human epigenome

Bowen Chen, Neda Shokraneh Kenari, Maxwell W. Libbrecht

Semi-automated genome annotation (SAGA) methods are widely used to understand genome activity and gene regulation. These methods take as input a set of sequencing-based assays of epigenomic activity (such as ChIP-seq measurements of histone modification and transcription factor binding), and output an annotation of the genome that assigns a chromatin state label to each genomic position. Existing SAGA methods have several limitations caused by the discrete annotation framework: such annotations cannot easily represent varying strengths of genomic elements, and they cannot easily represent combinatorial...

10.1101/473017 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2018-11-18

DEEMD: Drug Efficacy Estimation against SARS-CoV-2 based on cell Morphology with Deep multiple instance learning

Mohammadsadegh Saberian, Kathleen Moriarty, Andrea Olmstead, Christian Hallgrimson, François Jean, and 3 more

Drug repurposing can accelerate the identification of effective compounds for clinical use against SARS-CoV-2, with the advantage of pre-existing clinical safety data and an established supply chain. RNA viruses such as SARS-CoV-2 manipulate cellular pathways and induce reorganization of subcellular structures to support their life cycle. These morphological changes can be quantified using bioimaging techniques. In this work, we developed DEEMD: a computational pipeline using deep neural network models within a multiple...

10.36227/techrxiv.19326665.v1 article EN cc-by 2022-03-15

INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis

Hooman Zabeti, Nick Dexter, Amir Safari, Nafiseh Sedaghat, Maxwell W. Libbrecht, and 1 more

Motivation: Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a transparent, accurate, and flexible predictive model. The methods currently used for this purpose rarely satisfy all of these criteria. On the one hand, approaches based on testing strains against a catalogue of previously identified mutations often yield poor predictive performance; on the other hand, machine...

10.1101/2020.05.31.115741 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-05-31