- RNA and protein synthesis mechanisms
- RNA modifications and cancer
- RNA Research and Splicing
- Cancer-related molecular mechanisms research
- Machine Learning in Bioinformatics
- Genomics and Chromatin Dynamics
- Lipid Membrane Structure and Behavior
- Force Microscopy Techniques and Applications
- Photoreceptor and optogenetics research
- Genomics and Phylogenetic Studies
- Bioinformatics and Genomic Networks
- Mechanical and Optical Resonators
- Gene expression and cancer classification
- Air Quality and Health Impacts
- Digestive system and related health
- Genetic Associations and Epidemiology
- Atmospheric chemistry and aerosols
- Legionella and Acanthamoeba research
- Biomedical Text Mining and Ontologies
- Advanced Graph Neural Networks
- Protein Structure and Dynamics
- MicroRNA in disease regulation
- RNA regulation and disease
- Climate Change and Health Impacts
- Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
Helmholtz Zentrum München
2019-2024
Helmholtz Association of German Research Centres
2022-2023
Max Planck Institute for Molecular Genetics
2013-2023
Freie Universität Berlin
2016-2023
Max Planck Society
2018
TU Dresden
2006-2010
Background: Secondary organic aerosols (SOAs) formed from anthropogenic or biogenic gaseous precursors in the atmosphere substantially contribute to the ambient fine particulate matter [PM ≤2.5 μm in aerodynamic diameter (PM2.5)] burden, which has been associated with adverse human health effects. However, there is only limited evidence on their differential toxicological impact. Objectives: We aimed to discriminate the effects of SOAs generated by atmospheric aging on combustion soot particles (SPs) (β-pinene)...
The iCLIP and eCLIP techniques facilitate the detection of protein–RNA interaction sites at high resolution, based on diagnostic events at crosslink sites. However, previous methods do not explicitly model the specifics of iCLIP and eCLIP truncation patterns and possible biases. We developed PureCLIP (https://github.com/skrakau/PureCLIP), a hidden Markov model-based approach, which simultaneously performs peak-calling and individual crosslink site detection. It incorporates a non-specific background signal and, for the first time, non-specific sequence biases. On both...
Convolutional neural networks (CNNs) have been shown to perform exceptionally well in a variety of tasks, including biological sequence classification. Available implementations, however, are usually optimized for a particular task and difficult to reuse. To enable researchers to utilize these networks more easily, we implemented pysster, a Python package for training CNNs on biological sequence data. Sequences are classified by learning sequence and structure motifs, and the package offers an automated hyper-parameter optimization procedure and options to visualize...
Regulation of viral RNA biogenesis is fundamental to productive SARS-CoV-2 infection. To characterize host RNA-binding proteins (RBPs) involved in this process, we biochemically identified proteins bound to genomic and subgenomic RNAs. We find that the protein SND1 binds the 5' end of negative-sense RNA and is required for RNA synthesis. SND1-depleted cells form smaller replication organelles and display diminished virus growth kinetics. We discover that NSP9, an RBP and direct SND1 interaction partner, is covalently linked to the 5' ends of positive- and negative-sense RNAs...
In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species, and we have only limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins....
To initiate X-Chromosome inactivation (XCI), the long noncoding RNA Xist mediates chromosome-wide gene silencing of one X Chromosome in female mammals to equalize gene dosage between the sexes. The efficiency of gene silencing is highly variable across genes, with some genes even escaping XCI in somatic cells. A gene's susceptibility to Xist-mediated silencing appears to be determined by a complex interplay of epigenetic and genomic features; however, the underlying rules remain poorly understood. We have quantified silencing kinetics at the level of the nascent...
Knowledge of the health effects of exposure to secondary organic aerosols (SOAs) is still limited. Here, we investigated and compared the toxicities of soot particles (SP) coated with β-pinene SOA (SOAβPin-SP) and SP coated with naphthalene SOA (SOANap-SP) in a human bronchial epithelial cell line (BEAS-2B) residing at the air–liquid interface. SOAβPin-SP mostly contained oxygenated aliphatic compounds from β-pinene photooxidation, whereas SOANap-SP contained a significant fraction of oxygenated aromatic products under similar conditions. Following exposure,...
Abstract We present RBPNet, a novel deep learning method, which predicts the CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive...
Abstract Long ncRNAs are often enriched in the nucleus and at chromatin, but whether their dissociation from chromatin is important for their role in transcription regulation is unclear. Here, we group long ncRNAs using epigenetic marks, expression and strength of chromosomal interactions; we find that long ncRNAs transcribed from loci engaged in strong long-range chromosomal interactions are less abundant at chromatin, suggesting the release from chromatin as a crucial functional aspect of long ncRNAs in transcription regulation of their target genes. To gain mechanistic insight into this, we functionally validate the long ncRNA A-ROD, which...
RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between...
Abstract RNA-binding proteins (RBPs) are central actors of RNA post-transcriptional regulation. Experiments to profile binding sites of RBPs in vivo are limited to transcripts expressed in the experimental cell type, creating the need for computational methods to infer missing binding information. While numerous machine-learning based methods have been developed for this task, their use of heterogeneous training and evaluation datasets across different sets of RBPs and CLIP-seq protocols makes a direct comparison of their performance difficult....
Membrane proteins are important for many processes in the cell and are used as main drug targets. The increasing number of high-resolution structures available makes, for the first time, a characterization of local structural and functional motifs in α-helical transmembrane proteins possible. MeMotif (http://projects.biotec.tu-dresden.de/memotif) is a database and wiki which collects more than 2000 known and novel computationally predicted linear motifs in α-helical transmembrane proteins. Motifs are fully described in terms of several features and are editable. The motifs contained can be...
Long noncoding RNAs (lncRNAs) are transcripts generally longer than 200 nucleotides with no or poor protein coding potential, and most of their functions are also poorly characterized. Recently, an increasing number of studies have shown that lncRNAs can be involved in various critical biological processes such as organism development or cancer progression. Little, however, is known about their effects in helminth parasites, such as Schistosoma mansoni. Here, we present a computational pipeline to identify...
Abstract Motivation: Misfolding of membrane proteins plays an important role in many human diseases such as retinitis pigmentosa, hereditary deafness and diabetes insipidus. Little is known about membrane proteins, as there are only very few high-resolution structures. Single-molecule force spectroscopy is a novel technique, which measures the force necessary to pull a protein out of a membrane. Such force curves contain valuable information on the protein structure, conformation, and inter- and intra-molecular forces. High-throughput experiments...
A large proportion of an organism's genome encodes for membrane proteins. Membrane proteins are important for many cellular processes, and several diseases can be linked to mutations in them. With the tremendous growth of sequence data, there is an increasing need to reliably identify membrane proteins from sequence, to functionally annotate them, and to correctly predict their topology. We introduce a technique called structural fragment clustering, which learns sequential motifs from 3D structural fragments. From over 500,000 fragments, we obtain...
Although several studies have provided insights into the role of long non-coding RNAs (lncRNAs), the majority of them have unknown function. Recent evidence has shown the importance of both lncRNAs and chromatin interactions in transcriptional regulation. Although network-based methods, mainly exploiting gene-lncRNA co-expression, have been applied to characterize lncRNA function by means of 'guilt-by-association', no strategy exists so far which identifies mRNA-lncRNA functional modules based on the 3D chromatin interaction graph. To...
Abstract iCLIP and eCLIP techniques facilitate the detection of protein-RNA interaction sites at high resolution, based on diagnostic events at crosslink sites. However, previous methods do not explicitly model the specifics of iCLIP and eCLIP truncation patterns and possible biases. We developed PureCLIP, a hidden Markov model-based approach, which simultaneously performs peak calling and individual crosslink site detection. It incorporates RNA abundances and, for the first time, non-specific sequence biases. On both simulated and real data, PureCLIP is more...
Abstract Summary: Convolutional neural networks (CNNs) have been shown to perform exceptionally well in a variety of tasks, including biological sequence classification. Available implementations, however, are usually optimized for a particular task and difficult to reuse. To enable researchers to utilize these networks more easily, we implemented pysster, a Python package for training CNNs on biological sequence data. Sequences are classified by learning sequence and structure motifs, and the package offers an automated hyper-parameter optimization procedure...
Enhancers are important regulatory regions located throughout the genome, primarily in non-coding regions. Several experimental methods have been developed over the last several years to identify their location, but the search space is large and the overlap between putative enhancer regions identified using these methods tends to be very small. Computational methods for enhancer prediction often use one set of experimentally identified enhancers as input, and therefore rely critically on their correctness. We chose to take a different approach, and start with high confidence...