- RNA and protein synthesis mechanisms
- RNA modifications and cancer
- Machine Learning in Bioinformatics
- Genomics and Phylogenetic Studies
- Genomics and Chromatin Dynamics
- RNA Research and Splicing
- Molecular Biology Techniques and Applications
- Protein Structure and Dynamics
- Computational Drug Discovery Methods
- Gene Regulatory Network Analysis
- Antibiotic Resistance in Bacteria
- Microbial Metabolic Engineering and Bioproduction
- Tuberculosis Research and Epidemiology
- Pharmaceutical and Antibiotic Environmental Impacts
- Cancer-related molecular mechanisms research
- Cell Image Analysis Techniques
- Colorectal Cancer Treatments and Studies
- Cryptographic Implementations and Security
- Chaos-based Image/Signal Encryption
- Neural Networks and Applications
- Genetic factors in colorectal cancer
- Cancer Genomics and Diagnostics
- Viral Infectious Diseases and Gene Expression in Insects
- Cancer-related gene regulation
- Bacteriophages and microbial interactions
RIKEN Center for Integrative Medical Sciences (2023-2024)
King Abdullah University of Science and Technology (2015-2021)
Bioscience Research (2021)
Hiroshima University (2020-2021)
University of Utah (2017)
Tsinghua University (2017)
Annotation of enzyme function has a broad range of applications, such as metagenomics, industrial biotechnology, and diagnosis of enzyme deficiency-caused diseases. However, the time and resources required make it prohibitively expensive to experimentally determine the function of every enzyme. Therefore, computational enzyme function prediction has become increasingly important. In this paper, we develop such an approach, determining enzyme function by predicting the Enzyme Commission number. We propose an end-to-end feature selection and classification model training approach, as well...
Accurate computational identification of promoters remains a challenge, as these key DNA regulatory regions have variable structures composed of functional motifs that provide gene-specific initiation of transcription. In this paper we utilize Convolutional Neural Networks (CNN) to analyze sequence characteristics of prokaryotic and eukaryotic promoters and build their predictive models. We trained a similar CNN architecture on promoters of five distant organisms: human, mouse, plant (Arabidopsis), and two bacteria (Escherichia coli...
Our current knowledge of eukaryotic promoters indicates a complex architecture that is often composed of numerous functional motifs. Most known promoters include multiple and in some cases mutually exclusive transcription start sites (TSSs). Moreover, TSS selection depends on cell/tissue, development stage and environmental conditions. Such complex promoter structures make their computational identification notoriously difficult. Here, we present TSSPlant, a novel tool that predicts both TATA and TATA-less promoters in sequences of a wide...
Motivation: Computational identification of promoters is notoriously difficult, as human genes often have unique promoter sequences that provide regulation of transcription and interaction with the transcription initiation complex. While there are many attempts to develop computational promoter identification methods, we have no reliable tool to analyze long genomic sequences. Results: In this work, we further develop our deep learning approach that was relatively successful in discriminating short promoter and non-promoter sequences. Instead of focusing on the classification...
Protein-RNA interaction plays important roles in post-transcriptional regulation. However, the task of predicting these interactions given a protein structure is difficult. Here we show that, by leveraging a deep learning model, NucleicNet, attributes such as the binding preference of RNA backbone constituents and different bases can be predicted from local physicochemical characteristics of the protein surface. On a diverse set of challenging RNA-binding proteins, including Fem-3-binding-factor 2, Argonaute 2...
Background: At least 50% of patients with suspected Mendelian disorders remain undiagnosed after whole-exome sequencing (WES), and the extent to which non-coding variants that are not captured by WES contribute to this fraction is unclear. Whole transcriptome sequencing is a promising supplement to WES, although empirical data on the contribution of RNA analysis to the diagnosis of Mendelian diseases on a large scale are scarce. Results: Here, we describe our experience with transcript-deleterious variants (TDVs) based on a cohort of 5647 families with suspected Mendelian diseases. We...
Background: The spread of antibiotic resistance has become one of the most urgent threats to global health and is estimated to cause 700,000 deaths each year globally. Its surrogates, antibiotic resistance genes (ARGs), are highly transmittable between food, water, animals, and humans, mitigating the efficacy of antibiotics. Accurately identifying ARGs is thus an indispensable step in understanding the ecology and transmission of ARGs between environmental and human-associated reservoirs. Unfortunately, previous computational methods for identifying ARGs are mostly based...
Motivation: An accurate characterization of the transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, characterizing the TF-DNA affinity landscape still remains a challenging problem. Results: Here we propose a novel sequence embedding approach for modeling the affinity landscape. Our method represents DNA sequences as a hidden Markov model...
In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and to use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better...
Drug treatment induces cell type-specific transcriptional programs, and as the number of combinations of drugs and cell types grows, the cost of exhaustive screens measuring the transcriptional drug response becomes intractable. We developed DeepCellState, a deep learning autoencoder-based framework, for predicting the induced transcriptional state in a cell type after drug treatment, based on the drug response in another cell type. Training the method on a large collection of transcriptional drug perturbation profiles, prediction accuracy improves significantly over baseline and alternative approaches when applying the method to two...
Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than that in...
The Synthetic Biology Open Language (SBOL) is a community-driven open language to promote standardization in synthetic biology. To support the use of SBOL in metabolic engineering, we developed SBOLme, the first open-access repository of SBOL 2-compliant biochemical parts for a wide range of metabolic engineering applications. The URL of our repository is http://www.cbrc.kaust.edu.sa/sbolme.
Computational identification of promoters is notoriously difficult, as human genes often have unique promoter sequences that provide regulation of transcription and interaction with the transcription initiation complex. While there are many attempts to develop computational promoter identification methods, we have no reliable tool to analyze long genomic sequences. In this work, we further develop our deep learning approach that was relatively successful in discriminating short promoter and non-promoter sequences. Instead of focusing on the classification accuracy, in this work we predict the exact positions...
Accurate computational identification of promoters remains a challenge, as these key DNA regulatory regions have variable structures composed of functional motifs that provide gene-specific initiation of transcription. In this paper we utilize Convolutional Neural Networks (CNN) to analyze sequence characteristics of prokaryotic and eukaryotic promoters and build their predictive models. We trained the same CNN architecture on promoters of four very distant organisms: human, plant (Arabidopsis), and two bacteria (Escherichia coli...
Long-read sequencing has emerged as a powerful tool for uncovering novel transcripts and genes. However, existing protocols often lack confidence in identifying the transcription start site (TSS) and fail to capture non-poly(A) RNA, thereby limiting the discovery of novel genes, particularly long non-coding RNAs (lncRNAs). In this study, we introduce Cap-trap full-length cDNA sequencing (CFC-seq), a comprehensive protocol that combines Cap-trapping and poly(A)-tailing with Oxford Nanopore sequencing. This enables...
To engineer cells for industrial-scale application, a deep understanding of how to design molecular control mechanisms that tightly maintain functional stability under various fluctuations is crucial. Absolute concentration robustness (ACR) is a category of robustness in reaction network models in which the steady-state concentration of a molecular species is guaranteed to be invariant even with perturbations to the other species in the network. Here, we introduce a software tool, absolute concentration robustness explorer (ACRE), which efficiently explores combinatorial biochemical networks for the ACR property....
Drug treatment induces cell type-specific transcriptional programs, and as the number of combinations of drugs and cell types grows, the cost of exhaustive screens measuring the transcriptional drug response becomes intractable. We developed DeepCellState, a deep learning autoencoder-based framework, for predicting the induced transcriptional state in a cell type after drug treatment, based on the drug response in another cell type. Training the method on a large collection of transcriptional drug perturbation profiles, prediction accuracy improves significantly over baseline and alternative approaches when...
Regulatory elements control gene expression through transcription initiation (promoters) and by enhancing transcription at distant regions (enhancers). Accurate identification of regulatory elements is fundamental for annotating genomes and understanding gene expression patterns. While there are many attempts to develop computational promoter and enhancer identification methods, reliable tools to analyze long genomic sequences are still lacking. Prediction methods often perform poorly on the genome-wide scale because the number of negatives is much higher than...
The paper describes several original cryptographic cipher modules (VSEM) that are based on a one-time pseudorandom pad and transpositions. VSEM includes 4 encryption modules that can be applied in combinations. We studied the ability of these modules to secure private data against attacks, as well as their speed of encryption. VSEM was implemented in Fendoff applications for mobile devices on the iOS and Android platforms, as well as in a computer application running on Windows or Mac OS. We describe these applications, designed to encrypt/decrypt various personal data such as passwords,...