- Machine Learning in Bioinformatics
- RNA and protein synthesis mechanisms
- Genomics and Chromatin Dynamics
- Bioinformatics and Genomic Networks
- Computational Drug Discovery Methods
- Genomics and Phylogenetic Studies
- Vaccines and immunoinformatics approaches
- RNA Research and Splicing
- Single-cell and spatial transcriptomics
- Gene expression and cancer classification
- Circular RNAs in diseases
- Cancer-related molecular mechanisms research
- Genetic Mapping and Diversity in Plants and Animals
- Metabolomics and Mass Spectrometry Studies
- Cancer Genomics and Diagnostics
- Graph Theory and Algorithms
- Antimicrobial Peptides and Activities
- Protein Structure and Dynamics
- Machine Learning in Healthcare
- Advanced Proteomics Techniques and Applications
- Advanced Graph Neural Networks
- Extracellular vesicles in disease
- Gut microbiota and health
- Multimodal Machine Learning Applications
- Colorectal Cancer Screening and Detection
Eastern Institute of Technology, Ningbo (2024)
Tongji University (2020-2023)
Shaanxi Normal University (2017-2019)
DNA/RNA motif mining is the foundation of gene function research. It plays an extremely important role in identifying DNA- or RNA-protein binding sites, which helps to understand the mechanisms of gene regulation and management. For the past few decades, researchers have been working on designing new, efficient, and accurate algorithms for motif mining. These algorithms can be roughly divided into two categories: the enumeration approach and the probabilistic method. In recent years, machine learning methods have made great progress,...
Transcription factors (TFs) play an important role in regulating gene expression, so identifying the regions bound by them has become a fundamental step in molecular and cellular biology. In recent years, an increasing number of deep learning (DL) based methods have been proposed for predicting TF binding sites (TFBSs) and have achieved impressive prediction performance. However, these methods mainly focus on the sequence specificity of TF-DNA binding, which is equivalent to a sequence-level binary...
The study of transcriptional regulation remains difficult yet fundamental in molecular biology research. Recent research has shown that the double-helix structure of nucleotides plays an important role in improving the accuracy and interpretability of transcription factor binding site (TFBS) prediction. Although several computational methods have been designed to take both DNA sequence and DNA shape features into consideration simultaneously, how to design an efficient model is still an intractable topic. In this paper, we proposed a...
Transcription factors (TFs) play an important role in regulating gene expression, so identifying the sites bound by them has become a fundamental step in molecular and cellular biology. In this paper, we developed a deep learning framework, named FCNsignal, that leverages existing fully convolutional neural networks (FCNs) to predict TF-DNA binding signals at the base-resolution level. The proposed FCNsignal can simultaneously achieve the following tasks: (i) modeling regions; (ii)...
Cross-species prediction of TF binding remains a major challenge due to the rapid evolutionary turnover of individual binding sites, which makes cross-species predictive performance consistently worse than within-species performance. In this study, a novel Nucleotide-Level Deep Neural Network (NLDNN) is first proposed to predict TF binding within or across species. NLDNN regards the prediction task as a nucleotide-level regression task, which takes DNA sequences as input and directly predicts experimental coverage values. Beyond...
Motivation: Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent research has provided evidence that such cell-type-specific binding is determined by a TF's intrinsic sequence preferences, cooperative interactions with co-factors, chromatin landscapes, and 3D interactions. However, the computational prediction and characterization of cell-type-specific and shared binding sites have rarely been studied. Results: In this article, we...
The advent of single-cell sequencing technologies has revolutionized cell biology studies. However, integrative analyses of diverse single-cell data face serious challenges, including technological noise, sample heterogeneity, and different modalities and species. To address these problems, we propose scCorrector, a variational autoencoder-based model that can integrate single-cell data from different studies and map them into a common space. Specifically, we designed a Study Specific Adaptive Normalization for each study in the decoder to...
Essential proteins are critical to the development and survival of cells. Identifying and analyzing essential proteins is vital to understanding the molecular mechanisms of living cells and to designing new drugs. With high-throughput technologies, many protein–protein interaction (PPI) data are available, which facilitates studies at the network level. Up to now, although various computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a novel method by applying Hyperlink-Induced...
The abuse of traditional antibiotics has led to increased resistance of bacteria and viruses. Efficient therapeutic peptide prediction is critical for drug discovery. However, most of the existing methods only make effective predictions for one class of therapeutic peptides. It is worth noting that currently no predictive method considers sequence length information as a distinct feature. In this article, a novel deep learning approach with matrix factorization for predicting therapeutic peptides by integrating length information (DeepTPpred) is proposed....
DNA-binding proteins (DBPs) play vital roles in the regulation of biological systems. Although there are already many deep learning methods for predicting the sequence specificities of DBPs, they face two challenges. First, classic deep learning methods for DBP prediction usually fail to capture the dependencies between genomic sequences, since their commonly used one-hot codes are mutually orthogonal. Second, these methods perform poorly when samples are inadequate. To address these two challenges, we developed a novel language model for mining DBPs using...
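The orthogonality point in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: the helper `one_hot` and the constant `ALPHABET` are hypothetical names introduced here, and only the four unambiguous DNA bases are assumed. The sketch shows that the dot product between the codes of any two different bases is zero, so the encoding itself carries no notion of similarity or dependency between bases.

```python
import numpy as np

# Assumption: plain DNA alphabet, no ambiguity codes such as N.
ALPHABET = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """Return a (len(seq), 4) one-hot matrix for a DNA string over A, C, G, T."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        encoded[pos, index[base]] = 1.0
    return encoded

# Encode each base once and compute all pairwise dot products.
codes = one_hot(ALPHABET)
gram = codes @ codes.T

# The Gram matrix is the identity: every pair of distinct bases is orthogonal.
print(gram)
```

Because every off-diagonal entry is zero, any relationship between positions or bases has to be learned by the downstream model rather than read from the input encoding, which is the limitation the abstract attributes to one-hot codes.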
Deciphering the relationship between transcription factors (TFs) and DNA sequences is very helpful for the computational inference of gene regulation and a comprehensive understanding of gene regulation mechanisms. Transcription factor binding sites (TFBSs) are specific short DNA sequences that play a pivotal role in controlling gene expression through interaction with TF proteins. Although many deep learning methods have recently been proposed to predict TFBSs, aiming to capture the sequence specificity of TF-DNA binding, there is still a lack of effective...
In recent years, major advances have been made in various chromosome conformation capture technologies to further satisfy the needs of researchers for high-quality, high-resolution contact interactions. Discriminating loops from genome-wide interactions is crucial for dissecting three-dimensional (3D) genome structure and function. Here, we present a deep learning method to predict chromatin loops, called DLoopCaller, by combining accessible chromatin landscapes and raw Hi-C maps. Some available orthogonal data...
Transcription factors (TFs) play a part in gene expression. TFs can form a complex gene expression regulation system by combining with DNA. Thereby, identifying the binding regions has become an indispensable step for understanding the regulatory mechanism of gene expression. Due to the great achievements of applying deep learning (DL) to computer vision and language processing in recent years, many scholars have been inspired to use these methods to predict TF binding sites (TFBSs), achieving extraordinary results. However, these methods mainly focus on whether...
Introduction: CircRNA-protein binding plays a critical role in complex biological activity and disease. Various deep learning-based algorithms have been proposed to identify CircRNA-protein binding sites. These methods predict whether the CircRNA sequence includes protein binding sites at the sequence level, and primarily concentrate on analysing the sequence specificity of CircRNA-protein binding. In terms of model performance, these methods are unsatisfactory at accurately predicting motif sites that have special functions in gene expression. Methods: In this study, based on deep learning models...
Discovery of transcription factor binding sites (TFBSs) is of primary importance for understanding the underlying binding mechanism and the gene regulation process. Growing evidence indicates that, apart from DNA sequences, the DNA shape landscape has a significant influence on binding preference. To effectively model the co-influence of sequence and shape features, we emphasize the position information of motif patterns. In this paper, we propose a novel deep learning-based architecture, named hybridShape eDeepCNN, for TFBS prediction which integrates in...
With advances in microbiomics, the crucial role of microbes in disease progression is increasingly recognized. However, predicting phenotypes using microbiome data remains challenging due to data complexity, heterogeneity, and limited model generalization. Current methods often depend on specific datasets and are vulnerable to adversarial attacks. To address these issues, this paper introduces a novel Noise Perturbation Ensemble Neural Network (NPENN), which combines noise mechanisms with Gradient Boosting...
In graph-level representation learning tasks, graph neural networks have received much attention for their powerful feature learning capabilities. However, with the increasing scale of graph data, how to efficiently process the data and extract key information has become the focus of research. The graph pooling technique, as a step in graph neural networks, simplifies the graph structure by merging nodes or subgraphs, which significantly improves the computational efficiency and feature extraction ability of the networks. Although various methods have been proposed, numerous...
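The idea of simplifying a graph by merging nodes can be sketched as follows. This is a generic sum-pooling scheme with a hard cluster assignment, chosen here for illustration; it is not the method surveyed or proposed in the abstract above. The function `pool_graph` and the toy path graph are hypothetical. Given an assignment matrix S (nodes by clusters), the coarsened adjacency is S^T A S and the coarsened features are S^T X.

```python
import numpy as np

def pool_graph(adj: np.ndarray, feats: np.ndarray, clusters: np.ndarray):
    """Merge nodes that share a cluster id into one super-node (sum pooling)."""
    n = adj.shape[0]
    k = int(clusters.max()) + 1
    # Hard assignment matrix: assign[i, c] = 1 if node i belongs to cluster c.
    assign = np.zeros((n, k))
    assign[np.arange(n), clusters] = 1.0
    # Entry (a, b) sums the adjacency entries between clusters a and b.
    pooled_adj = assign.T @ adj @ assign
    # Super-node features are the sum of member node features.
    pooled_feats = assign.T @ feats
    return pooled_adj, pooled_feats

# Toy example (assumed): a path graph 0-1-2-3 with one scalar feature per node.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
feats = np.array([[1.0], [2.0], [3.0], [4.0]])
clusters = np.array([0, 0, 1, 1])  # merge {0, 1} and {2, 3}

pooled_adj, pooled_feats = pool_graph(adj, feats, clusters)
print(pooled_adj)
print(pooled_feats)
```

The four-node graph becomes a two-node graph, so later layers operate on a smaller adjacency matrix. Diagonal entries of the pooled adjacency count edges inside a cluster (each undirected edge twice), and a practical method would typically learn the assignment rather than fix it by hand.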
Identification of essential proteins plays an important role in understanding cellular life activity and development in the postgenomic era. Identification of essential proteins from protein-protein interaction (PPI) networks has become a hot topic in recent years. In this work, the fruit fly optimization algorithm (FOA) is extended for identifying essential proteins; the resulting method, called EPFOA, merges FOA with topological properties and biological information for essential protein identification. EPFOA has the advantage of identifying multiple essential proteins simultaneously rather than completely relying on...
Transcription factors (TFs) play an important role in regulating gene expression, so identifying the sites bound by them has become a fundamental step in molecular and cellular biology. In this paper, we developed a deep learning framework, called FCNsignal, that leverages existing fully convolutional neural networks (FCNs) to predict TF-DNA binding signals at the base-resolution level. The proposed FCNsignal can simultaneously achieve the following tasks: (i) modeling regions; (ii)...
Imbalances in gut microbes have been implicated in many human diseases, including colorectal cancer (CRC), inflammatory bowel disease, type 2 diabetes, obesity, autism, and Alzheimer's disease. Compared with other diseases, CRC is a gastrointestinal malignancy with high mortality and a high probability of metastasis. However, current studies mainly focus on the prediction of CRC while neglecting the more serious metastatic CRC (mCRC). In addition, high dimensionality and small sample sizes lead to the complexity of microbial data, which increases the difficulty...