- Gene expression and cancer classification
- Genomics and Phylogenetic Studies
- Advanced Graph Theory Research
- RNA and protein synthesis mechanisms
- Bioinformatics and Genomic Networks
- Machine Learning in Bioinformatics
- Graph theory and CDMA systems
- Genomics and Chromatin Dynamics
- Algorithms and Data Compression
- Protein Structure and Dynamics
- Computational Drug Discovery Methods
- Optimization and Search Problems
- Graph Labeling and Dimension Problems
- Interconnection Networks and Systems
- Scheduling and Optimization Algorithms
- Complexity and Algorithms in Graphs
- Limits and Structures in Graph Theory
- RNA modifications and cancer
- Gene Regulatory Network Analysis
- Advanced Optical Network Technologies
- Genome Rearrangement Algorithms
- DNA and Biological Computing
- Enzyme Structure and Function
- Graph theory and applications
- Advanced Manufacturing and Logistics Optimization
Chinese Academy of Sciences
2004-2024
Shandong University
2015-2024
Center for Excellence in Molecular Plant Sciences
2024
University of Chinese Academy of Sciences
2024
Liaocheng University
2021-2023
University of North Carolina at Charlotte
2018-2019
State Key Laboratory of Microbial Technology
2018
Arkansas State University
2014-2017
University of Georgia
2005-2014
Hunan Agricultural University
2014
We present a new de novo transcriptome assembler, Bridger, which takes advantage of techniques employed in Cufflinks to overcome limitations of the existing assemblers. When tested on dog, human, and mouse RNA-seq data, Bridger assembled more full-length reference transcripts while reporting considerably fewer candidate transcripts, hence greatly reducing false positive transcripts in comparison with the state-of-the-art assemblers. It runs substantially faster and requires much less memory space than most assemblers. More...
Biclustering extends the traditional clustering techniques by attempting to find (all) subgroups of genes with similar expression patterns under to-be-identified subsets of experimental conditions when applied to gene expression data. Still, the real power of this strategy is yet to be fully realized due to the lack of effective and efficient algorithms for reliably solving the general biclustering problem. We report a QUalitative BIClustering algorithm (QUBIC) that can solve the biclustering problem in a more general form, compared to existing algorithms,...
High-throughput RNA-seq technology has provided an unprecedented opportunity to reveal the very complex structures of transcriptomes. However, it is an important and highly challenging task to assemble vast amounts of short reads into transcriptomes with alternative splicing isoforms. In this study, we present a novel de novo assembler, BinPacker, by modeling the transcriptome assembly problem as tracking a set of trajectories of items, with their sizes representing the coverage of the corresponding isoforms, by solving a series...
Raman spectra have been widely used in biology, physics, and chemistry and have become an essential tool for the studies of macromolecules. Nevertheless, the raw Raman signal is often obscured by a broad background curve (or baseline) due to the intrinsic fluorescence of the organic molecules, which leads to unpredictable negative effects in the quantitative analysis of Raman spectra. Therefore, it is essential to correct this baseline before analyzing the spectra. Polynomial fitting has proven to be the most convenient and simplest method, with high accuracy. In polynomial fitting,...
Biclustering algorithms, which aim to provide an effective and efficient way to analyze gene expression data by finding a group of genes with trend-preserving expression patterns under certain conditions, have been widely developed since Morgan et al. pioneered work on partitioning a data matrix into submatrices with approximately constant values. However, the identification of general biclusters, which are the most meaningful substructures hidden in gene expression data, remains a highly challenging problem. We found an elementary method...
Motivation: In this article, we develop a novel edge-based network, i.e. edge-network, to detect early signals of diseases by identifying the corresponding edge-biomarkers with their dynamical biomarker score from biomarkers. Specifically, we derive an edge-network based on the second-order statistics representation of gene expression profiles, which is able to accurately represent the stochastic dynamics of the original biological system (with Gaussian distribution assumption) by combining traditional...
Transcriptome assembly using RNA-seq data, particularly in non-model organisms, has been dramatically improved, but only recently have the pre-assembly procedures, such as sequencing depth and error correction, been studied. Increasing read length is viewed as a crucial condition to further improve transcriptome assembly, but it is unknown whether the read length really matters. In addition, though many assembly tools are available now, it is unclear whether the existing assemblers perform well enough for all data with different complexities. In this...
Identification of a few cancer driver mutation genes from a much larger number of passenger mutation genes in cancer samples remains a highly challenging task. Here, a novel method for distinguishing the driver genes from the passenger genes by effective integration of somatic mutation data and molecular interaction data using a maximal mutational impact function (MaxMIF) is presented. When evaluated on six datasets of Pan-Cancer and 19 datasets of different cancer types from TCGA, MaxMIF almost always significantly outperforms all the existing state-of-the-art methods in terms of predictive accuracy, sensitivity,...
In the conventional analysis of complex diseases, the control and case samples are assumed to be of great purity. However, due to the heterogeneity of disease samples, many disease genes are not always consistently up-/down-regulated, leading them to be under-estimated. This problem will seriously influence effective personalized diagnosis or treatment. The expression variance and expression covariance can address such a problem in a network manner. But these analyses require multiple samples rather than one sample, which are generally not available in clinical...
Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new preparation...
We present a new algorithm, BOBRO, for prediction of cis-regulatory motifs in a given set of promoter sequences. The algorithm substantially improves the prediction accuracy and extends the scope of applicability of the existing programs based on two key ideas: (i) we developed a highly effective method for reliably assessing the possibility for each position in a given promoter to be the (approximate) start of a conserved sequence motif; and (ii) we developed a highly reliable way for recognition of actual motifs from the accidental ones based on the concept of 'motif closure'. These two key ideas are embedded in a classical...
We present an integrated toolkit, BoBro2.0, for prediction and analysis of cis-regulatory motifs. This toolkit can (i) reliably identify statistically significant cis-regulatory motifs at a genome scale; (ii) accurately scan for all motif instances of a query motif in specified genomic regions using a novel method for P-value estimation; (iii) provide highly reliable comparisons and clustering of identified motifs, which takes into consideration the weak signals from the flanking regions of the motifs; and (iv) analyze co-occurring motifs in the regulatory regions. We...
The circular chromosome of Escherichia coli has been suggested to fold into a collection of sequentially consecutive domains, genes in each of which tend to be co-expressed. It has also been suggested that such domains, forming a partition of the genome, are dynamic with respect to the physiological conditions. However, little is known about which DNA segments of the E. coli genome form these domains and what determines the boundaries of these domain segments. We present a computational model here to partition the genome into segments, theoretically suggestive of the physically folded supercoiled domains, along...
The availability of numerous ChIP-seq datasets for transcription factors (TF) has provided an unprecedented opportunity to identify all TF binding sites in genomes. However, the progress has been hindered by the lack of a highly efficient and accurate tool to find not only the target motifs, but also cooperative motifs in very big datasets. We herein present an ultrafast motif-finding algorithm, ProSampler, based on a novel numeration method and Gibbs sampler. ProSampler runs orders of magnitude faster than the fastest existing...
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new...
RNA molecules can adopt stable secondary and tertiary structures, which are essential in mediating physical interactions with other partners such as RNA binding proteins (RBPs) and in carrying out their cellular functions. In vivo and in vitro experiments such as RNAcompete and eCLIP have revealed the binding preferences of RBPs to RNA oligomers and their binding sites in cells. Analysis of these data showed that the structure properties of the RNAs are important determinants of the binding events; however, it has been a challenge to incorporate the structure information into an interpretable model....
Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, none of the existing algorithms can simultaneously identify both broader and narrower biclusters due to their failure in balancing between effectiveness and efficiency. We introduced ARBic, an algorithm which is capable of accurately identifying significant biclusters of any shape, including broader, narrower and square, in a large scale gene expression dataset. ARBic was designed by integrating column-based and row-based...
Miniature inverted-repeat transposable elements (MITEs) are short DNA transposons with terminal inverted repeat (TIR) signals and have been extensively studied in plants and other eukaryotes. But little is known about them in eubacteria. We identified a novel and recently active MITE, Chunjie, when studying the recent duplication of an operon consisting of ABC transporters and a phosphate uptake regulator in the chromosome of Geobacter uraniireducens Rf4. Chunjie resembles MITEs in many aspects, e.g., having TIR and direct...
Pathway enrichment analysis is a useful tool to study biology and biomedicine, due to its functional screening on well-defined biological procedures rather than separate molecules. The measurement of malfunctions of pathways with a phenotype change, e.g., from normal to diseased, is the key issue when applying pathway enrichment analysis. The differentially expressed genes (DEGs) are widely focused on in conventional analysis, which is based on the great purity of samples. However, the disease samples are usually heterogeneous, so that the differential...