Guojun Li

ORCID: 0000-0003-1581-5897
Research Areas
  • Gene expression and cancer classification
  • Genomics and Phylogenetic Studies
  • Advanced Graph Theory Research
  • RNA and protein synthesis mechanisms
  • Bioinformatics and Genomic Networks
  • Machine Learning in Bioinformatics
  • Graph theory and CDMA systems
  • Genomics and Chromatin Dynamics
  • Algorithms and Data Compression
  • Protein Structure and Dynamics
  • Computational Drug Discovery Methods
  • Optimization and Search Problems
  • Graph Labeling and Dimension Problems
  • Interconnection Networks and Systems
  • Scheduling and Optimization Algorithms
  • Complexity and Algorithms in Graphs
  • Limits and Structures in Graph Theory
  • RNA modifications and cancer
  • Gene Regulatory Network Analysis
  • Advanced Optical Network Technologies
  • Genome Rearrangement Algorithms
  • DNA and Biological Computing
  • Enzyme Structure and Function
  • Graph theory and applications
  • Advanced Manufacturing and Logistics Optimization

Chinese Academy of Sciences
2004-2024

Shandong University
2015-2024

Center for Excellence in Molecular Plant Sciences
2024

University of Chinese Academy of Sciences
2024

Liaocheng University
2021-2023

University of North Carolina at Charlotte
2018-2019

State Key Laboratory of Microbial Technology
2018

Arkansas State University
2014-2017

University of Georgia
2005-2014

Hunan Agricultural University
2014

We present a new de novo transcriptome assembler, Bridger, which takes advantage of techniques employed in Cufflinks to overcome limitations of the existing assemblers. When tested on dog, human, and mouse RNA-seq data, Bridger assembled more full-length reference transcripts while reporting considerably fewer candidate transcripts, hence greatly reducing false positive transcripts in comparison with the state-of-the-art assemblers. It runs substantially faster and requires much less memory space than most assemblers. More...

10.1186/s13059-015-0596-2 article EN cc-by Genome Biology 2015-02-10

Biclustering extends the traditional clustering techniques by attempting to find (all) subgroups of genes with similar expression patterns under to-be-identified subsets of experimental conditions when applied to gene expression data. Still, the real power of this clustering strategy is yet to be fully realized due to the lack of effective and efficient algorithms for reliably solving the general biclustering problem. We report a QUalitative BIClustering algorithm (QUBIC) that can solve the biclustering problem in a more general form, compared to existing algorithms,...

10.1093/nar/gkp491 article EN cc-by-nc Nucleic Acids Research 2009-06-09

High-throughput RNA-seq technology has provided an unprecedented opportunity to reveal the very complex structures of transcriptomes. However, it is an important and highly challenging task to assemble the vast amounts of short reads into transcriptomes with alternative splicing isoforms. In this study, we present a novel de novo assembler, BinPacker, by modeling the transcriptome assembly problem as tracking a set of trajectories of items with their sizes representing the coverage of their corresponding isoforms by solving a series...

10.1371/journal.pcbi.1004772 article EN cc-by PLoS Computational Biology 2016-02-19

Raman spectra have been widely used in biology, physics, and chemistry and have become an essential tool for the studies of macromolecules. Nevertheless, the raw Raman signal is often obscured by a broad background curve (or baseline) due to the intrinsic fluorescence of the organic molecules, which leads to unpredictable negative effects in quantitative analysis of Raman spectra. Therefore, it is essential to correct this baseline before analyzing the spectra. Polynomial fitting has proven to be the most convenient and simplest method and has high accuracy. In polynomial fitting,...

10.1366/14-07798 article EN Applied Spectroscopy 2015-07-01

Biclustering algorithms, which aim to provide an effective and efficient way to analyze gene expression data by finding a group of genes with trend-preserving expression patterns under certain conditions, have been widely developed since Morgan et al. pioneered a work about partitioning a data matrix into submatrices with approximately constant values. However, the identification of general trend-preserving biclusters, which are the most meaningful substructures hidden in gene expression data, remains a highly challenging problem. We found an elementary method...

10.1038/srep23466 article EN cc-by Scientific Reports 2016-03-22

Motivation: In this article, we develop a novel edge-based network, i.e. edge-network, to detect early signals of diseases by identifying the corresponding edge-biomarkers with their dynamical network biomarker score from dynamical network biomarkers. Specifically, we derive an edge-network based on the second-order statistics representation of gene expression profiles, which is able to accurately represent the stochastic dynamics of the original biological system (with Gaussian distribution assumption) by combining with the traditional...

10.1093/bioinformatics/btt620 article EN Bioinformatics 2013-10-31

Transcriptome assembly using RNA-seq data - particularly in non-model organisms - has been dramatically improved, but only recently have the pre-assembly procedures, such as sequencing depth and error correction, been studied. Increasing read length is viewed as a crucial condition to further improve transcriptome assembly, but it is unknown whether the read length really matters. In addition, though many assembly tools are available now, it is unclear whether the existing assemblers perform well enough for all data with different transcriptome complexities. In this...

10.1371/journal.pone.0094825 article EN cc-by PLoS ONE 2014-04-15

Identification of a few cancer driver mutation genes from a much larger number of passenger mutation genes in cancer samples remains a highly challenging task. Here, a novel method for distinguishing the driver genes from the passenger genes by effective integration of somatic mutation data and molecular interaction data using a maximal mutational impact function (MaxMIF) is presented. When evaluated on six somatic mutation datasets of Pan-Cancer and 19 datasets of different cancer types from TCGA, MaxMIF almost always significantly outperforms all the existing state-of-the-art methods in terms of predictive accuracy, sensitivity,...

10.1002/advs.201800640 article EN cc-by Advanced Science 2018-07-23

In the conventional analysis of complex diseases, the control and case samples are assumed to be of great purity. However, due to the heterogeneity of disease samples, many disease genes are not always consistently up-/down-regulated, leading to their being under-estimated. This problem will seriously influence effective personalized diagnosis or treatment. The expression variance and covariance can address such a problem in a network manner. But these analyses require multiple samples rather than one sample, which is generally not available in clinical...

10.1186/s12967-015-0546-5 article EN cc-by Journal of Translational Medicine 2015-06-12

Phylogenetic footprinting is an important computational technique for identifying cis-regulatory motifs in orthologous regulatory regions from multiple genomes, as motifs tend to evolve slower than their surrounding non-functional sequences. Its application, however, has several difficulties in optimizing the selection of orthologous data and reducing the false positives in motif prediction. Here we present an integrative phylogenetic footprinting framework for accurate motif predictions in prokaryotic genomes (MP(3)). The framework includes a new orthologous data preparation...

10.1186/s12864-016-2982-x article EN cc-by BMC Genomics 2016-08-09

We present a new algorithm, BOBRO, for prediction of cis-regulatory motifs in a given set of promoter sequences. The algorithm substantially improves the prediction accuracy and extends the scope of applicability of the existing programs based on two key ideas: (i) we developed a highly effective method for reliably assessing the possibility for each position in a given promoter to be the (approximate) start of a conserved sequence motif; and (ii) we developed a highly reliable way for recognition of actual motifs from the accidental ones based on the concept of 'motif closure'. These two key ideas are embedded in a classical...

10.1093/nar/gkq948 article EN cc-by-nc Nucleic Acids Research 2010-12-11

We present an integrated toolkit, BoBro2.0, for prediction and analysis of cis-regulatory motifs. This toolkit can (i) reliably identify statistically significant cis-regulatory motifs at a genome scale; (ii) accurately scan for all motif instances of a query motif in specified genomic regions using a novel method for P-value estimation; (iii) provide highly reliable comparisons and clustering of identified motifs, which takes into consideration the weak signals from the flanking regions of the motifs; and (iv) analyze co-occurring motifs in the regulatory regions. We...

10.1093/bioinformatics/btt397 article EN Bioinformatics 2013-07-10

The circular chromosome of Escherichia coli has been suggested to fold into a collection of sequentially consecutive domains, genes in each of which tend to be co-expressed. It has also been suggested that such domains, forming a partition of the genome, are dynamic with respect to the physiological conditions. However, little is known about which DNA segments of the E. coli genome form these domains and what determines the boundaries of these domain segments. We present a computational model here to partition the circular genome into consecutive segments, theoretically suggestive of the physically folded supercoiled domains, along...

10.1093/nar/gkt261 article EN cc-by-nc Nucleic Acids Research 2013-04-17

The availability of numerous ChIP-seq datasets for transcription factors (TF) has provided an unprecedented opportunity to identify all TF binding sites in genomes. However, the progress has been hindered by the lack of a highly efficient and accurate tool to find not only the target motifs, but also cooperative motifs in very big datasets. We herein present an ultrafast and accurate motif-finding algorithm, ProSampler, based on a novel numeration method and Gibbs sampler. ProSampler runs orders of magnitude faster than the fastest existing...

10.1093/bioinformatics/btz290 article EN Bioinformatics 2019-04-18

Regulons are the basic units of the response system in a bacterial cell and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the bacterial global transcriptional regulation network. In this study, we designed a novel co-regulation score between a pair of operons based on accurate operon identification and cis regulatory motif analyses, which can capture their co-regulation relationship much better than other scores. Taking full advantage of this discovery, we developed a new...

10.1038/srep23030 article EN cc-by Scientific Reports 2016-03-15

10.1016/j.dam.2004.11.004 article EN publisher-specific-oa Discrete Applied Mathematics 2005-01-15

RNA molecules can adopt stable secondary and tertiary structures, which are essential in mediating physical interactions with other partners such as RNA binding proteins (RBPs) and in carrying out their cellular functions. In vivo and in vitro experiments such as RNAcompete and eCLIP have revealed in vitro binding preferences of RBPs to RNA oligomers and in vivo binding sites in cells. Analysis of these binding data showed that the structure properties of the RNAs in these binding sites are important determinants of the binding events; however, it has been a challenge to incorporate the structure information into an interpretable model....

10.1371/journal.pcbi.1010293 article EN cc-by PLoS Computational Biology 2022-07-12

Identifying significant biclusters of genes with specific expression patterns is an effective approach to reveal functionally correlated genes in gene expression data. However, none of the existing algorithms can simultaneously identify both broader and narrower biclusters due to their failure in balancing between effectiveness and efficiency. We introduced ARBic, an algorithm which is capable of accurately identifying biclusters of any shape, including broader, narrower and square, in a large scale gene expression dataset. ARBic was designed by integrating column-based and row-based...

10.1093/nargab/lqad009 article EN cc-by NAR Genomics and Bioinformatics 2023-01-10

Miniature inverted-repeat transposable elements (MITEs) are short DNA transposons with terminal inverted repeat (TIR) signals and have been extensively studied in plants and other eukaryotes. But little is known about them in eubacteria. We identified a novel and recently active MITE, Chunjie, when studying the recent duplication of an operon consisting of ABC transporters and a phosphate uptake regulator in the chromosome of Geobacter uraniireducens Rf4. Chunjie resembles MITEs in many aspects, e.g., having TIR signals and direct...

10.1534/genetics.108.089995 article EN Genetics 2008-07-29

10.1016/j.laa.2013.08.041 article EN publisher-specific-oa Linear Algebra and its Applications 2013-09-18

Pathway enrichment analysis is a useful tool to study biology and biomedicine, due to its functional screening on well-defined biological procedures rather than separate molecules. The measurement of malfunctions of pathways with a phenotype change, e.g., from normal to diseased, is the key issue when applying enrichment analysis to a pathway. The differentially expressed genes (DEGs) are widely focused on in conventional analysis, which is based on the great purity of samples. However, the disease samples are usually heterogeneous, so that the differential...

10.1186/s12864-015-2188-7 article EN cc-by BMC Genomics 2015-11-10