- RNA modifications and cancer
- Genomics and Phylogenetic Studies
- Epigenetics and DNA Methylation
- Genetic Syndromes and Imprinting
- Microbial Community Ecology and Physiology
- Single-cell and spatial transcriptomics
- Genetic Neurodegenerative Diseases
- Bacteriophages and microbial interactions
- SARS-CoV-2 and COVID-19 Research
- Mitochondrial Function and Pathology
- Environmental DNA in Biodiversity Studies
- Chromosomal and Genetic Variations
- Plant Molecular Biology Research
- RNA and protein synthesis mechanisms
- Vibrio bacteria research studies
- Cancer-related gene regulation
- SARS-CoV-2 detection and testing
- Poxvirus research and outbreaks
- Ethics in Clinical Research
- Cancer-related molecular mechanisms research
- Genomics and Chromatin Dynamics
- Protist diversity and phylogeny
- Animal Virus Infections Studies
- Genetics, Aging, and Longevity in Model Organisms
- Mosquito-borne diseases and control
Chinese Academy of Sciences: 2012-2025
Beijing Institute of Genomics: 2019-2025
Institute of Genetics and Developmental Biology: 2013-2024
Shanghai Center For Bioinformation Technology: 2020-2023
Henan Normal University: 2023
Chinese National Human Genome Center: 2022
University of Iowa: 2018
State Key Laboratory of Plant Genomics: 2013-2017
University of Chinese Academy of Sciences: 2013-2017
Small RNAs (smRNAs) in plants, mainly microRNAs and small interfering RNAs, play important roles in both transcriptional and post-transcriptional gene regulation. The broad application of high-throughput sequencing technology has made the routine generation of bulk smRNA sequences in laboratories possible, and has thus significantly increased the need for batch analysis tools. PsRobot is a web-based, easy-to-use tool dedicated to the identification of smRNAs with stem-loop shaped precursors (such as short hairpin RNAs) and their...
The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the explosive accumulation of multi-omics data generated at an unprecedented rate, CNCB-NGDC constantly expands and updates its core database resources through big data archiving, integrative analysis and value-added curation. In the past year, efforts have been devoted to integrating multiple omics data, synthesizing the growing knowledge, developing new resources and upgrading a set...
Organismal aging is driven by interconnected molecular changes encompassing internal and extracellular factors. Combinational analysis of high-throughput 'multi-omics' datasets (gathering information from genomics, epigenomics, transcriptomics, proteomics, metabolomics and pharmacogenomics), at either populational or single-cell levels, can provide a multi-dimensional, integrated profile of the heterogeneous aging process with unprecedented throughput and detail. These new strategies allow for the exploration...
The Genome Warehouse (GWH) is a public repository housing genome assembly data for a wide range of species and delivering a series of web services for data submission, storage, release, and sharing. As one of the core resources in the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB; https://ngdc.cncb.ac.cn), GWH accepts both full and partial (chloroplast, mitochondrion, and plasmid) genome sequences with different assembly levels, as well as updates of existing assemblies. For each assembly, GWH collects detailed genome-related...
On January 22, 2020, the China National Center for Bioinformation (CNCB) released the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access information resource for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality evaluated by an automated in-house pipeline. Of particular note, 2019nCoVR offers systematic...
Epigenome-Wide Association Study (EWAS) has become a standard strategy to discover DNA methylation variation of different phenotypes. Since 2018, we have developed EWAS Atlas and EWAS Data Hub to integrate a growing volume of EWAS knowledge and data, respectively. Here, we present EWAS Open Platform (https://ngdc.cncb.ac.cn/ewas), which includes EWAS Atlas, EWAS Data Hub and the newly developed EWAS Toolkit. In the current implementation, EWAS Open Platform integrates 617 018 high-quality EWAS associations from 910 publications, covering 51 phenotypes, 275 diseases, 104 environmental...
Epigenome-Wide Association Study (EWAS) has become an effective strategy to explore the epigenetic basis of complex traits. Over the past decade, a large amount of data, especially data sourced from DNA methylation arrays, has been accumulated as a result of numerous EWAS projects. We present EWAS Data Hub (https://bigd.big.ac.cn/ewas/datahub), a resource for collecting and normalizing DNA methylation array data as well as archiving associated metadata. The current release integrates a comprehensive collection of 75 344 samples...
The Genome Warehouse (GWH), accessible at https://ngdc.cncb.ac.cn/gwh, is an extensively utilized public repository dedicated to the deposition, management and sharing of genome assembly sequences, annotations, and metadata. This paper highlights noteworthy enhancements to the GWH since the 2021 version, emphasizing substantial advancements in web interfaces for data submission, database functionality updates, and resource integration. Key updates include the reannotation of released prokaryotic genomes, mirroring...
Motivation: In the past years, long read (LR) sequencing technologies, such as Pacific Biosciences and Oxford Nanopore Technologies, have been demonstrated to substantially improve the quality of genome assembly and transcriptome characterization. Compared to the high cost of genome assembly by LR sequencing, it is more affordable to generate LRs for transcriptome characterization. That is, when informative transcriptome LR data are available without a high-quality genome, a method for de novo transcriptome assembly and annotation is in high demand. Results: Without a reference genome, IDP-denovo performs assembly,...
Single-cell bisulfite sequencing methods are widely used to assess epigenomic heterogeneity in cell states. Over the past few years, large amounts of data have been generated and have facilitated a deeper understanding of the epigenetic regulation of many key biological processes, including early embryonic development, cell differentiation and tumor progression. There is an urgent need to build a functional resource platform for this massive amount of data. Here, we present scMethBank, the first open-access and comprehensive database...
Monkeypox is a viral zoonotic disease endemic in Central and West Africa. Since January 1, 2022, 3413 laboratory-confirmed monkeypox cases and one death have been reported from 50 countries/territories in five WHO regions (as of June 22, 2022; https://www.who.int/emergencies/disease-outbreak-news/item/2022-DON396), including 1310 new cases and eight new countries in the past week. Genomic epidemiology is vital to determine the similarity between viruses and to suggest possible links between cases, origins of infection, and transmission dynamics when...
The Illumina HumanMethylation BeadChip is one of the most cost-effective methods to quantify DNA methylation levels at single-base resolution across the human genome, which makes it a routine platform for epigenome-wide association studies. Tens of thousands of DNA methylation array samples have accumulated in public databases, providing great support for data integration and further analysis. However, the majority of these data are deposited as processed data without the background probes that are widely used in data normalization. Here, we present Gaussian...
Aeromonas veronii, a ubiquitous zoonotic pathogen, depends on adhesion as the crucial way to colonize the gastrointestinal tract of humans and animals, which further causes severe diseases and parenteral infections. However, the adherence mechanism of A. veronii has not been fully characterized. Therefore, we investigate the effect of autoinducer-2 (AI-2) on adhesion through facilitating the expression of mannose-sensitive hemagglutinin (MSHA) type IV pili genes mediated by cyclic diguanosine monophosphate...
Protists, a highly diverse group of microscopic eukaryotic organisms distinct from fungi, animals and plants, exert crucial roles within the earth's biosphere. However, the genomes of only a small fraction of known protist species have been published and made publicly accessible. To address this constraint, the Protist 10 000 Genomes Project (P10K) was initiated, implementing a specialized pipeline for single-cell genome/transcriptome assembly, decontamination and annotation of protists. The resultant P10K...
DNA methylation, as the most intensively studied epigenetic mark, regulates gene expression in numerous biological processes including development, aging, and disease. With the rapid accumulation of whole-genome bisulfite sequencing data, integrating, archiving, analyzing, and visualizing those data becomes critical. Since its first publication in 2015, MethBank has been continuously updated to include more methylomes across more diverse species. Here, we present MethBank 4.0...
Perennial woody plants hold vital ecological significance, distinguished by their unique traits. While significant progress has been made in their genomic and functional studies, a major challenge persists: the absence of a comprehensive reference platform for the collection, integration and in-depth analysis of the vast amount of data. Here, we present PPGR (Resource for Perennial Plant Genomes and Regulation; https://ngdc.cncb.ac.cn/ppgr/) to address this critical gap by collecting, integrating, analyzing and visualizing genomic,...
Summary: Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. The number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of reads on genes, expression abundance, as well as some other functions are also supported. The versatile search functions enable...
On 22 January 2020, the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), created the 2019 Novel Coronavirus Resource (2019nCoVR), an open-access SARS-CoV-2 information resource. 2019nCoVR features a comprehensive integration of sequence and clinical information for all publicly available SARS-CoV-2 isolates, which are manually curated with value-added annotations and quality evaluated by our in-house automated pipeline. Of particular note, 2019nCoVR performs systematic analyses to generate a dynamic...
The Genome Warehouse (GWH) is a public repository housing genome assembly data for a wide range of species and delivering a series of web services for data submission, storage, release, and sharing. As one of the core resources in the National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB, https://bigd.big.ac.cn/), GWH accepts both full and partial (chloroplast, mitochondrion, and plasmid) genome sequences with different assembly levels, as well as updates of existing assemblies. For each assembly, GWH collects detailed...
Expansion of tandem repeats in genes often causes severe neuromuscular diseases, such as fragile X syndrome, Huntington's disease, and spinocerebellar ataxia. However, information on genes associated with repeat expansion diseases is scattered throughout the literature, and systematic prediction of potential genes that may cause diseases via repeat expansion is also lacking. Here, we develop DRED, a Database of genes related to Repeat Expansion Diseases, a manually curated database that covers all 61 known genes reported in PubMed and OMIM, along with detailed information for each...
The Illumina HumanMethylation BeadChip is one of the most cost-effective ways to quantify DNA methylation levels at the single-base level across the human genome, which makes it a routine platform for epigenome-wide association studies. Tens of thousands of DNA methylation array samples have accumulated in public databases, providing great support for data integration and further analysis. However, the majority of these data are deposited as processed data without the background probes that are widely used in data normalization. Here we present Gaussian...
Due to the explosion of cancer genome data and the urgent need for cancer treatment, it is becoming increasingly important and necessary to analyze and annotate cancer genomes easily and in a timely manner. However, tumor heterogeneity is recognized as a serious barrier to annotating cancer genomes at the individual patient level. In addition, the interpretation and analysis of cancer multi-omics data rely heavily on existing database resources that are often located in different data centers or research institutions, which poses a huge challenge for data parsing. Here we present CCAS (Cancer...