- Genomics and Phylogenetic Studies
- Chromosomal and Genetic Variations
- Plant Physiology and Cultivation Studies
- RNA and protein synthesis mechanisms
- RNA Research and Splicing
- Machine Learning in Bioinformatics
- Advances in Cucurbitaceae Research
- Plant Disease Resistance and Genetics
- Bioinformatics and Genomic Networks
- RNA modifications and cancer
- Image Retrieval and Classification Techniques
- Plant Reproductive Biology
- Advanced Image and Video Retrieval Techniques
- Data Visualization and Analytics
- Plant nutrient uptake and metabolism
- Cocoa and Sweet Potato Agronomy
- Fungal and yeast genetics research
- Microbial Metabolic Engineering and Bioproduction
- Congenital heart defects research
- Computational Drug Discovery Methods
- Genetic diversity and population structure
- Hair Growth and Disorders
- Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
- Plant Gene Expression Analysis
- Lipid metabolism and biosynthesis
Beijing Normal University
2013-2024
State Key Laboratory of Earth Surface Processes and Resource Ecology
2020-2023
Zhangjun Fei and colleagues report the draft genome of a Chinese elite watermelon inbred line 97103 and the resequencing of 20 diverse accessions that represent three subspecies of Citrullus lanatus. Comparative genome-wide analyses identify the extent of genetic diversity and population structure of watermelon germplasm. Watermelon, Citrullus lanatus, is an important cucurbit crop grown throughout the world. Here we report a high-quality draft genome sequence of the east Asia watermelon cultivar 97103 (2n = 2× = 22) containing 23,440 predicted protein-coding genes. Comparative genomics analysis...
Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and the absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is based on studying only a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual...
Accurate identification of orthologous genomic regions (OGRs) between two closely related genomes is crucial for the reliable detection of changes, which range from small-scale changes (e.g., single nucleotide or small nucleotides) to large-scale structural changes. Although diverse OGRs inferred at different levels have been successfully applied to address various biological questions, a limited number of studies have simultaneously integrated OGRs at different levels. Here, we report on a new approach to construct...
Background: Explicit comparisons based on the semantic similarity of Gene Ontology terms provide a quantitative way to measure the functional similarity between gene products and are widely applied in large-scale genomic research via integration with other models. Previously, we presented an edge-based method, Relative Specificity Similarity (RSS), which takes the global position of relevant terms into account. However, edge-based metrics are sensitive to the intrinsic structure of GO and simply consider terms at the same level in the ontology to be equally specific...
Bats can perceive the world by using a wide range of sensory systems, and some of the systems have become highly specialized, such as auditory perception. Among bat species, the Old World leaf-nosed bats and horseshoe bats (rhinolophoid bats) possess the most sophisticated echolocation systems. Here, we reported the whole-genome sequencing and de novo assemblies of two rhinolophoid bats: the great leaf-nosed bat (Hipposideros armiger) and the Chinese rufous horseshoe bat (Rhinolophus sinicus). Comparative genomic analyses revealed the adaptation of auditory perception in...
Although hybridization plays a large role in speciation, some unknown fraction of hybrid individuals never reproduces, instead remaining as genetic dead-ends. We investigated a morphologically distinct and culturally important Chinese walnut, Juglans hopeiensis, suspected to have arisen from hybridization of Persian walnut (J. regia) with Asian butternuts (J. cathayensis, J. mandshurica, and hybrids between J. cathayensis and J. mandshurica). Based on 151 whole-genome sequences of the relevant taxa, we discovered that all...
The identification, description and understanding of protein-protein networks are important in cell biology and medicine, especially for the study of systems biology, where the focus concerns the interaction of biomolecules. Hubs and bottlenecks refer to important proteins of a protein interaction network. Until now, very little attention has been paid to differentiating these two groups. By integrating human genome-wide variations across populations, we described the differences between hubs and bottlenecks in this study. Our findings showed that, similar to interspecies,...
Alternative splicing (AS) is an important mechanism of posttranscriptional modification and dynamically regulates multiple physiological processes in plants, including fruit ripening. However, little is known about alternative splicing during the development of fleshy fruits. We studied alternative splicing at the immature and ripe stages of cucumber, melon, papaya and peach. We found that 14.96-17.48% of multiexon genes exhibited alternative splicing. Intron retention was not always the most frequent event, indicating that the alternative splicing pattern during different developmental process...
Background: Autopolyploidy is a valuable model for studying whole-genome duplication (WGD) without hybridization, yet little is known about the genomic structural and functional changes that occur in autopolyploids after WGD. Cyclocarya paliurus (Juglandaceae) is a natural diploid–autotetraploid species. We generated an allele-aware autotetraploid genome, a chimeric chromosome-level diploid genome, and whole-genome resequencing data for 106 individuals at an average depth of 60× per individual, along with 12 individuals at an average depth of 90× per individual....
Alternative splicing (AS) is an important post-transcriptional process. It has been suggested that most AS events are subject to tissue-specific regulation. However, the global dynamics of AS in different tissues are poorly explored. To analyse global changes in AS in multiple tissues, we identified AS events and constructed a comprehensive catalogue of AS events within each tissue based on genome-wide RNA-seq reads from ten tissues in cucumber. First, we found that 58% of multi-exon genes underwent AS. We further obtained 565 genes with significantly more...
The mechanisms underlying the organization and evolution of the telencephalic pallium are not yet clear. To address this issue, we first performed a comparative analysis of genes critical for the development of the pallium (Emx1/2 and Pax6) and subpallium (Dlx2 and Nkx1/2) among 500 vertebrate species. We found that these genes have no obvious variations in chromosomal duplication/loss, gene locus synteny or Darwinian selection. However, there is an additional fragment of approximately 20 amino acids in mammalian Emx1 and a poly-(Ala)...
Alternative splicing (AS) plays a critical regulatory role in modulating transcriptome and proteome diversity. In particular, it increases the functional diversity of proteins. Recent genome-wide analysis of AS using RNA-Seq has revealed that AS is highly pervasive in plants. Furthermore, it has been suggested that most AS events are subject to tissue-specific regulation. To reveal the functional characteristics induced by AS and tissue-specific AS events, a database for exploring these events is needed, especially in plants. To address these goals, we constructed and annotated...
Polyploidy is ubiquitous and its consequences are complex and variable. A change of ploidy level generally influences genetic diversity and results in morphological, physiological and ecological differences between cells or organisms with different ploidy levels. To avoid cumbersome experiments and take advantage of the less biased information provided by the vast amounts of genome sequencing data, computational tools for ploidy estimation are urgently needed. Until now, although a few such tools have been developed, many aspects of this...
Interacting proteins can contact each other at three different levels: by a domain binding to another domain, by a domain binding to a short protein motif, or by a motif binding to another motif. In our previous work, we proposed an approach to predict motif–motif binding sites for the yeast interactome by contrasting high-quality positive interactions and high-quality non-interactions using a simple statistical analysis. Here, we extend this idea to more comprehensively infer binding sites, including domain–domain, domain–motif, and motif–motif interactions. In this study, we integrated 2854 that...
Domain-domain interactions are a critical type of the mechanisms mediating protein-protein interactions (PPIs). For a given protein domain, its ability to combine with distinct domains is usually referred to as promiscuity or versatility. Interestingly, a previous study has reported that a domain's promiscuity may reflect its ability to interact with other domains in human proteins. In this work, promiscuous domains were first identified from the yeast genome. Then, we sought to determine what roles promiscuous domains might play in the PPI network. Mapping the promiscuous domains onto the proteins in this network revealed that,...
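As a rough illustration of the promiscuity notion in the abstract above, the number of distinct partner domains can be counted from a list of domain-domain interaction pairs. This is a minimal sketch under assumptions: the helper name `domain_promiscuity` and the toy domain names are hypothetical and not taken from the paper, whose actual definition and thresholds may differ.

```python
from collections import defaultdict


def domain_promiscuity(pairs):
    """Count distinct interaction partners per domain (hypothetical helper).

    `pairs` is an iterable of (domain_a, domain_b) tuples; interactions are
    treated as undirected, and repeated pairs are counted once.
    """
    partners = defaultdict(set)
    for a, b in pairs:
        partners[a].add(b)
        partners[b].add(a)
    return {domain: len(p) for domain, p in partners.items()}


# Toy, made-up interaction list used only to exercise the function.
toy_pairs = [("SH3", "Pkinase"), ("SH3", "PH"), ("SH3", "SH2"), ("PH", "Pkinase")]
scores = domain_promiscuity(toy_pairs)
```

In this toy input, `SH3` has the most distinct partners, so a simple cutoff on `scores` would flag it as the promiscuous domain; where to place that cutoff is a design choice the abstract does not specify.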
Alternative splicing is crucial for a wide range of biological processes. However, limited by the availability of reference genomes, genome-wide patterns of alternative splicing remain unknown in most nonmodel organisms. We present an attention-based convolutional neural network model, DeepASmRNA, for predicting alternative splicing events using only transcriptomic data. DeepASmRNA consists of two parts: identification of alternatively spliced transcripts and classification of alternative splicing events, which outperformed the state-of-the-art method, AStrap,...
Anthocyanin is the main pigment forming floral diversity. Several transcription factors that regulate the expression of anthocyanin biosynthetic genes belong to the R2R3-MYB family. Here we examined the transcriptomes of inflorescence buds of Scutellaria species (skullcaps), identified the R2R3-MYBs, and detected the genetic signatures of positive selection for adaptive divergence across the rapidly evolving skullcaps. In the inflorescence buds, seven R2R3-MYBs were identified. MYB11 and MYB16 were detected to be positively selected. The signature of positive selection on MYB...
Alternative splicing (AS) is an essential post-transcriptional mechanism that regulates many biological processes. However, identifying comprehensive types of AS events without guidance from a reference genome is still a challenge. Here, we proposed a novel method, MkcDBGAS, to identify all seven types of AS events using the transcriptome alone, without a reference genome. MkcDBGAS, modeled by full-length transcripts of human and Arabidopsis thaliana, consists of three modules. In the first module, for the first time, MkcDBGAS uses a colored de Bruijn graph with dynamic-...
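For readers unfamiliar with the data structure named in the abstract above, a colored de Bruijn graph can be sketched as a map from k-mer edges to the set of source sequences ("colors") that contain them. The code below is a generic textbook-style illustration, not the MkcDBGAS implementation; the function name `colored_de_bruijn`, the fixed `k`, and the toy transcripts are all assumptions made for the example.

```python
from collections import defaultdict


def colored_de_bruijn(transcripts, k):
    """Build a colored de Bruijn graph from named sequences (illustrative only).

    Each k-mer contributes one edge between its (k-1)-mer prefix and suffix.
    The value stored per edge is the set of transcript names (colors) in
    which that k-mer occurs.
    """
    edges = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])].add(name)
    return edges


# Two made-up transcripts that share a prefix and suffix but differ in the middle.
toy = {"t1": "ACGTAC", "t2": "ACGGAC"}
graph = colored_de_bruijn(toy, k=3)

# Edges carrying more than one color are shared between transcripts;
# single-color edges mark where the sequences diverge.
shared = {edge for edge, colors in graph.items() if len(colors) > 1}
```

Divergence between colors along otherwise shared paths is the kind of signal that reference-free methods can inspect when looking for splicing differences, although how MkcDBGAS does this is not stated in the truncated abstract.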
Background: Although the cucumber reference genome and its annotation were published several years ago, the functional annotation of predicted genes, particularly protein-coding genes, still requires further improvement. In general, accurately determining orthologous relationships between genes allows for better and more robust functional assignments of predicted genes. As one of the most reliable strategies, the determination of collinearity information may facilitate reliable orthology inferences among genes from multiple related genomes. Currently, the identification...
Gene ontology annotation is known to be a very complicated multilabel classification task, and the hierarchical multilabel classification (HMC) approaches with local classifiers have been shown to be effective for the task. In the traditional HMC method, a set of hierarchically organized simple classifiers are usually used, each of which is used for one level separately. In this paper, we propose a novel classifier implementing the whole HMC in a deep convolution neural network (CNN) model with multiple heads and multiple ends (MHME). The proposed MHME CNN consists of three parts: the body...
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within a given protein. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts:...
Since large‐scale protein sequence data is available, applying deep neural networks to mine better features from protein sequences becomes possible. Eukaryotic protein subcellular localization prediction, which makes a contribution in many biology processes, has used automatic predicting methods. Moreover, gene ontology (GO) annotation has been shown to be helpful in improving the prediction accuracy of subcellular localization. However, experimentally annotated proteins are not always available. On the other hand, GO annotation is available for certain species...
Escherichia coli lab strains K-12 GM4792 Lac+ and GM4792 Lac- carry opposite lactose markers, which are useful for distinguishing evolved lines as they produce different colored colonies. The two closely related strains are chosen as the ancestors for our ongoing studies of experimental evolution. Here, we describe the genome sequences, annotation, and features of GM4792 Lac+ and GM4792 Lac-. GM4792 Lac+ has a 4,622,342-bp long chromosome with 4,061 protein-coding genes and 83 RNA genes. Similarly, the genome of GM4792 Lac- consists of a 4,621,656-bp chromosome containing 4,043 protein-coding genes and 74 RNA genes. Genome comparison...