- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- Chromosomal and Genetic Variations
- Protist diversity and phylogeny
- Genomics and Chromatin Dynamics
- Photosynthetic Processes and Mechanisms
- Bioinformatics and Genomic Networks
- Single-cell and spatial transcriptomics
- Cell Image Analysis Techniques
- Gene Regulatory Network Analysis
- Plant Molecular Biology Research
- Biomedical Text Mining and Ontologies
- Genetic diversity and population structure
- Soybean genetics and cultivation
- CRISPR and Genetic Engineering
- Evolution and Genetic Dynamics
Max Planck Institute for Biology
2023-2025
Pangenome graphs can represent all variation between multiple reference genomes, but current approaches to build them exclude complex sequences or are based upon a single reference. In response, we developed the PanGenome Graph Builder (PGGB), a pipeline for constructing pangenome graphs without bias or exclusion. PGGB uses all-to-all alignments to build a variation graph in which we can identify variation, measure conservation, detect recombination events, and infer phylogenetic relationships.
Our view of genetic polymorphism is shaped by methods that provide a limited and reference-biased picture. Long-read sequencing technologies, which are starting to provide nearly complete genome sequences for population samples, should solve the problem, except that characterizing and making sense of non-SNP variation is difficult even with perfect sequence data. Here, we analyze 27 genomes of Arabidopsis thaliana in an attempt to address these issues, and illustrate what can be learned by analyzing whole-genome data in an unbiased...
Plant cells have two major organelles with their own genomes: chloroplasts and mitochondria. While chloroplast genomes tend to be structurally conserved, the mitochondrial genomes of plants, which are much larger than those of animals, are characterized by complex structural variation. We introduce TIPPo, a user-friendly, reference-free assembly tool that uses PacBio high-fidelity long-read data and that does not rely on genomes from related species or on nuclear genome information for the assembly of organellar genomes. TIPPo...
Chloroplasts and mitochondria are the primary sites for photosynthesis and respiration, each harboring its own unique genome. Although organellar genomes are considerably smaller compared to the nuclear genome, they are nonetheless essential for the survival of the organism. A common feature of many chloroplast and mitochondrial genomes is the presence of large repeated sequences longer than 1 kb. These can be either in inverted or direct orientation, and recombination between them leads to structural heteroplasmy. To understand intraspecific...
Plant cells have two major organelles with their own genomes: chloroplasts and mitochondria. While chloroplast genomes tend to be structurally conserved, the mitochondrial genomes of plants, which are much larger than those of animals, are characterized by complex structural variation. We introduce TIPP_plastid, a user-friendly, reference-free assembly tool that uses PacBio high-fidelity (HiFi) long-read data and that does not rely on genomes from related species or on nuclear genome information for the assembly of organellar genomes....
Motivation: As genome graphs are powerful data structures for representing the genetic diversity within populations, they can help identify genomic variations that traditional linear references miss, but their complexity and size make the analysis of genome graphs challenging. We sought to develop a genome graph analysis tool that helps these analyses become more accessible by addressing the limitations of existing tools. Specifically, we improve scalability and user-friendliness, and we provide many new statistics for graph evaluation. Results:...
Variation graphs offer a superior representation of genomic diversity compared to traditional linear reference genomes, capturing complex features that are otherwise inaccessible to analysis. It seems self-evident that integrating these graphs with genome-wide association studies (GWAS) should enable a more comprehensive understanding of genetic landscapes, potentially uncovering novel associations between variations and traits. This approach takes full advantage of the rich information, thereby providing deeper...
Arabidopsis thaliana was the first plant for which a high-quality genome sequence became available. The publication of the reference genome almost 25 years ago was already accompanied by genome-wide data on polymorphisms in another accession, or naturally occurring strain. Since then, inventories of diversity have been generated at increasingly precise levels. High-density genotype data for A. thaliana, including those from the 1001 Genomes Project, were key to demonstrating the enormous power of GWAS in inbred populations of wild...
Motivation: As genome graphs are powerful data structures for representing the genetic diversity within populations, they can help identify genomic variations that traditional linear references miss, but their complexity and size make the analysis of genome graphs challenging. We sought to develop a genome graph analysis tool that helps these analyses become more accessible by addressing the limitations of existing tools. Specifically, we improve scalability and user-friendliness, and we provide many new statistics tailored to variation...