Lucas Czech

ORCID: 0000-0002-1340-9644
Research Areas
  • Genomics and Phylogenetic Studies
  • Microbial Community Ecology and Physiology
  • Protist diversity and phylogeny
  • Environmental DNA in Biodiversity Studies
  • Genetic diversity and population structure
  • Evolution and Paleontology Studies
  • Species Distribution and Climate Change
  • Gene expression and cancer classification
  • Scientific Computing and Data Management
  • Evolution and Genetic Dynamics
  • Plant and animal studies
  • Ecology and Vegetation Dynamics Studies
  • Gut microbiota and health
  • SARS-CoV-2 and COVID-19 Research
  • Fractal and DNA sequence analysis
  • Parasitic Infections and Diagnostics
  • Video Analysis and Summarization
  • Music and Audio Processing
  • Natural Language Processing Techniques
  • Ancient and Medieval Archaeology Studies
  • Speech and dialogue systems
  • Marine and environmental studies
  • Plant Disease Resistance and Genetics
  • Speech Recognition and Synthesis
  • Linguistics and language evolution

Carnegie Department of Plant Biology
2020-2024

Carnegie Institution for Science
2020-2024

University of Copenhagen
2024

Heidelberg Institute for Theoretical Studies
2015-2021

Next generation sequencing (NGS) technologies have led to a ubiquity of molecular sequence data. This data avalanche is particularly challenging in metagenetics, which focuses on taxonomic identification of sequences obtained from diverse microbial environments. Phylogenetic placement methods determine how these sequences fit into an evolutionary context. Previous implementations of phylogenetic placement algorithms, such as the Evolutionary Placement Algorithm (EPA) included in RAxML, or PPLACER, are being increasingly used for this purpose....

10.1093/sysbio/syy054 article EN cc-by-nc Systematic Biology 2018-08-23

We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target trees, placements, sequences, taxonomies, and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested, and field-proven.

10.1093/bioinformatics/btaa070 article EN cc-by-nc Bioinformatics 2020-01-28

Anthropogenic habitat loss and climate change are reducing species' geographic ranges, increasing extinction risk and losses of genetic diversity. Although preserving diversity is key to maintaining adaptability, we lack predictive tools and global estimates across ecosystems. We introduce a mathematical framework that bridges biodiversity theory and population genetics to understand the loss of naturally occurring DNA mutations with decreasing habitat. By analyzing genomic variation of 10,095 georeferenced...

10.1126/science.abn5642 article EN Science 2022-09-22

Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies by example of a data snapshot comprising a quality-filtered subset of 8,736 out of all 16,453 virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer a reliable phylogeny on these data due to the large number of sequences in conjunction with the low number of mutations. We further find that rooting the inferred phylogeny with some degree of confidence...

10.1093/molbev/msaa314 article EN cc-by-nc Molecular Biology and Evolution 2020-12-03

Abstract Some protists with microsporidian-like cell biological characters, including Mitosporidium, Paramicrosporidium, and Nucleophaga, have SSU rRNA gene sequences that are much less divergent than canonical Microsporidia. We analysed the phylogenetic placement and environmental diversity of lineages that group near the base of the fungal radiation and show that they group in a clade with metchnikovellids and microsporidians, to the exclusion of Rozella, in line with what is currently known of their morphology and biology. These results scope...

10.1111/jeu.12519 article EN cc-by Journal of Eukaryotic Microbiology 2018-03-31

High-throughput DNA metabarcoding of amplicon sizes below 500 bp has revolutionized the analysis of environmental microbial diversity. However, these short regions contain limited phylogenetic signal, which makes it impractical to use them in full phylogenetic inferences. This lesser resolution of short amplicons may be overcome by new long-read sequencing technologies. To test this idea, we amplified soil DNA and used PacBio Circular Consensus Sequencing (CCS) to obtain an ~4500-bp region spanning most of the eukaryotic small subunit...

10.1111/1755-0998.13117 article EN Molecular Ecology Resources 2019-11-09

Background: The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut. Novel methods: Here, we present novel and, more importantly,...

10.1371/journal.pone.0217050 article EN cc-by PLoS ONE 2019-05-28

Abstract Motivation: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global thresholds. Here, we present swarm v3 to address issues of contemporary datasets that are growing towards tera-byte sizes. Results: When compared with previous versions, swarm v3 has modernized C++ source code, reduced memory footprint by up to 50%, optimized CPU-usage and multithreading (more than 7 times faster with default...

10.1093/bioinformatics/btab493 article EN cc-by Bioinformatics 2021-07-01

Abstract Summary: Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population statistics such as θ, Tajima’s D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, focused on...

10.1093/bioinformatics/btae508 article EN cc-by Bioinformatics 2024-08-01

Priority effects, where arrival order and initial relative abundance modulate local species interactions, can exert taxonomic, functional, and evolutionary influences on ecological communities by driving them to alternative states. It remains unclear if these wide-ranging consequences of priority effects can be explained systematically by a common underlying factor. Here, we identify such a factor in an empirical system. In a series of field and laboratory studies, we focus on how pH affects nectar-colonizing microbes...

10.7554/elife.79647 article EN cc-by eLife 2022-10-27

Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions...

10.1093/sysbio/syz058 article EN Systematic Biology 2019-08-28

Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies by example of a data snapshot comprising all virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer a reliable phylogeny on these data due to the large number of sequences in conjunction with the low number of mutations. We further find that rooting the inferred phylogeny with some degree of confidence either via the bat and pangolin outgroups or...

10.1101/2020.08.05.239046 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-08-06

In most metagenomic sequencing studies, the initial analysis step consists in assessing the evolutionary provenance of the sequences. Phylogenetic (or Evolutionary) Placement methods can be employed to determine the evolutionary position of sequences with respect to a given reference phylogeny. These placement methods do however face certain limitations: The manual selection of reference sequences is labor-intensive; the computational effort to infer reference phylogenies is substantially larger than for methods that rely on sequence similarity; the number of taxa in the reference phylogeny should be small...

10.1093/bioinformatics/bty767 article EN cc-by Bioinformatics 2018-08-30

Pennycress (Thlaspi arvense) is a promising intermediate oilseed crop, producing oil suitable for conversion to biofuels, including aviation fuels. While domestication efforts are ongoing, a deeper understanding of the genetic architecture of traits is crucial for informing future breeding efforts. Here, we conducted the largest genomic and phenotypic survey of pennycress to date, analyzing 739 accessions collected across four continents. Leveraging whole-genome sequencing and field-collected phenotypes,...

10.1101/2025.03.21.644658 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2025-03-25

We developed grenepipe, an all-in-one Snakemake workflow to streamline the data processing from raw high-throughput sequencing data of individuals or populations to genotype variant calls. Our pipeline offers a range of popular software tools within a single configuration file, automatically installs dependencies, is highly optimized for scalability in cluster environments, and runs with a single command. grenepipe is published under GPLv3 and is freely available at github.com/moiexpositoalonsolab/grenepipe.

10.1093/bioinformatics/btac600 article EN Bioinformatics 2022-09-02

Abstract The change in allele frequencies within a population over time represents a fundamental process of evolution. By monitoring allele frequencies, we can analyze the effects of natural selection and genetic drift on populations. To efficiently track time-resolved change, large experimental or wild populations can be sequenced as pools of individuals sampled using high-throughput genome sequencing (called the Evolve & Resequence approach, E&R). Here, we present a set of experiments using hundreds of genotypes of the model...

10.1101/2022.02.02.477408 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2022-02-04

Abstract Next Generation Sequencing (NGS) technologies have led to a ubiquity of molecular sequence data. This data avalanche is particularly challenging in metagenetics, which focuses on taxonomic identification of sequences obtained from diverse microbial environments. To achieve this, phylogenetic placement methods determine how these sequences fit into an evolutionary context. Previous implementations of phylogenetic placement algorithms, such as the Evolutionary Placement Algorithm (EPA) included in RAxML, or pplacer, are...

10.1101/291658 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-03-29

Summary: We present GENESIS, a library for working with phylogenetic data, and GAPPA, an accompanying command line tool for conducting typical analyses on such data. The tools target trees, placements, sequences, taxonomies, and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested, and field-proven. Availability and Implementation: Both GENESIS and GAPPA are written in modern C++11, and are freely available under GPLv3 at...

10.1101/647958 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-05-24

Abstract Incongruence, or topological conflict, is prevalent in genome-scale data sets but relatively few measures have been developed to quantify it. Internode Certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internode (or internal branch) among a set of phylogenetic trees and complement regular branch support statistics in assessing the confidence of the inferred relationships. Since most phylogenomic studies contain partitions (e.g., genes) with missing taxa and IC...

10.1101/168526 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2017-07-27

Abstract Dinophytes are widely distributed in marine- and fresh-waters, but have yet to be conclusively documented in terrestrial environments. Here, we evaluated the presence of these protists from an environmental DNA metabarcoding dataset of Neotropical rainforest soils. Using a phylogenetic placement approach with a reference alignment and tree, we showed that the numerous sequencing reads that were phylogenetically placed as dinophytes did not correlate with taxonomic assignment, preference, nutritional mode, or...

10.1111/jeu.12833 article EN cc-by-nc Journal of Eukaryotic Microbiology 2020-11-06

Abstract The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut. Here, we present novel and, more importantly, highly...

10.1101/346353 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-06-14