NFDI4DS | UHH-SEMS - Publication Details

Lucas Czech

ORCID: 0000-0002-1340-9644

About

Contact & Profiles

A5005887668

Research Areas

Genomics and Phylogenetic Studies
Microbial Community Ecology and Physiology
Protist diversity and phylogeny
Environmental DNA in Biodiversity Studies
Genetic diversity and population structure
Evolution and Paleontology Studies
Species Distribution and Climate Change
Gene expression and cancer classification
Scientific Computing and Data Management
Evolution and Genetic Dynamics
Plant and animal studies
Ecology and Vegetation Dynamics Studies
Gut microbiota and health
SARS-CoV-2 and COVID-19 Research
Fractal and DNA sequence analysis
Parasitic Infections and Diagnostics
Video Analysis and Summarization
Music and Audio Processing
Natural Language Processing Techniques
Ancient and Medieval Archaeology Studies
Speech and dialogue systems
Marine and environmental studies
Plant Disease Resistance and Genetics
Speech Recognition and Synthesis
Linguistics and language evolution

Carnegie Department of Plant Biology
2020-2024

Carnegie Institution for Science
2020-2024

University of Copenhagen
2024

Heidelberg Institute for Theoretical Studies
2015-2021

EPA-ng: Massively Parallel Evolutionary Placement of Genetic Sequences

OPENALEX - Publications

Pierre Barbera Alexey M. Kozlov Lucas Czech Benoît Morel Diego Darriba and 2 more

Next generation sequencing (NGS) technologies have led to a ubiquity of molecular sequence data. This data avalanche is particularly challenging in metagenetics, which focuses on the taxonomic identification of sequences obtained from diverse microbial environments. Phylogenetic placement methods determine how these sequences fit into an evolutionary context. Previous implementations of phylogenetic placement algorithms, such as the Evolutionary Placement Algorithm (EPA) included in RAxML, or PPLACER, are being increasingly used for this purpose....

10.1093/sysbio/syy054 article EN cc-by-nc Systematic Biology 2018-08-23

Parasites dominate hyperdiverse soil protist communities in Neotropical rainforests

Frédéric Mahé Colomban de Vargas David Bass Lucas Czech Alexandros Stamatakis and 17 more

10.1038/s41559-017-0091 article EN Nature Ecology & Evolution 2017-03-20

Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data

Lucas Czech Pierre Barbera Alexandros Stamatakis

We present genesis, a library for working with phylogenetic data, and gappa, an accompanying command-line tool for conducting typical analyses on such data. The tools target phylogenetic trees and placements, sequences, taxonomies and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested and field-proven.

10.1093/bioinformatics/btaa070 article EN cc-by-nc Bioinformatics 2020-01-28

Genetic diversity loss in the Anthropocene

Moisés Expósito‐Alonso Tom R. Booker Lucas Czech Lauren Gillespie Shannon Hateley and 10 more

Anthropogenic habitat loss and climate change are reducing species' geographic ranges, increasing extinction risk and losses of genetic diversity. Although preserving genetic diversity is key to maintaining adaptability, we lack predictive tools and global estimates across ecosystems. We introduce a mathematical framework that bridges biodiversity theory and population genetics to understand the loss of naturally occurring DNA mutations with decreasing habitat. By analyzing genomic variation of 10,095 georeferenced...

10.1126/science.abn5642 article EN Science 2022-09-22

Phylogenetic Analysis of SARS-CoV-2 Data Is Difficult

Benoît Morel Pierre Barbera Lucas Czech Ben Bettisworth Lukas Hübner and 8 more

Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies by example of a data snapshot comprising a quality-filtered subset of 8,736 out of all 16,453 virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer reliable phylogenies on these data due to the large number of sequences in conjunction with the low number of mutations. We further find that rooting the inferred phylogeny with some degree of confidence...

10.1093/molbev/msaa314 article EN cc-by-nc Molecular Biology and Evolution 2020-12-03

Clarifying the Relationships between Microsporidia and Cryptomycota

David Bass Lucas Czech Ben Williams Cédric Berney Micah Dunthorn and 4 more

Some protists with microsporidian‐like cell biological characters, including Mitosporidium, Paramicrosporidium and Nucleophaga, have SSU rRNA gene sequences that are much less divergent than canonical Microsporidia. We analysed the phylogenetic placement and environmental diversity of lineages that group near the base of the fungal radiation and show that they group in a clade with metchnikovellids and microsporidians, to the exclusion of Rozella, in line with what is currently known of their morphology and biology. These results scope...

10.1111/jeu.12519 article EN cc-by Journal of Eukaryotic Microbiology 2018-03-31

Long‐read metabarcoding of the eukaryotic rDNA operon to phylogenetically and taxonomically resolve environmental diversity

Mahwash Jamy Rachel Foster Pierre Barbera Lucas Czech Alexey M. Kozlov and 5 more

High-throughput DNA metabarcoding of amplicon sizes below 500 bp has revolutionized the analysis of environmental microbial diversity. However, these short regions contain limited phylogenetic signal, which makes it impractical to use them in full phylogenetic inferences. This lesser resolution of short amplicons may be overcome by new long-read sequencing technologies. To test this idea, we amplified soil DNA and used PacBio Circular Consensus Sequencing (CCS) to obtain an ~4500-bp region spanning most of the eukaryotic small subunit...

10.1111/1755-0998.13117 article EN Molecular Ecology Resources 2019-11-09

Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples

Lucas Czech Alexandros Stamatakis

Background: The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut. Novel methods: Here, we present novel and, more importantly,...

10.1371/journal.pone.0217050 article EN cc-by PLoS ONE 2019-05-28

Swarm v3: towards tera-scale amplicon clustering

Frédéric Mahé Lucas Czech Alexandros Stamatakis Christopher Quince Colomban de Vargas and 2 more

Motivation: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global thresholds. Here, we present swarm v3 to address issues of contemporary datasets growing towards tera-byte sizes. Results: When compared with previous versions, swarm v3 has modernized C++ source code, reduced memory footprint by up to 50%, optimized CPU-usage and multithreading (more than 7 times faster with default...

10.1093/bioinformatics/btab493 article EN cc-by Bioinformatics 2021-07-01

grenedalf: Population genetic statistics for the next generation of Pool sequencing

Lucas Czech Jeffrey P. Spence Moisés Expósito‐Alonso

Summary: Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population genetic statistics such as θ, Tajima’s D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, and is focused on...

10.1093/bioinformatics/btae508 article EN cc-by Bioinformatics 2024-08-01

Wide-ranging consequences of priority effects governed by an overarching factor

Callie R. Chappell Manpreet K. Dhami Mark C. Bitter Lucas Czech Sur Herrera Paredes and 10 more

Priority effects, where arrival order and initial relative abundance modulate local species interactions, can exert taxonomic, functional, and evolutionary influences on ecological communities by driving them to alternative states. It remains unclear if these wide-ranging consequences of priority effects can be explained systematically by a common underlying factor. Here, we identify such a factor in an empirical system. In a series of field and laboratory studies, we focus on how pH affects nectar-colonizing microbes...

10.7554/elife.79647 article EN cc-by eLife 2022-10-27

Quartet-Based Computations of Internode Certainty Provide Robust Measures of Phylogenetic Incongruence

Xiaofan Zhou Sarah Lutteropp Lucas Czech Alexandros Stamatakis Moritz von Looz and 1 more

Incongruence, or topological conflict, is prevalent in genome-scale data sets. Internode certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internal branch among a set of phylogenetic trees and complement regular branch support measures (e.g., bootstrap, posterior probability) that instead assess the statistical confidence of inference. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC scores stem from the frequencies of bipartitions...

10.1093/sysbio/syz058 article EN Systematic Biology 2019-08-28

Phylogenetic analysis of SARS-CoV-2 data is difficult

Benoît Morel Pierre Barbera Lucas Czech Ben Bettisworth Lukas Hübner and 8 more

Numerous studies covering some aspects of SARS-CoV-2 data analyses are being published on a daily basis, including a regularly updated phylogeny on nextstrain.org. Here, we review the difficulties of inferring reliable phylogenies by example of a data snapshot comprising all virus sequences available on May 5, 2020 from gisaid.org. We find that it is difficult to infer reliable phylogenies on these data due to the large number of sequences in conjunction with the low number of mutations. We further find that rooting the inferred phylogeny with some degree of confidence either via the bat and pangolin outgroups or...

10.1101/2020.08.05.239046 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-08-06

Methods for automatic reference trees and multilevel phylogenetic placement

Lucas Czech Pierre Barbera Alexandros Stamatakis

In most metagenomic sequencing studies, the initial analysis step consists in assessing the evolutionary provenance of the sequences. Phylogenetic (or Evolutionary) Placement methods can be employed to determine the evolutionary position of sequences with respect to a given reference phylogeny. These placement methods do however face certain limitations: the manual selection of reference sequences is labor-intensive; the computational effort to infer reference phylogenies is substantially larger than for methods that rely on sequence similarity; the number of taxa in the reference phylogeny should be small...

10.1093/bioinformatics/bty767 article EN cc-by Bioinformatics 2018-08-30

Population and adaptation history of 739 Thlaspi arvense natural accessions

Xing Wu Ruth Epstein Maliheh Esfahanian Barsanti Gautam Marcus Griffiths and 55 more

Pennycress (Thlaspi arvense) is a promising intermediate oilseed crop, producing oil suitable for conversion to biofuels, including aviation fuels. While domestication efforts are ongoing, a deeper understanding of the genetic architecture of traits is crucial for informing future breeding efforts. Here, we conducted the largest genomic and phenotypic survey of pennycress to date, analyzing 739 accessions collected across four continents. Leveraging whole-genome sequencing and field-collected phenotypes,...

10.1101/2025.03.21.644658 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2025-03-25

grenepipe: a flexible, scalable and reproducible pipeline to automate variant calling from sequence reads

Lucas Czech Moisés Expósito‐Alonso

We developed grenepipe, an all-in-one Snakemake workflow to streamline the data processing from raw high-throughput sequencing data of individuals or populations to genotype variant calls. Our pipeline offers a range of popular software tools within a single configuration file, automatically installs dependencies, is highly optimized for scalability in cluster environments and runs with a single command. grenepipe is published under GPLv3 and is freely available at github.com/moiexpositoalonsolab/grenepipe.

10.1093/bioinformatics/btac600 article EN Bioinformatics 2022-09-02

Monitoring rapid evolution of plant populations at scale with Pool-Sequencing

Lucas Czech Yunru Peng Jeffrey P. Spence Patricia L. M. Lang Tatiana Bellagio and 8 more

The change in allele frequencies within a population over time represents a fundamental process of evolution. By monitoring allele frequencies, we can analyze the effects of natural selection and genetic drift on populations. To efficiently track time-resolved change, large experimental or wild populations can be sequenced as pools of individuals sampled using high-throughput genome sequencing (called the Evolve & Resequence approach, E&R). Here, we present a set of experiments with hundreds of genotypes of the model...

10.1101/2022.02.02.477408 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2022-02-04

EPA-ng: Massively Parallel Evolutionary Placement of Genetic Sequences

Pierre Barbera Alexey M. Kozlov Lucas Czech Benoît Morel Diego Darriba and 2 more

Next Generation Sequencing (NGS) technologies have led to a ubiquity of molecular sequence data. This data avalanche is particularly challenging in metagenetics, which focuses on the taxonomic identification of sequences obtained from diverse microbial environments. To achieve this, phylogenetic placement methods determine how these sequences fit into an evolutionary context. Previous implementations of phylogenetic placement algorithms, such as the Evolutionary Placement Algorithm (EPA) included in RAxML, or pplacer, are...

10.1101/291658 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-03-29

Genesis and Gappa: Processing, Analyzing and Visualizing Phylogenetic (Placement) Data

Lucas Czech Pierre Barbera Alexandros Stamatakis

Summary: We present GENESIS, a library for working with phylogenetic data, and GAPPA, an accompanying command line tool for conducting typical analyses on such data. The tools target phylogenetic trees and placements, sequences, taxonomies, and other relevant data types, offer high-level simplicity as well as low-level customizability, and are computationally efficient, well-tested, and field-proven. Availability and Implementation: Both GENESIS and GAPPA are written in modern C++11, and are freely available under GPLv3 at...

10.1101/647958 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-05-24

Quartet-based computations of internode certainty provide accurate and robust measures of phylogenetic incongruence

Xiaofan Zhou Sarah Lutteropp Lucas Czech Alexandros Stamatakis Moritz von Looz and 1 more

Incongruence, or topological conflict, is prevalent in genome-scale data sets but relatively few measures have been developed to quantify it. Internode Certainty (IC) and related measures were recently introduced to explicitly quantify the level of incongruence of a given internode (or internal branch) among a set of phylogenetic trees and complement regular branch support statistics in assessing the confidence of the inferred relationships. Since most phylogenomic studies contain data partitions (e.g., genes) with missing taxa and IC...

10.1101/168526 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2017-07-27

The Windblown: Possible Explanations for Dinophyte DNA in Forest Soils

Marc Gottschling Lucas Czech Frédéric Mahé Sina M. Adl Micah Dunthorn

Dinophytes are widely distributed in marine‐ and fresh‐waters, but have yet to be conclusively documented in terrestrial environments. Here, we evaluated the presence of these protists from an environmental DNA metabarcoding dataset of Neotropical rainforest soils. Using a phylogenetic placement approach with a reference alignment and tree, we showed that numerous sequencing reads were phylogenetically placed as dinophytes that did not correlate with taxonomic assignment, environmental preference, nutritional mode, or...

10.1111/jeu.12833 article EN cc-by-nc Journal of Eukaryotic Microbiology 2020-11-06

Scalable methods for analyzing and visualizing phylogenetic placement of metagenomic samples

Lucas Czech Alexandros Stamatakis

The exponential decrease in molecular sequencing cost generates unprecedented amounts of data. Hence, scalable methods to analyze these data are required. Phylogenetic (or Evolutionary) Placement methods identify the evolutionary provenance of anonymous sequences with respect to a given reference phylogeny. This increasingly popular method is deployed for scrutinizing metagenomic samples from environments such as water, soil, or the human gut. Here, we present novel and, more importantly, highly...

10.1101/346353 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-06-14
