NFDI4DS | UHH-SEMS - Publication Details

Sam Kovaka

ORCID: 0000-0002-4835-8023

About

Contact & Profiles

A5090348465

Research Areas

Genomics and Phylogenetic Studies
RNA modifications and cancer
RNA and protein synthesis mechanisms
Algorithms and Data Compression
Nanopore and Nanochannel Transport Studies
Mycorrhizal Fungi and Plant Interactions
Fungal Biology and Applications
Semantic Web and Ontologies
Molecular Biology Techniques and Applications
Data Management and Algorithms
Cancer-related molecular mechanisms research
Genetic Syndromes and Imprinting
Protist diversity and phylogeny
Gene expression and cancer classification
Fibroblast Growth Factor Research
Web Data Mining and Analysis
Circular RNAs in diseases
Cancer Genomics and Diagnostics
Epigenetics and DNA Methylation
Lichen and fungal ecology
Advanced biosensing and bioanalysis techniques
MicroRNA in disease regulation
Connective tissue disorders research

Johns Hopkins University
2018-2025

Clark University
2018

Transcriptome assembly from long-read RNA-seq alignments with StringTie2

OPENALEX - Publications

Sam Kovaka Aleksey V. Zimin Geo Pertea Roham Razaghi Steven L. Salzberg and 1 more

RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new methods to handle the high error rate of long reads and offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of short-read assemblies. StringTie2 is more accurate and faster...

10.1186/s13059-019-1910-1 article EN cc-by Genome biology 2019-12-01

Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED

OPENALEX - Publications

Sam Kovaka Yunfan Fan Bohan Ni Winston Timp Michael C. Schatz

10.1038/s41587-020-0731-9 article EN Nature Biotechnology 2020-11-30

Approaching complete genomes, transcriptomes and epi-omes with accurate long-read sequencing

OPENALEX - Publications

Sam Kovaka Shujun Ou Katharine M. Jenike Michael C. Schatz

10.1038/s41592-022-01716-8 article EN Nature Methods 2023-01-01

A common flanking variant is associated with enhanced stability of the FGF14-SCA27B repeat locus

OPENALEX - Publications

David Pellerin Giulia Gobbo Madeline Couse Egor Dolzhenko Sathiji Nageshwaran and 95 more

10.1038/s41588-024-01808-5 article EN Nature Genetics 2024-06-27

Uncalled4 improves nanopore DNA and RNA modification detection via fast and accurate signal alignment

OPENALEX - Publications

Sam Kovaka Paul W. Hook Katharine M. Jenike Vikram S. Shivakumar Luke B Morina and 3 more

Nanopore signal analysis enables detection of nucleotide modifications from native DNA and RNA sequencing, providing both accurate genetic/transcriptomic and epigenetic information without additional library preparation. Presently, only a limited set of modifications can be directly basecalled (e.g. 5-methylcytosine), while most others require exploratory methods that often begin with alignment of nanopore signal to a nucleotide reference. We present Uncalled4, a toolkit for nanopore signal alignment, analysis, and visualization. Uncalled4...

10.1101/2024.03.05.583511 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-03-11

Novel circular RNA circNF1 acts as a molecular sponge, promoting gastric cancer by absorbing miR-16

OPENALEX - Publications

Zhe Wang Ke Ma Stephanie Pitts Yulan Cheng Xi Liu and 7 more

Circular RNAs (circRNAs) are a new class of RNA involved in multiple human malignancies. However, limited information exists regarding the involvement of circRNAs in gastric carcinoma (GC). Therefore, we sought to identify novel circRNAs, their functions and mechanisms in gastric carcinogenesis. We analyzed next-generation sequencing data from GC tissues and cell lines, identifying 75,201 candidate circRNAs. Among these, we focused on one circRNA, circNF1, which was upregulated in GC tissues and cell lines. Loss- and gain-of-function...

10.1530/erc-18-0478 article EN Endocrine Related Cancer 2018-12-21

Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED

OPENALEX - Publications

Sam Kovaka Yunfan Fan Bohan Ni Winston Timp Michael C. Schatz

ReadUntil sequencing allows nanopore devices to selectively eject individual reads from the pore in real-time. This could enable purely computational targeted sequencing, however most mapping methods require basecalling, which is computationally intensive. Here we present UNCALLED (github.com/skovaka/UNCALLED), an open-source mapper that rapidly matches streaming nanopore current signals to a reference sequence. UNCALLED probabilistically considers the k-mers that the signal could represent, and then prunes the candidates...

10.1101/2020.02.03.931923 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-02-03

Sigmoni: classification of nanopore signal with a compressed pangenome index

OPENALEX - Publications

Vikram S. Shivakumar Omar Ahmed Sam Kovaka Mohsen Zakeri Ben Langmead

Summary: Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index...

10.1093/bioinformatics/btae213 article EN cc-by Bioinformatics 2024-04-11

MEM-based pangenome indexing for k-mer queries

OPENALEX - Publications

Stephen Hwang Nathaniel K. Brown Omar Ahmed Katharine M. Jenike Sam Kovaka and 2 more

Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO), a pangenome...

10.1186/s13015-025-00272-y article EN cc-by Algorithms for Molecular Biology 2025-03-01

Uncalled4 improves nanopore DNA and RNA modification detection via fast and accurate signal alignment

OPENALEX - Publications

Sam Kovaka Paul W. Hook Katharine M. Jenike Vikram S. Shivakumar Luke B Morina and 3 more

Nanopore signal analysis enables detection of nucleotide modifications from native DNA and RNA sequencing, providing both accurate genetic or transcriptomic and epigenetic information without additional library preparation. At present, only a limited set of modifications can be directly basecalled (for example, 5-methylcytosine), while most others require exploratory methods that often begin with alignment of nanopore signal to a nucleotide reference. We present Uncalled4, a toolkit for nanopore signal alignment, analysis and visualization. Uncalled4 features an...

10.1038/s41592-025-02631-4 article EN cc-by Nature Methods 2025-03-28

Genomics and Development of Lentinus tigrinus: A White-Rot Wood-Decaying Mushroom with Dimorphic Fruiting Bodies

OPENALEX - Publications

Baojun Wu Zhangyi Xu Alicia Knudson Alexis Carlson Naiyao Chen and 12 more

Lentinus tigrinus is a species of wood-decaying fungi (Polyporales) that has an agaricoid form (a gilled mushroom) and a secotioid form (puffball-like, with enclosed spore-bearing structures). Previous studies suggested that the secotioid form is conferred by a recessive allele of a single locus. We sequenced the genomes of one agaricoid (Aga) strain and one secotioid (Sec) strain (39.53-39.88 Mb, 15,581-15,380 genes, respectively). We mated the Sec and Aga monokaryons, genotyped the progeny, and performed bulked segregant analysis (BSA). We also fruited three Sec/Sec and three Aga/Aga dikaryons,...

10.1093/gbe/evy246 article EN cc-by-nc Genome Biology and Evolution 2018-11-03

Pan-genomic matching statistics for targeted nanopore sequencing

OPENALEX - Publications

Omar Ahmed Massimiliano Rossi Sam Kovaka Michael C. Schatz Travis Gagie and 2 more

Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject "nontarget" DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing using efficient pan-genome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics in a streaming fashion. When used to target...

10.1016/j.isci.2021.102696 article EN cc-by iScience 2021-06-01

Transcriptome assembly from long-read RNA-seq alignments with StringTie2

OPENALEX - Publications

Sam Kovaka Aleksey V. Zimin Geo Pertea Roham Razaghi Steven L. Salzberg and 1 more

RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new computational methods to handle the high error rate of long-read sequencing technology, which previous assemblers could not tolerate. It also offers the ability to work on full-length super-reads assembled...

10.1101/694554 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2019-07-08

High resolution copy number inference in cancer using short-molecule nanopore sequencing

OPENALEX - Publications

Timour Baslan Sam Kovaka Fritz J. Sedlazeck Yanming Zhang Robert Wappel and 4 more

Genome copy number is an important source of genetic variation in health and disease. In cancer, Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms, limiting CNA inference...

10.1093/nar/gkab812 article EN cc-by Nucleic Acids Research 2021-09-09

Sigmoni: classification of nanopore signal with a compressed pangenome index

OPENALEX - Publications

Vikram S. Shivakumar Omar Ahmed Sam Kovaka Mohsen Zakeri Ben Langmead

Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on

10.1101/2023.08.15.553308 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-08-17

High resolution copy number inference in cancer using short-molecule nanopore sequencing

OPENALEX - Publications

Timour Baslan Sam Kovaka Fritz J. Sedlazeck Yanming Zhang Robert Wappel and 3 more

Genome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in terms of the number of retrievable sequencing reads/molecules compared...

10.1101/2020.12.28.424602 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-12-29

MEM-based pangenome indexing for k-mer queries

OPENALEX - Publications

Stephen Hwang Nathaniel K. Brown Omar Ahmed Katharine M. Jenike Sam Kovaka and 2 more

Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes

10.1101/2024.05.20.595044 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-05-22

MEM-based pangenome indexing for k-mer queries

OPENALEX - Publications

Stephen Hwang Nathaniel K. Brown Omar Ahmed Katharine M. Jenike Sam Kovaka and 2 more

Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO),...

10.21203/rs.3.rs-5363291/v1 preprint EN cc-by Research Square (Research Square) 2024-11-13

Pan-genomic Matching Statistics for Targeted Nanopore Sequencing

OPENALEX - Publications

Omar Ahmed Massimiliano Rossi Sam Kovaka Michael C. Schatz Travis Gagie and 2 more

Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject “non-target” DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing with the help of efficient pangenome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics (half-maximal...

10.1101/2021.03.23.436610 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-03-23
