- Genomics and Phylogenetic Studies
- RNA modifications and cancer
- RNA and protein synthesis mechanisms
- Algorithms and Data Compression
- Nanopore and Nanochannel Transport Studies
- Mycorrhizal Fungi and Plant Interactions
- Fungal Biology and Applications
- Semantic Web and Ontologies
- Molecular Biology Techniques and Applications
- Data Management and Algorithms
- Cancer-related molecular mechanisms research
- Genetic Syndromes and Imprinting
- Protist diversity and phylogeny
- Gene expression and cancer classification
- Fibroblast Growth Factor Research
- Web Data Mining and Analysis
- Circular RNAs in diseases
- Cancer Genomics and Diagnostics
- Epigenetics and DNA Methylation
- Lichen and fungal ecology
- Advanced biosensing and bioanalysis techniques
- MicroRNA in disease regulation
- Connective tissue disorders research
Johns Hopkins University
2018-2025
Clark University
2018
RNA sequencing using the latest single-molecule instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new methods to handle the high error rate of long reads and offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of short-read assemblies. StringTie2 is more accurate and faster...
Nanopore signal analysis enables detection of nucleotide modifications from native DNA and RNA sequencing, providing both accurate genetic/transcriptomic and epigenetic information without additional library preparation. Presently, only a limited set of modifications can be directly basecalled (e.g. 5-methylcytosine), while most others require exploratory methods that often begin with alignment of nanopore signal to a reference. We present Uncalled4, a toolkit for nanopore signal alignment, analysis, and visualization. Uncalled4...
Circular RNAs (circRNAs) are a new class of RNA involved in multiple human malignancies. However, limited information exists regarding the involvement of circRNAs in gastric carcinoma (GC). Therefore, we sought to identify novel circRNAs, their functions, and their mechanisms in carcinogenesis. We analyzed next-generation sequencing data from GC tissues and cell lines, identifying 75,201 candidate circRNAs. Among these, we focused on one circRNA, circNF1, which was upregulated in GC tissues and cell lines. Loss- and gain-of-function...
ReadUntil sequencing allows nanopore devices to selectively eject individual reads from the pore in real time. This could enable purely computational targeted sequencing; however, most mapping methods require basecalling, which is computationally intensive. Here we present UNCALLED (github.com/skovaka/UNCALLED), an open-source mapper that rapidly matches streaming nanopore current signals to a reference sequence. UNCALLED probabilistically considers the k-mers that the signal could represent, and then prunes the candidates...
Summary: Improvements in nanopore sequencing necessitate efficient classification methods, including pre-filtering and adaptive sampling algorithms that enrich for reads of interest. Signal-based approaches circumvent the computational bottleneck of basecalling. But past methods for signal-based classification do not scale efficiently to large, repetitive references like pangenomes, limiting their utility to partial references or individual genomes. We introduce Sigmoni: a rapid, multiclass classification method based on the r-index...
Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO), a pangenome...
Nanopore signal analysis enables detection of nucleotide modifications from native DNA and RNA sequencing, providing both accurate genetic or transcriptomic and epigenetic information without additional library preparation. At present, only a limited set of modifications can be directly basecalled (for example, 5-methylcytosine), while most others require exploratory methods that often begin with alignment of nanopore signal to a reference. We present Uncalled4, a toolkit for nanopore signal alignment, analysis and visualization. Uncalled4 features an...
Lentinus tigrinus is a species of wood-decaying fungi (Polyporales) that has an agaricoid form (a gilled mushroom) and a secotioid form (puffball-like, with enclosed spore-bearing structures). Previous studies suggested that the secotioid form is conferred by a recessive allele of a single locus. We sequenced the genomes of one agaricoid (Aga) strain and one secotioid (Sec) strain (39.53-39.88 Mb, 15,581-15,380 genes, respectively). We mated the Sec and Aga monokaryons, genotyped the progeny, and performed bulked segregant analysis (BSA). We also fruited three Sec/Sec and three Aga/Aga dikaryons,...
Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject "nontarget" DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing using efficient pan-genome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics in a streaming fashion. When used to target...
RNA sequencing using the latest single-molecule instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new computational methods to handle the high error rate of long-read technology, which previous assemblers could not tolerate. It also offers the ability to work with full-length super-reads assembled...
Genome copy number is an important source of genetic variation in health and disease. In cancer, Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore devices are limited in terms of retrievable reads/molecules compared to short-read platforms, limiting CNA inference...
Genome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore devices are limited in terms of retrievable reads/molecules compared...
Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO),...
Nanopore sequencing is an increasingly powerful tool for genomics. Recently, computational advances have allowed nanopores to sequence in a targeted fashion; as the sequencer emits data, software can analyze the data in real time and signal the sequencer to eject “non-target” DNA molecules. We present a novel method called SPUMONI, which enables rapid and accurate targeted sequencing with the help of efficient pangenome indexes. SPUMONI uses a compressed index to rapidly generate exact or approximate matching statistics (half-maximal...