- Genomics and Phylogenetic Studies
- Algorithms and Data Compression
- RNA and protein synthesis mechanisms
- Machine Learning in Bioinformatics
- SARS-CoV-2 and COVID-19 Research
- SARS-CoV-2 detection and testing
- Chromosomal and Genetic Variations
- Genome Rearrangement Algorithms
- Yeasts and Rust Fungi Studies
- Fungal and yeast genetics research
- Nanopore and Nanochannel Transport Studies
- Genomics and Chromatin Dynamics
- Bacteriophages and microbial interactions
- RNA Research and Splicing
- Fermentation and Sensory Analysis
- DNA and Biological Computing
- RNA modifications and cancer
- Animal Virus Infections Studies
- Gene expression and cancer classification
- Natural Language Processing Techniques
- Molecular Biology Techniques and Applications
- Advanced biosensing and bioanalysis techniques
- Fractal and DNA sequence analysis
- Blind Source Separation Techniques
- COVID-19 Clinical Research Studies
Comenius University Bratislava: 2015-2024
Simon Fraser University: 2022
University of California, Irvine: 2012
Université de Montréal: 2011
Cornell University: 2007-2009
University of Waterloo: 2001-2007
New York University: 2007
The genome of the Southeast Asian great ape, the orang-utan, has been sequenced: specifically, a draft assembly of a Sumatran female individual and short-read sequence data from five further Sumatran and five Bornean orang-utans, Pongo abelii and Pongo pygmaeus, respectively. Orang-utan species appear to have split around 400,000 years ago, more recently than most previous estimates suggested, resulting in an average Bornean–Sumatran nucleotide identity of 99.68%. Structural evolution of the orang-utan genome seems to have proceeded much more slowly than that of other great apes,...
Cystic echinococcosis (hydatid disease), caused by the tapeworm E. granulosus, is responsible for considerable human morbidity and mortality. This cosmopolitan disease is difficult to diagnose, treat and control. We present a draft genomic sequence for the worm comprising 151.6 Mb encoding 11,325 genes. Comparisons with the genome sequences from other taxa show that E. granulosus has acquired a spectrum of genes, including the EgAgB family, whose products are secreted by the parasite to interact with and redirect host immune responses....
We report the whole-genome sequence of the common marmoset (Callithrix jacchus). The 2.26-Gb genome of a female marmoset was assembled using Sanger read data (6×) and a whole-genome shotgun strategy. A first analysis has permitted comparison with the genomes of apes and Old World monkeys and the identification of specific features that might contribute to the unique biology of this diminutive primate, including genetic changes that may influence body size, frequent twinning and chimerism. We observed positive selection in growth hormone/insulin-like growth factor genes...
The MinION device by Oxford Nanopore produces very long reads (reads over 100 kbp were reported); however, it suffers from a high sequencing error rate. We present an open-source DNA base caller based on deep recurrent neural networks and show that the accuracy of base calling depends strongly on the underlying software and can be improved by considering modern machine learning methods. By employing carefully crafted recurrent neural networks, our tool significantly improves base calling accuracy on data from the R7.3 version of the platform compared to the default...
Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and for at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more...
Mitochondrial genome diversity in closely related species provides an excellent platform for investigation of chromosome architecture and its evolution by means of comparative genomics. In this study, we determined the complete mitochondrial DNA sequences of eight Candida species and analyzed their molecular architectures. Our survey revealed a puzzling variability of genome architecture, including circular- and linear-mapping forms and multipartite linear forms. We propose that the arrangement of large inverted repeats identified in these...
Significance: During translation, ribosomes decode mRNAs in a sequential fashion. In this paper, we report the discovery of more than 80 translational bypassing elements (byps) 27–55 nt long in mitochondrial protein-coding regions of the yeast Magnusiomyces capitatus. We demonstrate experimentally that byps are retained in mRNA but not translated into protein. Byps somewhat resemble the single bypass element in bacteriophage T4 but also display unique features. We further discovered byp-like sequences in other yeast species,...
Background: The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusc with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea–dwelling species will allow several pending evolutionary questions to be unlocked. Findings: We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long reads, and 108 Gb of Chicago libraries, with a final assembly size...
Motivation: MinION is a portable nanopore sequencing device that can be easily operated in the field, with features including monitoring of run progress and selective sequencing. To fully exploit these features, real-time base calling is required. To date, this has only been achieved at the cost of high computing requirements that pose limitations in terms of hardware availability in common laptops and energy consumption. Results: We developed a new base caller, DeepNano-coral, for nanopore sequencing, which is optimized to run on the Coral...
SARS-CoV-2 mutants carrying the ∆H69/∆V70 deletion in the amino-terminal domain of the Spike protein emerged independently in at least six lineages of the virus (namely, B.1.1.7, B.1.1.298, B.1.160, B.1.177, B.1.258, B.1.375). We analyzed SARS-CoV-2 samples collected from various regions of Slovakia between November and December 2020 that were presumed to contain the B.1.1.7 variant due to drop-out of the Spike gene target in an RT-qPCR test caused by this deletion. Sequencing of these samples revealed that although in some cases the samples were indeed confirmed as B.1.1.7, a substantial...
Background: The fungus Marssonina brunnea is a causal pathogen of Marssonina leaf spot that devastates poplar plantations by defoliating susceptible trees before normal fall leaf drop. Results: We sequence the genome of M. brunnea with a size of 52 Mb assembled into 89 scaffolds, representing the first sequenced Dermateaceae genome. By inoculating this fungus onto a poplar hybrid clone, we investigate how M. brunnea interacts and co-evolves with its host to colonize poplar leaves. While a handful of virulence genes in M. brunnea, mostly from the LysM family, are detected...
The emergence of a novel SARS-CoV-2 B.1.1.7 variant sparked global alarm due to increased transmissibility, mortality, and uncertainty about vaccine efficacy, thus accelerating efforts to detect and track the variant. Current approaches to detect B.1.1.7 include sequencing and RT-qPCR tests containing a target assay that fails or results in reduced sensitivity towards the B.1.1.7 variant. Since many countries lack genomic surveillance programs and failed assays detect unrelated variants containing similar mutations as B.1.1.7, we used allele-specific PCR,...
Pangenomes are becoming increasingly popular data structures for genomics analyses due to their ability to compactly represent the genetic diversity within populations. Constructing a pangenome graph, however, is still a time-consuming and expensive process. A promising approach for pangenome construction consists of progressively augmenting a pangenome graph with additional high-quality assemblies. Currently, there is no approach to augment a pangenome graph using unassembled reads from newly sequenced samples that does not require one to align them and genotype...
Short-read genome assemblies typically consist of many contigs of variable lengths and their putative connections represented as an assembly graph. Assembly graphs produced by different tools from the same data may differ significantly, posing a challenge for downstream processing tasks. One such task is plasmid binning, that is, identifying plasmids in sequenced bacterial isolates, which is crucial for monitoring the spread of antimicrobial resistance. When plasmid binning tools are applied to assembly graphs produced by different tools, they exhibit...
Optimal spaced seeds were developed as a method to increase the sensitivity of local alignment programs similar to BLASTN. Such seeds have been used before in the program PatternHunter, and have given improved sensitivity and running time relative to BLASTN in genome–genome comparison. We study the problem of computing optimal spaced seeds for detecting homologous coding regions in unannotated genomic sequences. By using well-chosen seeds, we are able to improve the sensitivity of coding sequence alignment over that of TBLASTX, while keeping runtime comparable to BLASTN. We identify good seeds by first giving...
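As a rough illustration of the spaced-seed idea in the abstract above, the following Python sketch reports positions where two sequences agree at every "1" position of a seed pattern, ignoring the "0" positions. The function name `seed_hits` and the toy pattern `1101` are assumptions chosen for illustration; they are not the optimized seeds or the implementation from the paper or from PatternHunter.

```python
def seed_hits(a, b, seed):
    """Return (i, j) pairs where a[i:] and b[j:] agree at every '1' of the seed.

    The seed is a string of '1' (must match) and '0' (don't care) characters.
    This is a toy hash-based lookup, not an optimized implementation.
    """
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)

    # Index every window of `a` by the letters at the "care" positions.
    index = {}
    for i in range(len(a) - span + 1):
        key = "".join(a[i + k] for k in care)
        index.setdefault(key, []).append(i)

    # Look up every window of `b` in that index.
    hits = []
    for j in range(len(b) - span + 1):
        key = "".join(b[j + k] for k in care)
        for i in index.get(key, []):
            hits.append((i, j))
    return hits


# The two sequences differ only at index 2.
print(seed_hits("ACGTA", "ACTTA", "111"))   # contiguous seed of weight 3
print(seed_hits("ACGTA", "ACTTA", "1101"))  # spaced seed of weight 3
```

In this toy case the contiguous seed finds no hit because every window of length 3 covers the mismatch, while the spaced seed of the same weight tolerates it at its don't-care position.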
Short tandem repeats (STRs) are regions of a genome containing many consecutive copies of the same short motif, possibly with small variations. Analysis of STRs has many clinical uses but is limited by technology, mainly due to STRs surpassing the used read length. Nanopore sequencing, as one of the long-read sequencing technologies, produces very long reads, thus offering more possibilities to study and analyze STRs. Basecalling of nanopore reads is, however, particularly unreliable in repeating regions, and therefore direct...
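To make the definition of an STR above concrete, here is a minimal Python sketch that finds exact tandem repeats with a regular expression. It is a simplification: it ignores the "small variations" the abstract mentions, and the function name `find_exact_strs` and its default thresholds are hypothetical choices, not part of the described work.

```python
import re


def find_exact_strs(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for exact tandem repeats in seq.

    Only perfect copies of the motif are detected; real STR analysis
    must also tolerate small variations between copies.
    """
    # A lazily matched motif of min_unit..max_unit bases, followed by at
    # least (min_copies - 1) further exact copies via a backreference.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    results = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        results.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return results


print(find_exact_strs("TTCAGCAGCAGCAGTT"))  # one CAG repeat with 4 copies
```

The lazy quantifier makes the regular expression try the shortest motif first, so a run such as CAGCAGCAGCAG is reported with unit CAG rather than CAGCAG.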
Motivation: We present ExonHunter, a new and comprehensive gene finding system that outperforms existing systems and features several new ideas and approaches. Our system combines numerous sources of information (genomic sequences, expressed sequence tags and protein databases of related species) into a gene finder based on a hidden Markov model in a novel and systematic way. In our framework, various sources of information are expressed as partial probabilistic statements about positions in the sequence and their annotation. We then combine these into the final prediction via a quadratic...
A complete and accurate set of human protein-coding gene annotations is perhaps the single most important resource for genomic research after the human-genome sequence itself, yet the major gene catalogs remain incomplete and imperfect. Here we describe a genome-wide effort, carried out as part of the Mammalian Gene Collection (MGC) project, to identify human genes not yet in the gene catalogs. Our approach was to produce gene predictions by algorithms that rely on comparative sequence data but do not require direct cDNA evidence, and then to test predicted...
Motivation: Oxford Nanopore MinION is a portable DNA sequencer that is marketed as a device that can be deployed anywhere. Current base callers, however, require a powerful GPU to analyze data produced by MinION in real time, which hampers field applications. Results: We have developed a fast base caller, DeepNano-blitz, that can analyze the stream from up to two MinION runs in real time using a common laptop CPU (i7-7700HQ), with no GPU requirements. The base caller settings allow trading accuracy for speed, and the results can be used for real-time run monitoring (i.e. sample...
Surveillance of the SARS-CoV-2 variants, including the quickly spreading mutants, by rapid and near real-time sequencing of the viral genome provides an important tool for effective health policy decision making in the ongoing COVID-19 pandemic. Here we evaluated PCR-tiling of short (~400-bp) and long (~2 and ~2.5-kb) amplicons combined with nanopore sequencing on a MinION device for analysis of the SARS-CoV-2 genome sequences. Analysis of several sequencing runs demonstrated that using the long amplicon schemes outperforms the original protocol based on the 400-bp amplicons. It also...
The chloroplasts of Euglena gracilis, bounded by three membranes, arose via secondary endosymbiosis of a green alga in a heterotrophic euglenozoan host. Many genes were transferred from the symbiont to the host nucleus. A subset of Euglena nuclear genes of predominantly symbiont, but also host, or other origin have obtained complex presequences required for chloroplast targeting. This study has revealed the presence of short introns (41–93 bp) either in the second half of presequence-encoding regions or shortly downstream of them in nine...