Svetlana Karamycheva

ORCID: 0000-0001-9653-8910
Research Areas
  • Genomics and Phylogenetic Studies
  • RNA and protein synthesis mechanisms
  • Plant biochemistry and biosynthesis
  • Biochemical and Structural Characterization
  • Bacterial Genetics and Biotechnology
  • Bacteriophages and microbial interactions
  • Gene expression and cancer classification
  • Bioinformatics and Genomic Networks
  • Parasitic Infections and Diagnostics
  • Microbial Community Ecology and Physiology
  • Molecular Biology Techniques and Applications
  • Plant nutrient uptake and metabolism
  • CRISPR and Genetic Engineering
  • Bacillus and Francisella bacterial research
  • Toxoplasma gondii Research Studies
  • Chromosomal and Genetic Variations
  • Potato Plant Research
  • RNA modifications and cancer
  • Machine Learning in Bioinformatics
  • Aquaculture disease management and microbiota
  • Malaria Research and Control
  • Bacterial Infections and Vaccines
  • Insect symbiosis and bacterial influences
  • Animal Genetics and Reproduction
  • Plant Disease Resistance and Genetics

National Center for Biotechnology Information
2019-2024

National Institutes of Health
2022-2024

United States National Library of Medicine
2022

J. Craig Venter Institute
2005-2016

Orion Genomics (United States)
2003

Donald Danforth Plant Science Center
2003

University of Georgia
2003

Purdue University West Lafayette
2003

TGICL is a pipeline for analysis of large Expressed Sequence Tag (EST) and mRNA databases in which the sequences are first clustered based on pairwise sequence similarity, then assembled by individual clusters (optionally with quality values) to produce longer, more complete consensus sequences. The system can run on multi-CPU architectures including SMP and PVM.

10.1093/bioinformatics/btg034 article EN Bioinformatics 2003-03-21
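
The clustering step described in the TGICL abstract above (grouping sequences by pairwise similarity before assembling each cluster) can be illustrated with a minimal sketch. This is not TGICL's implementation: it assumes the similar pairs have already been computed by some alignment tool, and the function name `cluster_by_similarity` and the sequence IDs are hypothetical.

```python
from collections import defaultdict


def cluster_by_similarity(seq_ids, similar_pairs):
    """Group IDs so that any two linked by a chain of similar pairs share a cluster.

    Hypothetical helper for illustration; `similar_pairs` is assumed to come
    from a separate pairwise-similarity search.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    return sorted(sorted(g) for g in groups.values())


clusters = cluster_by_similarity(
    ["est1", "est2", "est3", "est4", "est5"],
    [("est1", "est2"), ("est2", "est3"), ("est4", "est5")],
)
print(clusters)
```

Each resulting cluster would then be passed to an assembler on its own, which is what lets the work be spread across several CPUs.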

Toxoplasma gondii is among the most prevalent parasites worldwide, infecting many wild and domestic animals and causing zoonotic infections in humans. T. gondii differs substantially in its broad distribution from closely related parasites that typically have narrow, specialized host ranges. To elucidate the genetic basis for these differences, we compared the genomes of 62 globally distributed T. gondii isolates to several closely related coccidian parasites. Our findings reveal that tandem amplification and diversification of secretory pathogenesis...

10.1038/ncomms10147 article EN cc-by Nature Communications 2016-01-07

The Arabidopsis Information Portal (https://www.araport.org) is a new online resource for plant biology research. It houses the Arabidopsis thaliana genome sequence and associated annotation. It was conceived as a framework that allows the research community to develop and release 'modules' that integrate, analyze and visualize Arabidopsis data that may reside at remote sites. The current implementation provides an indexed database of core genomic information. These data are made available through feature-rich web applications that provide search, data mining,...

10.1093/nar/gku1200 article EN cc-by Nucleic Acids Research 2014-11-20

Approximately 80% of the maize genome comprises highly repetitive sequences interspersed with single-copy, gene-rich sequences, and standard genome sequencing strategies are not readily adaptable to this type of genome. Methodologies that enrich for genic sequences might more rapidly generate useful results from complex genomes. Equivalent numbers of clones from maize selected by techniques called methylation filtering and High C0t selection were sequenced to generate approximately 200,000 reads (approximately 132 megabases), which...

10.1126/science.1090047 article EN Science 2003-12-18

The cultivated potato (Solanum tuberosum) shares similar biology with other members of the Solanaceae, yet has features unique within the family, such as modified stems (stolons) that develop into edible tubers. To better understand potato biology, we have undertaken a survey of the potato transcriptome using expressed sequence tags (ESTs) from diverse tissues. A total of 61,940 ESTs were generated from aerial tissues, below-ground tissues, and tissues challenged with the late-blight pathogen (Phytophthora infestans). Clustering...

10.1104/pp.013581 article EN PLANT PHYSIOLOGY 2003-02-01

The Clusters of Orthologous Genes (COG) database, originally created in 1997, has been updated to reflect the constantly growing collection of completely sequenced prokaryotic genomes. This update increased the genome coverage from 1309 to 2296 species, including 2103 bacteria and 193 archaea, in most cases with a single representative genome per genus. This set covers all genera of bacteria and archaea that included organisms with 'complete genomes' as per NCBI databases in November 2023. The number of COGs has been expanded from 4877 to 4981, primarily by protein...

10.1093/nar/gkae983 article EN cc-by-nc Nucleic Acids Research 2024-11-04

Comparative genomics promises to rapidly accelerate the identification and functional classification of biologically important human genes. We developed the TIGR Orthologous Gene Alignment (TOGA; http://www.tigr.org/tdb/toga/toga.shtml) database to provide a cross-reference between fully and partially sequenced eukaryotic transcribed sequences. Starting with the assembled expressed sequence tag (EST) and gene sequences that comprise the 28 TIGR Gene Indices, we used high-stringency pair-wise sequence searches and a reflexive,...

10.1101/gr.212002 article EN cc-by-nc Genome Research 2002-03-01

Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. The J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year, where it currently hosts the latest version...

10.1093/pcp/pcu179 article EN Plant and Cell Physiology 2014-11-28

Microarray expression analysis is providing unprecedented data on gene expression in humans and mammalian model systems. Although such studies provide a tremendous resource for understanding human disease states, one of the significant challenges is cross-referencing the data derived from different species, across diverse expression analysis platforms, in order to properly derive inferences regarding gene expression and disease state. To address this problem, we have developed RESOURCERER, a microarray-resource annotation and cross-reference database built using...

10.1186/gb-2001-2-11-software0002 article EN cc-by Genome biology 2001-10-19

An essential component of functional genomics studies is the sequence of DNA expressed in tissues of interest. To provide a resource of bovine-specific expressed sequence data and facilitate this powerful approach in cattle research, four normalized cDNA libraries were produced and arrayed for high-throughput sequencing. The libraries were made with RNA pooled from multiple tissues to increase the efficiency of normalization and maximize the number of independent genes for which sequence data were obtained. Target tissues included those with the highest likelihood to have an impact on production parameters...

10.1101/gr.170101 article EN cc-by-nc Genome Research 2001-04-01

Expressed sequence tag (EST) projects have produced extremely valuable resources for identifying genes affecting phenotypes of interest. A large-scale EST sequencing project for rainbow trout was initiated to identify and functionally annotate as many unique transcripts as possible. Over 45,000 5′ ESTs were obtained by sequencing clones from a single normalized library constructed using mRNA from six tissues. The production of this sequence data and creation of a rainbow trout Gene Index eliminating redundancy and providing annotation for these sequences...

10.1159/000075773 article EN Cytogenetic and Genome Research 2003-01-01

Diverse and highly variable systems involved in biological conflicts and self-versus-nonself discrimination are ubiquitous in bacteria but much less studied in archaea. We performed comprehensive comparative genomic analyses of the archaeal systems that share components with analogous bacterial systems and propose an approach to identify new systems that could be involved in these functions. We predict polymorphic toxin systems in 141 archaeal genomes and identify new, archaea-specific toxin and immunity protein families. These systems are widely represented in archaea and are predicted to play major roles...

10.1128/mbio.00715-19 article EN cc-by mBio 2019-05-06

In silico identification of viral anti-CRISPR proteins (Acrs) has relied largely on the guilt-by-association method using known Acrs or anti-CRISPR associated proteins (Acas) as the bait. However, the low number and limited spread of the characterized archaeal Acrs and Aca hinders our ability to identify Acrs using guilt-by-association. Here, based on the observation that the few characterized archaeal Acrs and Aca are transcribed immediately post viral infection, we hypothesize that these genes, and many other unidentified anti-defense genes (ADG), are under the control of conserved regulatory sequences...

10.1038/s41467-024-48074-x article EN cc-by Nature Communications 2024-05-02

Background: A low genetic diversity in Francisella tularensis has been documented. Current DNA-based genotyping methods for typing F. tularensis offer a limited and varying degree of subspecies, clade and strain level discrimination power. Whole genome sequencing is the most accurate and reliable method to identify, type and determine phylogenetic relationships among strains of a species. However, lower cost typing schemes are necessary in order to enable typing of hundreds or even thousands of isolates. Results: We have generated...

10.1186/1471-2180-9-213 article EN cc-by BMC Microbiology 2009-10-07

The identification of microbial genes essential for survival as those with lethal knockout phenotype (LKP) is a common strategy for functional interrogation of genomes. However, interpretation of the LKP is complicated because a substantial fraction of the genes with this phenotype remains poorly functionally characterized. Furthermore, many genes can exhibit LKP not because their products perform essential cellular functions but because their knockout activates the toxicity of other genes (conditionally essential genes). We analyzed the sets of LKP genes for two archaea,

10.1128/mbio.03092-23 article EN cc-by mBio 2024-01-08

Background: Bacteria and archaea produce an enormous diversity of modified peptides that are involved in various forms of inter-microbial conflicts or communication. A vast class of such peptides are Ribosomally synthesized, Post-translationally modified Peptides (RiPPs), and a major group of RiPPs are graspetides, so named after ATP-grasp ligases that catalyze the formation of lactam and lactone linkages in these peptides. The diversity of graspetides, the multiple proteins encoded in the respective Biosynthetic Gene Clusters (BGCs) and their evolution have not been studied...

10.1186/s13062-022-00320-2 article EN cc-by Biology Direct 2022-03-21

The evolution of genomes in all life forms involves two distinct, dynamic types of genomic changes: gene duplication (and loss) that shape families of paralogous genes and extension (and contraction) of low-complexity regions (LCR), which occurs through dynamics of short repeats in protein-coding genes. Although the roles of each of these types of events in genome evolution have been studied, their co-evolutionary dynamics is not thoroughly understood. Here, by analyzing a wide range of genomes from diverse bacteria and archaea, we show that LCR and paralogy represent...

10.1073/pnas.2300154120 article EN cc-by-nc-nd Proceedings of the National Academy of Sciences 2023-04-10

MeSHer uses a simple statistical approach to identify biological concepts in the form of Medical Subject Headings (MeSH terms) obtained from the PubMed database that are significantly overrepresented within the identified gene set relative to those associated with the overall collection of genes on the underlying DNA microarray platform. As a demonstration, we apply this approach to gene lists acquired from a published study of the effects of angiotensin II (Ang II) treatment on cardiac gene expression and demonstrate that this approach can aid in the interpretation of the resulting...

10.1093/bioinformatics/bti503 article EN Bioinformatics 2005-05-26
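
The overrepresentation idea in the MeSHer abstract above can be sketched with a one-sided hypergeometric test. The abstract does not say which statistic MeSHer actually uses, so the choice of test here is an assumption, and the function name `overrepresentation_p` and the example counts are hypothetical.

```python
from math import comb


def overrepresentation_p(platform_total, platform_with_term, set_total, set_with_term):
    """One-sided hypergeometric p-value (assumed test, not confirmed by the abstract).

    Probability of seeing at least `set_with_term` annotated genes when
    `set_total` genes are drawn at random from a platform of `platform_total`
    genes, of which `platform_with_term` carry the term.
    """
    upper = min(platform_with_term, set_total)
    without_term = platform_total - platform_with_term
    tail = sum(
        comb(platform_with_term, k) * comb(without_term, set_total - k)
        for k in range(set_with_term, upper + 1)
    )
    return tail / comb(platform_total, set_total)


# Hypothetical counts: 10 genes on the platform, 5 annotated with the term;
# a 2-gene set in which both genes carry the term.
p = overrepresentation_p(10, 5, 2, 2)
print(round(p, 4))
```

A small p-value suggests the term appears in the gene set more often than chance would predict given the platform's composition; in practice the values would also need correction for the many terms tested.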

Screening of genomic and metagenomic databases for new variants of CRISPR-Cas systems increasingly results in the discovery of derived variants that do not seem to possess interference capacity and are implicated in functions distinct from adaptive immunity. We describe an extremely derived putative class 1 CRISPR-Cas system that is present in many Halobacteria and consists of distant homologs of the Cas5 and Cas7 proteins along with several uncharacterized conserved proteins and various nucleases. We hypothesize that, although this system lacks typical CRISPR effectors or a...

10.1093/femsle/fnz079 article EN public-domain FEMS Microbiology Letters 2019-04-01

Background: Evolutionary rate is a key characteristic of gene families that is linked to the functional importance of the respective genes as well as specific biological functions of the proteins they encode. Accurate estimation of evolutionary rates is a challenging task that requires precise phylogenetic analysis. Here we present an easy to estimate protein family level measure of sequence variability based on alignment column homogeneity in multiple alignments of protein sequences from Clade-Specific Clusters of Orthologous Genes...

10.1186/s13062-022-00337-7 article EN cc-by Biology Direct 2022-08-30

Genomes of bacteria and archaea contain a much larger fraction of unidirectional (serial) gene pairs than convergent or divergent gene pairs. Many of the unidirectional gene pairs have short overlaps of -4 nt and -1 nt. As shown previously, translation of the genes in overlapping unidirectional gene pairs is tightly coupled. Two alternative models for the fate of the post-termination ribosome predict either that overlaps or very short intergenic distances are essential for translational coupling or that the undissociated post-termination ribosome can scan through long intergenic regions, up to hundreds of nucleotides. We aimed to experimentally resolve...

10.3389/fmicb.2023.1291523 article EN cc-by Frontiers in Microbiology 2023-11-09

Background: While the pneumococcal protein conjugate vaccines reduce the incidence of invasive pneumococcal disease (IPD), serotype replacement remains a major concern. Thus, serotype-independent protection with vaccines targeting virulence genes, such as PspA, has been pursued. PspA is comprised of diverse clades that arose through recombination. Therefore, multi-locus sequence typing (MLST)-defined clones could conceivably include strains from multiple PspA clades. As a result, a method is needed which can both monitor...

10.1371/journal.pone.0015950 article EN cc-by PLoS ONE 2011-01-10