- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- Chromosomal and Genetic Variations
- Machine Learning in Bioinformatics
- RNA Research and Splicing
- Gene expression and cancer classification
- Genetics, Bioinformatics, and Biomedical Research
- Genomics and Chromatin Dynamics
- Genetic Mapping and Diversity in Plants and Animals
- RNA modifications and cancer
- Scientific Computing and Data Management
- Bacterial Genetics and Biotechnology
- Algorithms and Data Compression
- Plant nutrient uptake and metabolism
- Plant Pathogenic Bacteria Studies
- Plant Disease Resistance and Genetics
- Plant-Microbe Interactions and Immunity
- Legume Nitrogen Fixing Symbiosis
- Photosynthetic Processes and Mechanisms
- DNA and Biological Computing
- Evolution and Genetic Dynamics
- Insect symbiosis and bacterial influences
- Plant Virus Research Studies
- Plant Molecular Biology Research
- Bioinformatics and Genomic Networks
Indiana University Bloomington
2012-2024
Indiana University
2016-2021
Iowa State University
2002-2011
University of South Dakota
2005-2011
Plant Gene Expression Center
2009
University of Missouri
2009
University of North Carolina at Chapel Hill
2009
Ames National Laboratory
2009
Institut thématique Génétique, génomique et bioinformatique
2004-2006
Universität Hamburg
2005
Transcription activator-like (TAL) effectors are repeat-containing proteins used by plant pathogenic bacteria to manipulate host gene expression. Repeats are polymorphic and individually specify single nucleotides in the DNA target, with some degeneracy. A TAL effector-nucleotide binding code that links repeat type to specified nucleotide enables prediction of genomic binding sites for TAL effectors and customization of TAL effectors for use in DNA targeting, in particular as custom transcription factors for engineered gene regulation and as site-specific nucleases...
Alternative splicing (AS) has been extensively studied in mammalian systems but much less in plants. Here we report AS events deduced from EST/cDNA analysis in two model plants: Arabidopsis and rice. In Arabidopsis, 4,707 (21.8%) of the genes with EST/cDNA evidence show 8,264 AS events. Approximately 56% of these events are intron retention (IntronR), and only 8% are exon skipping. In rice, 6,568 (21.2%) of the expressed genes display 14,542 AS events, of which 53.5% are IntronR and 13.8% are exon skipping. The consistent high frequency of IntronR suggests prevalence of splice site...
We describe several protein sequence statistics designed to evaluate distinctive attributes of residue content and arrangement in primary structure. Considered are global compositional biases, local clustering of different residue types (e.g., charged residues, hydrophobic residues, Ser/Thr), long runs of charged or uncharged residues, periodic patterns, counts and distribution of homooligopeptides, and unusual spacings between particular residue types. The computer program SAPS (statistical analysis of protein sequences) calculates all the statistics for any individual...
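One of the statistics listed above, long runs of charged residues, can be illustrated with a minimal Python sketch. The function name `longest_charged_run` and the choice of K, R, D and E as the charged set are assumptions made for illustration. This is not the SAPS implementation, and it does not assess statistical significance.

```python
# Assumed charged residue set: Lys, Arg, Asp, Glu (an illustrative choice).
CHARGED = frozenset("KRDE")


def longest_charged_run(protein: str) -> tuple[int, int]:
    """Return (start, length) of the longest run of charged residues.

    start is a 0-based index; (-1, 0) is returned when no residue is charged.
    When several runs tie, the first one is reported.
    """
    best_start, best_len = -1, 0
    run_start, run_len = 0, 0
    for i, residue in enumerate(protein.upper()):
        if residue in CHARGED:
            if run_len == 0:
                run_start = i  # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # an uncharged residue ends the current run
    return best_start, best_len
```

A real analysis would compare the observed run length with what is expected from the composition of the sequence, which this sketch leaves out.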
The nucleotide sequences of 30 factor-independent terminators of transcription with RNA polymerase from E. coli have been compiled and analyzed. The standard features - a stretch of thymine residues and a preceding dyad symmetry - are shared by most of the sequences, but there are striking exceptions which indicate that these features alone are not sufficient to describe the termination sites. In two thirds of the sequences the 3'-half of the dyad symmetry contains the pentanucleotide CGGG (G/C) or a close derivative; about one third of the sequences contain the sequence TCTG or a close derivative just downstream of the termination point. -box...
The STING (stimulator of interferon genes) protein can bind cyclic dinucleotides to activate the production of type I interferons and inflammatory cytokines. The cyclic dinucleotides can be the bacterial second messengers c-di-GMP and c-di-AMP, the 3’5’-3’5’ cyclic GMP-AMP (3’3’ cGAMP) produced by Vibrio cholerae, and the metazoan second messenger 2’5’-3’5’ cyclic GMP-AMP (2’3’ cGAMP). Analysis of single nucleotide polymorphism (SNP) data from the 1000 Genome Project revealed that R71H-G230A-R293Q (HAQ) occurs in 20.4%, R232H in 13.7%, G230A-R293Q (AQ) in 5.2%, and R293Q in 1.5% of human...
PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial...
The highly nonrandom character of genomic DNA can confound attempts at modeling DNA sequence variation by standard stochastic processes (including random walk or fractal models). In particular, the mosaic character of DNA, consisting of patches of different composition, can fully account for apparent long-range correlations in DNA.
Statistical approaches help in the determination of significant configurations in protein and nucleic acid sequence data. Three recent statistical methods are discussed: (i) score-based sequence analysis that provides a means for characterizing anomalies in local sequence text and for evaluating sequence comparisons; (ii) quantile distributions of amino acid usage that reveal general compositional biases in proteins and evolutionary relations; and (iii) r-scan statistics that can be applied to the analysis of spacings of sequence markers.
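The r-scan idea in (iii) can be sketched in a few lines of Python. Given the positions of a marker along a sequence, an r-scan is the distance spanned by r consecutive spacings; unusually small values suggest clustering of the marker and unusually large values suggest overdispersion. The function name `r_scans` is hypothetical, and the sketch only computes the lengths. Judging which values are significant requires distributional theory that the abstract does not spell out.

```python
def r_scans(positions: list[int], r: int) -> list[int]:
    """Return all r-scan lengths for a set of marker positions.

    The i-th r-scan is the distance from the i-th marker to the (i + r)-th
    marker in sorted order, i.e. the sum of r consecutive spacings.
    """
    if r < 1:
        raise ValueError("r must be at least 1")
    ordered = sorted(positions)
    return [ordered[i + r] - ordered[i] for i in range(len(ordered) - r)]


# Example: markers at five positions, scans over 2 consecutive spacings.
scans = r_scans([10, 12, 30, 31, 90], r=2)
smallest, largest = min(scans), max(scans)  # candidates for cluster / gap
```

With r = 1 the function returns the plain spacings between neighbouring markers.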
ABSTRACT Xanthomonas is a large genus of bacteria that collectively cause disease on more than 300 plant species. The broad host range of the genus contrasts with stringent host and tissue specificity for individual species and pathovars. Whole-genome sequences of Xanthomonas campestris pv. raphani strain 756C and X. oryzae pv. oryzicola strain BLS256, pathogens that infect the mesophyll tissue of the leading models for plant biology, Arabidopsis thaliana and rice, respectively, were determined and provided insight into the genetic determinants of host and tissue specificity. Comparisons were made with genomes...
The concept of "words" in continuous languages devoid of blanks is introduced and an operational definition of words is given. With this novel definition, nucleotide sequences become an object for linguistic analysis. The typical word size of the nucleotide language is found to be 3 to 5 (tri- to pentamers). Different genomes have distinct vocabularies. Comparison of these vocabularies can serve as a basis for revealing functional and evolutionary relatedness of the sequences.
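The vocabulary comparison described above can be illustrated with a rough Python sketch. The abstract does not state the operational definition of a word, so the sketch substitutes plain overlapping k-mer counts for k = 3 to 5 and treats the most frequent k-mers as the vocabulary. The function names, the `top` cutoff and the use of Jaccard similarity are all assumptions made for illustration, not the method of the paper.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping substrings of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def vocabulary(seq: str, k_min: int = 3, k_max: int = 5, top: int = 10) -> set[str]:
    """Collect the `top` most frequent k-mers for each k in [k_min, k_max].

    Frequency is only a stand-in for the paper's word definition.
    """
    words: set[str] = set()
    for k in range(k_min, k_max + 1):
        words.update(word for word, _ in kmer_counts(seq, k).most_common(top))
    return words


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared fraction of two vocabularies (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Calling `jaccard(vocabulary(genome_a), vocabulary(genome_b))` on two sequences gives a crude similarity score between 0 and 1; `genome_a` and `genome_b` are placeholder names for the caller's own sequence strings.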
Abstract A total of 74 small nuclear RNA (snRNA) genes and 395 genes encoding splicing-related proteins were identified in the Arabidopsis genome by sequence comparison and motif searches, including the previously elusive U4atac snRNA gene. Most of the genes have not been studied experimentally. Classification of these genes and detailed information on gene structure, alternative splicing, gene duplications and phylogenetic relationships are made accessible as a comprehensive database of Arabidopsis Splicing Related Genes (ASRG) on our website.
Abstract Comparative genomics of social insects has been intensely pursued in recent years with the goal of providing insights into the evolution of social behaviour and its underlying genomic and epigenomic basis. However, the comparative approach has been hampered by a paucity of data on some of the most informative social forms (e.g. incipiently and primitively social) and taxa (especially members of the wasp family Vespidae) for studying social evolution. Here, we provide a draft genome of the primitively eusocial model insect Polistes dominula, accompanied by analysis...
The maize (Zea mays) transposable element Dissociation (Ds) was mobilized for large-scale genome mutagenesis and to study its endogenous biology. Starting from a single donor locus on chromosome 10, over 1500 Ds elements were distributed throughout the genome and positioned on the maize physical map. Genetic strategies to enrich for both local and unlinked insertions were used to distribute Ds insertions. Global, regional, and local insertion site trends were examined. We show that Ds transposed to both linked and unlinked sites and displayed a nonuniform distribution on the genetic map...
Abstract Motivation: Supplementary cDNA or EST evidence is often decisive for discriminating between alternative gene predictions derived from computational sequence inspection by any of a number of requisite programs. Without additional experimental effort, this approach must rely on the occurrence of cognate ESTs for the gene under consideration in available, generally incomplete, EST collections for the given species. In some cases, particular exon assignments can be supported...
Whole-genome sequencing is fundamental to understanding the genetic composition of an organism. Given the size and complexity of the soybean genome, an alternative approach is targeted random-gene sequencing, which provides an immediate and productive method of gene discovery. In this study, more than 120,000 soybean expressed sequence tags (ESTs) generated from 50 cDNA libraries were evaluated. These ESTs coalesced into 16,928 contigs and 17,336 singletons. On average, each contig was composed of 6 ESTs and spanned 788 bases. The average...
We present here a compilation of prokaryotic transcription terminator sequences (ref. 1-152). The compilation includes 49 independent terminators, 52 speculated independent terminators, 27 sites shown to function in vivo, and some 20 proven or speculated rho-dependent terminators. In addition to the well-known features of terminators (dyad symmetry and T-run), two consensus sequences are found: CGGG(C/G) upstream and TCTG downstream of the termination point. A subset of the collection of sequences has been used to construct a computer algorithm to locate terminators by sequence analysis.
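The sequence features named above (a T-run, CGGG(C/G) upstream, TCTG downstream) lend themselves to a simple motif scan, sketched here in Python. The abstract does not describe the authors' algorithm, so this is not a reconstruction of it. The function name `terminator_candidates`, the minimum T-run length of 4, the 20-base search window and the use of the T-run end as a proxy for the termination point are all assumptions, and dyad symmetry is not checked.

```python
import re

UPSTREAM = re.compile(r"CGGG[CG]")  # consensus CGGG(C/G) from the abstract
DOWNSTREAM = re.compile(r"TCTG")    # consensus TCTG from the abstract
T_RUN = re.compile(r"T{4,}")        # assumed: at least four consecutive T's


def terminator_candidates(dna: str, window: int = 20) -> list[dict]:
    """Report each T-run and whether the two consensus motifs flank it.

    window is an assumed search distance (in bases) on either side of the run.
    """
    dna = dna.upper()
    hits = []
    for match in T_RUN.finditer(dna):
        upstream = dna[max(0, match.start() - window):match.start()]
        downstream = dna[match.end():match.end() + window]
        hits.append({
            "t_run_start": match.start(),  # 0-based, inclusive
            "t_run_end": match.end(),      # 0-based, exclusive
            "upstream_motif": bool(UPSTREAM.search(upstream)),
            "downstream_motif": bool(DOWNSTREAM.search(downstream)),
        })
    return hits
```

A scan like this only flags candidates; a usable predictor would also need to score the dyad symmetry preceding the T-run.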
Background Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. Methodology/Principal Findings To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas,...
Many plant disease resistance (R) genes function specifically in reaction to the presence of cognate effectors from a pathogen. Xanthomonas oryzae pathovar oryzae (Xoo) uses transcription activator-like effectors (TALes) to target specific rice genes for expression, thereby promoting host susceptibility to bacterial blight. Here, we report the molecular characterization of Xa7, the cognate R gene to the TALes AvrXa7 and PthXo3, which target the rice major susceptibility gene SWEET14. Xa7 was mapped to a unique 74-kb region. Gene expression analysis of the region revealed a candidate gene that...
Abstract Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle...
RecA protein sequences from 62 eubacterial sources were compared with one another and relative to archaebacterial RecA-like sequences and a number of eukaryotic RecA-like sequences. Pairwise similarity scores were determined by a novel method based on significant segment pair alignment. The different species were grouped on the basis of mutually high similarity scores within groups and consistency of score ranges in comparison with other groups. Following this protocol, the gamma-proteobacteria can be subclassified into two major groups, those mostly vertebrate...
The maize mutation sh2-7527 was isolated in a conventional maize breeding program in the 1970s. Although the mutant contains foreign sequences within the gene, the mutation is not attributable to an interchromosomal exchange or to a chromosomal inversion. Hence, the mutation was caused by an insertion. Sequences at the two Sh2 borders have been scrambled or mutated, suggesting that the insertion was caused by a catastrophic reshuffling of the maize genome. The insertion is large, at least 12 kb, and is highly repetitive in maize. As judged by hybridization, sorghum contains only one or a few copies of the element, whereas no...
Assembly of 73,000 expressed sequence tags (ESTs) representing multiple organs and developmental stages of maize (Zea mays) identified approximately 22,000 tentative unique genes (TUGs) at the criterion of 95% identity. Based on sequence similarity, overlap between any two of nine libraries with more than 3,000 ESTs ranged from 4% to 20% of the constituent TUGs. The most abundant ESTs were recovered from only one or a minority of the libraries, and only 26 EST contigs had members from all nine EST sets (presumably representing ubiquitously expressed genes). For several...