- Genomics and Phylogenetic Studies
- Machine Learning in Bioinformatics
- RNA and protein synthesis mechanisms
- Cancer-related molecular mechanisms research
- RNA modifications and cancer
- Bioinformatics and Genomic Networks
- Gene expression and cancer classification
- Actinomycetales infections and treatment
- Sarcoidosis and Beryllium Toxicity Research
- Kruppel-like factors research
- RNA Research and Splicing
- Mycobacterium research and diagnosis
- Gut microbiota and health
- Ocular Diseases and Behçet’s Syndrome
- Diagnosis and treatment of tuberculosis
- Nutrition, Genetics, and Disease
- Tuberculosis Research and Epidemiology
- Pancreatic and Hepatic Oncology Research
- Genetic Syndromes and Imprinting
- Microbial Community Ecology and Physiology
Johns Hopkins University
2020-2024
University of Baltimore
2022
CHESS 3 represents an improved human gene catalog based on nearly 10,000 RNA-seq experiments across 54 body sites. It significantly improves current genome annotation by integrating the latest reference data and algorithms, machine learning techniques for noise filtering, and new protein structure prediction methods. CHESS 3 contains 41,356 genes, including 19,839 protein-coding genes and 158,377 transcripts, with 14,863 transcripts not in other catalogs. It includes all MANE transcripts and at least one transcript for most...
Intraductal papillary mucinous neoplasms (IPMNs) are non-invasive precursor lesions that can progress to invasive pancreatic cancer and are classified as low-grade or high-grade based on the morphology of the neoplastic epithelium. We aimed to compare genetic alterations in low-grade and high-grade regions of the same IPMN in order to identify the molecular alterations underlying progression. We performed multiregion whole exome sequencing on tissue samples from 17 IPMNs with both low-grade and high-grade dysplasia (76 regions, including 49 from low-grade dysplasia and 27 from high-grade dysplasia). We reconstructed the phylogeny for...
Recently developed methods to predict three-dimensional protein structure with high accuracy have opened new avenues for genome and proteome research. We explore a hypothesis in genome annotation, namely whether computationally predicted structures can help to identify which of multiple possible gene isoforms represents a functional product. Guided by structure predictions, we evaluated over 230,000 isoforms of human protein-coding genes assembled from 10,000 RNA sequencing experiments across many tissues. From this set...
The original CHESS database of human genes was assembled from nearly 10,000 RNA sequencing experiments in 53 body sites produced by the Genotype-Tissue Expression (GTEx) project, and then augmented with genes from other databases to yield a comprehensive collection of protein-coding and noncoding transcripts. The construction of the new CHESS 3 database employed improved transcript assembly algorithms, a machine learning classifier, and protein structure predictions to identify transcripts likely to be functional and to eliminate those that...
Background: Metagenomic sequencing has the potential to identify a wide range of pathogens in human tissue samples. Sarcoidosis is a complex disorder whose etiology remains unknown and for which a variety of infectious causes have been hypothesized. We sought to conduct metagenomic sequencing on cases of ocular and periocular sarcoidosis, none of them with previously identified infectious causes. Methods: Archival tissue specimens from 16 subjects with biopsies of ocular and periocular tissues that were...
We explore a new hypothesis in genome annotation, namely whether computationally predicted protein structures can help to identify which of multiple possible gene isoforms represents a functional product. Guided by structure predictions, we evaluated over 140,000 isoforms of human protein-coding genes assembled from 10,000 RNA sequencing experiments across many tissues. We illustrate our method with examples where structure provides a guide to function in combination with expression and evolutionary evidence....