- Genomics and Phylogenetic Studies
- RNA modifications and cancer
- RNA and protein synthesis mechanisms
- Gene Regulatory Network Analysis
- Single-cell and spatial transcriptomics
- Gut microbiota and health
- Machine Learning in Bioinformatics
- Cancer Genomics and Diagnostics
- Chromosomal and Genetic Variations
- RNA Research and Splicing
- Gene expression and cancer classification
- Molecular Biology Techniques and Applications
- Scientific Computing and Data Management
- Biomedical Text Mining and Ontologies
- Natural Language Processing Techniques
- Genomics and Chromatin Dynamics
- Retinoids in leukemia and cellular processes
- Interferon and immune responses
- SARS-CoV-2 and COVID-19 Research
- Cancer-related molecular mechanisms research
- Advanced biosensing and bioanalysis techniques
- Viral Infections and Immunology Research
- Genomic variations and chromosomal abnormalities
- Algorithms and Data Compression
- Amino Acid Enzymes and Metabolism
The University of Texas Southwestern Medical Center
2018-2023
Southwestern Medical Center
2021-2023
Johns Hopkins Medicine
2012-2017
Johns Hopkins University
2012-2017
University of Maryland, College Park
2011-2013
Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4,078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to process millions of reads from a typical high-throughput...
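The BWT/FM-index idea mentioned in this abstract can be illustrated with a toy exact-match search. This is a minimal sketch for intuition only, not Centrifuge's actual index: the function names (`bwt_via_suffix_array`, `build_fm_index`, `backward_search`, `find_all`) are hypothetical, the suffix array is built naively, and the occurrence table is stored uncompressed.

```python
from collections import Counter


def bwt_via_suffix_array(text):
    """Return (suffix_array, bwt) for text, which must end with the sentinel '$'."""
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive construction
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps around to '$'
    return sa, bwt


def build_fm_index(bwt):
    """Build the two FM-index tables: first-column offsets and occurrence counts."""
    counts = Counter(bwt)
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total  # number of characters in the text smaller than ch
        total += counts[ch]
    # occ[ch][i] = number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def backward_search(pattern, first, occ, n):
    """Return the half-open suffix-array interval [lo, hi) of suffixes starting with pattern."""
    lo, hi = 0, n
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in first:
            return 0, 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def find_all(pattern, text):
    """Return sorted start positions of pattern in text (text must end with '$')."""
    sa, bwt = bwt_via_suffix_array(text)
    first, occ = build_fm_index(bwt)
    lo, hi = backward_search(pattern, first, occ, len(bwt))
    return sorted(sa[i] for i in range(lo, hi))


if __name__ == "__main__":
    print(find_all("ACG", "ACGTACGTTACG$"))
```

Each step of the backward search costs two table lookups regardless of text length, which is why FM-index-based tools can search large references quickly; production indexes additionally sample and compress these tables.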
TopHat-Fusion is an algorithm designed to discover transcripts representing fusion gene products, which result from the breakage and re-joining of two different chromosomes, or from rearrangements within a chromosome. TopHat-Fusion is an enhanced version of TopHat, an efficient program that aligns RNA-seq reads without relying on existing annotation. Because it is independent of annotation, it can discover fusion products deriving from known genes, unknown genes, and unannotated splice variants of known genes. Using data from breast and prostate cancer cell lines,...
Sequencing technologies using nucleotide conversion techniques, such as cytosine to thymine in bisulfite-seq and thymine to cytosine in SLAM-seq, are powerful tools to explore the chemical intricacies of cellular processes. To date, no one has developed a unified methodology for aligning converted sequences and consolidating alignment of these technologies in one package. In this paper, we describe hierarchical indexing for spliced alignment of transcripts-3 nucleotides (HISAT-3N), which can rapidly and accurately align sequences consisting of any nucleotide conversion by leveraging index and repeat...
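The general idea behind aligning nucleotide-converted reads can be sketched as follows: collapse the converted base in both read and reference to a three-letter alphabet, match in that reduced space, then compare against the original reference to count conversions. This is a toy illustration under stated assumptions, not HISAT-3N's implementation; the function names are hypothetical, only exact matching on one strand is handled, and a plain string search stands in for the index.

```python
def to_three_letter(seq, src="C", dst="T"):
    """Collapse src into dst so converted and unconverted bases look identical."""
    return seq.upper().replace(src, dst)


def align_converted_read(read, reference, src="C", dst="T"):
    """Return (position, converted, unconverted) for each exact 3-letter match.

    converted   = reference src bases that appear as dst in the read
    unconverted = reference src bases that are still src in the read
    """
    read = read.upper()
    reference = reference.upper()
    ref3 = to_three_letter(reference, src, dst)
    read3 = to_three_letter(read, src, dst)

    hits = []
    start = ref3.find(read3)
    while start != -1:
        window = reference[start:start + len(read)]
        pairs = list(zip(window, read))
        # A real conversion only turns src into dst, never dst into src.
        if all(r == q or (r == src and q == dst) for r, q in pairs):
            converted = sum(1 for r, q in pairs if r == src and q == dst)
            unconverted = sum(1 for r, q in pairs if r == src and q == src)
            hits.append((start, converted, unconverted))
        start = ref3.find(read3, start + 1)
    return hits


if __name__ == "__main__":
    # Bisulfite-style example: one of two reference cytosines was converted.
    print(align_converted_read("TTTCGA", "GATTCCGACT"))
    # SLAM-seq-style example: pass src="T", dst="C" for T-to-C conversion.
    print(align_converted_read("ACCG", "GGATCGTT", src="T", dst="C"))
```

Passing the conversion pair as parameters is one way a single routine could serve several conversion protocols, which is the kind of unification the abstract describes.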
Rapid advances in next-generation sequencing technologies have dramatically changed our ability to perform genome-scale analyses of human genomes. The reference genome used for most genomic analyses represents only a small number of individuals, limiting its usefulness for genotyping. We designed a novel method, HISAT-genotype, for representing and searching an expanded model of the reference genome, in which a comprehensive catalogue of known variants and haplotypes is incorporated into the data structure used for alignment. This strategy...
HISAT is a new, highly efficient system for alignment of sequences from RNA sequencing experiments that achieves dramatically faster performance than previous methods. HISAT uses a new indexing scheme, hierarchical indexing, which is based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index. Hierarchical indexing employs two types of indexes for alignment: (1) a whole-genome FM index to anchor each alignment, and (2) numerous local FM indexes for very rapid extensions of these alignments. HISAT’s index for the human genome contains...
Algorithms for classifying chromosomes, like convolutional deep neural networks (CNNs), show promise to augment cytogeneticists' workflows; however, a critical limitation is their inability to accurately classify various structural chromosomal abnormalities. In hematopathology, recurrent cytogenetic abnormalities herald diagnostic, prognostic, and therapeutic implications, but are laborious for expert cytogeneticists to identify. Non-recurrent cytogenetic abnormalities also occur frequently in cancerous cells. Here, we demonstrate...
Objectives: Our study aimed to develop a machine learning (ML) model to accurately classify acute promyelocytic leukemia (APL) from other types of acute myeloid leukemia (other AML) using multicolor flow cytometry (MFC) data. Multicolor flow cytometry is used to determine immunophenotypes that serve as disease signatures for diagnosis. Methods: We used a data set of MFC files from 27 patients with APL and 41 patients with other AML, including those with uncommon immunophenotypes. The ML pipeline involved training a graph neural network (GNN) to output graph-level...
BACKGROUND: Novel fusion transcripts (FTs) caused by chromosomal rearrangement are common factors in the development of cancers. In the current study, the authors used massively parallel RNA sequencing to identify new FTs in colon cancers. METHODS: RNA sequencing (RNA-Seq) and TopHat-Fusion were used. The authors then investigated whether the novel FT nuclear receptor subfamily 5, group A, member 2 (NR5A2)-Kelch-like family member 29 (KLHL29FT) was transcribed from a genomic rearrangement. Next, the expression of NR5A2-KLHL29FT was measured by quantitative real-time...
Each novel SARS-CoV-2 variant renews concerns about decreased vaccine efficacy caused by evasion of vaccine-induced neutralizing antibodies. However, accumulating epidemiological data show that while prevention of infection varies, protection from severe disease and death remains high. Thus, immune responses beyond neutralization could contribute to vaccine efficacy. Polyclonal antibodies function through their Fab domains to neutralize virus directly, and through their Fc domains to induce non-neutralizing host responses via engagement of Fc receptors on...
With the vast improvements in sequencing technologies and the increased number of protocols, sequencing is being used to answer complex biological problems. Subsequently, analysis pipelines have become more time consuming and complicated, usually requiring highly extensive prevalidation steps. Here, we present SeqWho, a program designed to heuristically assess the quality of sequencing files and reliably classify the organism and protocol type by using Random Forest classifiers trained on biases native to k-mer frequencies and repeat sequence identities.
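The k-mer frequency features that such a classifier could be trained on can be sketched as below. This is a minimal illustration of the feature-extraction step only, assuming a DNA alphabet and a small k; the function name `kmer_frequency_vector` is hypothetical, it is not SeqWho's code, and the Random Forest training itself is not shown.

```python
from collections import Counter
from itertools import product


def kmer_frequency_vector(reads, k=2, alphabet="ACGT"):
    """Return (kmers, freqs): all k-mers in lexicographic order and their relative frequencies.

    k-mers containing characters outside the alphabet (for example 'N') are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if all(ch in alphabet for ch in kmer):
                counts[kmer] += 1
    total = sum(counts.values())
    freqs = [counts[km] / total if total else 0.0 for km in kmers]
    return kmers, freqs


if __name__ == "__main__":
    kmers, freqs = kmer_frequency_vector(["ACGT", "ACGN"], k=2)
    for km, f in zip(kmers, freqs):
        if f > 0:
            print(km, round(f, 2))
```

One fixed-length vector per file (here 4^k entries) is a convenient input shape for tree-based classifiers, since the feature order is the same for every sample.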
There has long been a desire to understand, describe, and model gene regulatory networks controlling numerous biologically meaningful processes like differentiation. Despite many notable improvements to models over the years, many models do not accurately capture subtle biological and chemical characteristics of the cell, such as high-order chromatin domains of the chromosomes. Topologically Associated Domains (TAD) are one of these genomic regions that are enriched for contacts within themselves. Here we present...