- Chromosomal and Genetic Variations
- RNA and protein synthesis mechanisms
- RNA Research and Splicing
- Genomics and Phylogenetic Studies
- RNA modifications and cancer
- Plant Virus Research Studies
- Plant Disease Resistance and Genetics
- CRISPR and Genetic Engineering
- Glycosylation and Glycoproteins Research
- RNA Interference and Gene Delivery
- Advanced biosensing and bioanalysis techniques
- Carbohydrate Chemistry and Synthesis
- Protist diversity and phylogeny
- Molecular Biology Techniques and Applications
- Cancer-related molecular mechanisms research
- Viral Infections and Immunology Research
Massachusetts Institute of Technology
2020-2025
Cornell University
2017-2021
Many proteins regulate the expression of genes by binding to specific regions encoded in the genome. Here we introduce a new data set of RNA elements in the human genome that are recognized by RNA-binding proteins (RBPs), generated as part of the Encyclopedia of DNA Elements (ENCODE) project phase III. This class of regulatory elements functions only when transcribed into RNA, as they serve as the binding sites for RBPs that control post-transcriptional processes such as splicing, cleavage and polyadenylation, editing, localization, stability and translation...
Spliceosomal introns are a ubiquitous feature of eukaryotic genes, whose presence often boosts the expression of their host gene, a phenomenon known as intron-mediated enhancement (IME). IME has been noted across diverse genes and organisms but remains mysterious in many respects. For example, how does intron sequence affect the magnitude of IME? In this study, we performed a massively parallel reporter assay (MPRA) to assess the effect of varying intron sequence on gene expression in a high-throughput manner, in human cells, using tens...
Eukaryotic genomes are replete with repeated sequences in the form of transposable elements (TEs) dispersed across the genome or as satellite arrays, large stretches of tandemly repeated sequences. Many satellites clearly originated as TEs, but it is unclear how mobile genetic parasites can transform into megabase-sized tandem arrays. Comprehensive population genomic sampling is needed to determine the frequency and generative mechanisms of tandem TEs at all stages, from their initial formation to their subsequent expansion and maintenance...
Mutation or deletion of the U1 snRNP-associated factor LUC7L2 is associated with myeloid neoplasms, and LUC7L2 knockout alters cellular metabolism. Here, we show that members of the LUC7 protein family differentially regulate two major classes of 5' splice sites (5'SS) and broadly regulate mRNA splicing in both human cell lines and leukemias with LUC7L2 copy number variation. We describe distinctive 5'SS features of exons impacted by the three human LUC7 paralogs: LUC7L2 and LUC7L enhance splicing of "right-handed" 5'SS with stronger consensus matching on the intron side of the near invariant /GU,...
Drosophila telomeres have been maintained by three families of active transposable elements (TEs), HeT-A, TAHRE, and TART, collectively referred to as HTTs, for tens of millions of years, which contrasts with an unusually high degree of HTT interspecific variation. While the impacts of conflict and domestication are often invoked to explain HTT variation, the telomeres are unstable structures such that neutral mutational processes and evolutionary tradeoffs may also drive HTT evolution. We leveraged population genomic data to analyze...
Messenger RNA isoform differences are predominantly driven by alternative first, internal, and last exons. Despite the importance of classifying exons to understand isoform structure, few tools examine isoform-specific exon usage. We recently observed that transcription start sites often arise near internal exons, creating “hybrid” first/internal exons. To systematically detect hybrid exons, we built the hybrid-internal-terminal (HIT) pipeline to classify exons depending on their isoform-specific usage. On the basis of splice junction reads in RNA sequencing...
A Correction to this paper has been published: https://doi.org/10.1038/s41586-020-03067-w
RNA-binding proteins (RBPs) control the processing and function of cellular transcripts to effect post-transcriptional gene regulation. Sequence-specific binding of RBPs to millions of synthetic RNAs has been probed in vitro by RNA Bind-n-Seq (RBNS). Here we describe RBPamp, a bio-physically-based model of protein-RNA interactions and associated algorithm that inferred affinity spectra of 79 diverse human RBPs from RBNS data. RBPamp supports multiple motifs per RBP, models RBP concentration and site...
Typical RNAseq experiments uncover hundreds of splicing changes, reflecting underlying changes in splicing factor (SF) activity. Understanding transcriptomic variation in terms of SF activity requires elucidating the rules by which each SF impacts splicing. Here we present an interpretable regression model, KATMAP, which models transcriptome-wide binding and the resulting altered regulation. The regulatory principles KATMAP learns generalize to predict the SF’s regulation at individual exons, with potential for...
Transposable elements (TEs) are self-replicating "genetic parasites" ubiquitous to eukaryotic genomes. In addition to conflict between TEs and their host genomes, TEs of the same family are in competition with each other. They compete for the same genomic niches while experiencing the same regime of copy-number selection. This suggests that competition among TEs may favor the emergence of new variants that can outcompete the ancestral forms. To investigate the sequence evolution of TEs, we developed a method to infer clades: collections of TEs that share SNP variants and represent...
Alternative RNA processing is a major mechanism for diversifying the human transcriptome. Messenger RNA isoform differences are predominantly driven by alternative first exons, cassette internal exons and alternative last exons. Despite the importance of classifying exons to understand isoform structure, there is a lack of tools to look at isoform-specific exon usage using RNA-sequencing data. We recently observed that transcription start sites often arise near annotated internal exons, creating “hybrid” exons that can be used as both first or internal exons. To investigate...