- Single-cell and spatial transcriptomics
- Extracellular vesicles in disease
- Gene expression and cancer classification
- Genomics and Phylogenetic Studies
- Immune cells in cancer
- RNA and protein synthesis mechanisms
- Advanced Fluorescence Microscopy Techniques
- Cell Image Analysis Techniques
- RNA Research and Splicing
- RNA modifications and cancer
- Molecular Biology Techniques and Applications
- Advanced Proteomics Techniques and Applications
- Viral Infectious Diseases and Gene Expression in Insects
- Cell Adhesion Molecules Research
Somerville Hospital
2024
California Institute of Technology
2022-2023
Single-cell genomics analysis requires normalization of feature counts that stabilizes variance while accounting for variable cell sequencing depth. We discuss some of the trade-offs present with current widely used methods, and analyze their performance on 526 single-cell RNA-seq datasets. The results lead us to recommend proportional fitting prior to log transformation, followed by an additional proportional fitting.
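The recommended sequence (proportional fitting, log transformation, then a second fitting) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the matrix is assumed to be cells by features, the fitting target is assumed to be the mean cell depth, and the function names `proportional_fit` and `pf_log1p_pf` are hypothetical.

```python
import numpy as np

def proportional_fit(counts, target=None):
    """Scale each row (cell) so that its total equals a common target depth.

    Assumption: rows are cells, columns are features. If no target is given,
    the mean row total is used (an illustrative choice, not taken from the text).
    """
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=1, keepdims=True)
    if target is None:
        target = depth.mean()
    # Avoid division by zero for empty cells; such rows stay all-zero.
    safe_depth = np.where(depth == 0, 1.0, depth)
    return counts * (target / safe_depth)

def pf_log1p_pf(counts):
    """Proportional fitting, log1p transform, then a second proportional fitting."""
    fitted = proportional_fit(counts)
    logged = np.log1p(fitted)
    return proportional_fit(logged)

if __name__ == "__main__":
    raw = np.array([[1, 2, 3], [10, 20, 30], [0, 5, 5]])
    normalized = pf_log1p_pf(raw)
    print(normalized.sum(axis=1))  # every cell ends with the same total
```

In this sketch the second fitting restores equal per-cell totals, which the log transformation alone does not preserve.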
Abstract Motivation: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. Results: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic...
ABSTRACT Translation of mRNAs containing premature termination codons (PTCs) results in truncated protein products with deleterious effects. Nonsense-mediated decay (NMD) is a surveillance pathway responsible for detecting PTC-containing transcripts. Although the molecular mechanisms governing mRNA degradation have been extensively studied, the fate of the nascent protein product remains largely uncharacterized. Here, we use a fluorescent reporter system in mammalian cells to reveal a selective degradation pathway specifically targeting an NMD...
Abstract Cell atlas projects curate representative datasets, cell types, and marker genes for tissues across an organism. Despite their ubiquity, atlas projects rely on duplicated and manual effort to annotate cell types. The size of atlases, coupled with a lack of data-compatible tools, makes reprocessing and analysis of their data near-impossible. To overcome these challenges, we present a collection of data, algorithms, and tools to automate cataloging and analyzing cell types across tissues in an organism, and demonstrate its utility in building a human atlas.
Abstract We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. The code can be found here: https://github.com/pachterlab/ffq.
Abstract We describe an open source Human Commons Cell Atlas comprising 2.9 million cells across 27 tissues that can be easily updated and is structured to facilitate custom analyses. To showcase the flexibility of the atlas, we demonstrate that it can be used to study isoforms of genes at cell resolution. In particular, we study the cell type specificity of isoforms of OAS1, which has been shown to offer SARS-CoV-2 protection in certain individuals that display higher expression of the p46 isoform. Using our commons cell atlas, we localize the OAS1 p44b isoform to the testis,...
Spatial homogeneous regions (SHRs) in tissues are domains that are homogeneous with respect to cell type composition. We present a method for identifying SHRs using spatial transcriptomics data, and demonstrate that it is efficient and effective at finding SHRs for a wide variety of tissue types. The method is implemented in a tool called concordex, which relies on analysis of k-nearest-neighbor (kNN) graphs. concordex is also useful for analysis of non-spatial transcriptomics data, and can elucidate the extent of concordance between partitions of cells derived from clustering algorithms,...
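The general idea of comparing a partition of cells against a kNN graph can be illustrated with a short sketch. This is not the concordex implementation or its API; it is a brute-force example under stated assumptions, and the function names `knn_indices`, `neighbor_label_fractions`, and `same_label_fraction` are hypothetical. For each cell it computes the fraction of its k nearest neighbors carrying each label, then averages the fraction that share the cell's own label.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (brute force, Euclidean)."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def neighbor_label_fractions(points, labels, k):
    """For each cell, the fraction of its k neighbors that carry each label."""
    labels = np.asarray(labels).ravel()
    classes, codes = np.unique(labels, return_inverse=True)
    neighbor_codes = codes[knn_indices(points, k)]  # shape (n_cells, k)
    fractions = np.stack(
        [(neighbor_codes == c).mean(axis=1) for c in range(len(classes))],
        axis=1,
    )
    return classes, codes, fractions

def same_label_fraction(points, labels, k):
    """Mean fraction of neighbors sharing each cell's own label (illustrative score)."""
    _, codes, fractions = neighbor_label_fractions(points, labels, k)
    return fractions[np.arange(len(codes)), codes].mean()

if __name__ == "__main__":
    coords = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    print(same_label_fraction(coords, ["a", "a", "a", "b", "b", "b"], k=2))
    print(same_label_fraction(coords, ["a", "b", "a", "b", "a", "b"], k=2))
```

A score near 1 indicates that the labels agree with the neighborhood structure, while lower values indicate that neighbors frequently carry different labels. The per-cell fraction vectors could also be clustered to group cells with similar neighborhood composition, which is the kind of region-finding the abstract describes.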