Unsupervised reference-free inference reveals unrecognized regulated transcriptomic complexity in human single cells

DOI: 10.7554/elife.105979 Publication Date: 2025-05-22T14:30:37Z
ABSTRACT
Myriad mechanisms diversify the sequence content of eukaryotic transcripts at both the DNA and RNA levels, leading to profound functional consequences. Examples of this diversity include splicing and V(D)J recombination. Currently, these are detected using fragmented bioinformatic tools that require predefining a form of transcript diversification and rely on alignment to an incomplete reference genome, filtering out unaligned sequences that are potentially crucial for novel discoveries. Here, we present SPLASH+, significantly advancing the biological discovery possible with SPLASH, our recently introduced efficient, reference-free statistical approach. Integrating a micro-assembly and interpretation framework, SPLASH+ enables new discoveries, including broad examples of transcript diversification in single cells de novo, without the need for cell type metadata, which is impossible with current algorithms. Applied to 10,326 primary human single cells across 19 tissues profiled with SmartSeq2, SPLASH+ discovers a set of histone regulators with highly conserved intronic regions that are themselves subject to complex regulation. Additionally, it reveals unreported diversity in the heat shock protein HSP90AA1, as well as centromeric expression, V(D)J recombination, RNA editing, and repeat expansion, all missed by existing methods. SPLASH+ enables discovery of an unprecedented breadth of regulation through an automated paradigm of unbiased transcriptomic analysis.