NFDI4DS | UHH-SEMS - Publication Details

Alevin efficiently estimates accurate gene abundances from dscRNA-seq data

OPENALEX - Publications

Avi Srivastava Laraib Malik Tom Smith Ian Sudbery Rob Patro

We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin's approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard...

10.1186/s13059-019-1670-y article EN cc-by Genome biology 2019-03-27

Alignment and mapping methodology influence transcript abundance estimation

Avi Srivastava Laraib Malik Hirak Sarkar Mohsen Zakeri Fatemeh Almodaresi and 4 more

Background: The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy. Results: We investigate the influence of mapping and alignment on the accuracy of transcript quantification in both simulated and experimental data, as well as the effect on subsequent differential expression analysis. We observe that, even when the quantification model itself is held fixed, the effect of choosing a different...

10.1186/s13059-020-02151-8 article EN cc-by Genome biology 2020-09-07

Rich Chromatin Structure Prediction from Hi-C Data

Laraib Malik Rob Patro

Recent studies involving the 3-dimensional conformation of chromatin have revealed the important role it has to play in different processes within the cell. These studies have also led to the discovery of densely interacting segments of the chromosome, called topologically associating domains. The accurate identification of these domains from Hi-C interaction data is an interesting and important computational problem for which numerous methods have been proposed. Unfortunately, most existing algorithms designed to identify these domains assume that they are...

10.1109/tcbb.2018.2851200 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018-06-28

Alignment and mapping methodology influence transcript abundance estimation

Avi Srivastava Laraib Malik Hirak Sarkar Mohsen Zakeri Fatemeh Almodaresi and 4 more

Background: The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy. Results: We investigate the influence of mapping and alignment on the accuracy of transcript quantification in both simulated and experimental data, as well as the effect on subsequent differential expression analysis. We observe that, even when the quantification model itself is held fixed, the effect of choosing a different...

10.1101/657874 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-06-03

A Bayesian framework for inter-cellular information sharing improves dscRNA-seq quantification

Avi Srivastava Laraib Malik Hirak Sarkar Rob Patro

Motivation: Droplet-based single-cell RNA-seq (dscRNA-seq) data are being generated at an unprecedented pace, and the accurate estimation of gene-level abundances for each cell is a crucial first step in most dscRNA-seq analyses. When pre-processing the raw dscRNA-seq data to generate a count matrix, care must be taken to account for the potentially large number of multi-mapping locations per read. The sparsity of dscRNA-seq data, and the strong 3’ sampling bias, makes it difficult to disambiguate cases where there is no uniquely mapping read to any...

10.1093/bioinformatics/btaa450 article EN cc-by-nc Bioinformatics 2020-06-23

Grouper: graph-based clustering and annotation for improved de novo transcriptome analysis

Laraib Malik Fatemeh Almodaresi Rob Patro

De novo transcriptome analysis using RNA-seq offers a promising means to study gene expression in non-model organisms. Yet, the difficulty of transcriptome assembly means that the contigs provided by the assembler often represent a fractured and incomplete view of the transcriptome, complicating downstream analysis. We introduce Grouper, a new method for clustering contigs from de novo assemblies that are likely to belong to the same transcripts and genes; these groups can subsequently be analyzed more robustly. When provided with access to the genome of a related organism, Grouper...

10.1093/bioinformatics/bty378 article EN Bioinformatics 2018-05-03

RTA Occupancy of the Origin of Lytic Replication during Murine Gammaherpesvirus 68 Reactivation from B Cell Latency

Alexis Santana Darby G. Oldenburg Varvara Kirillov Laraib Malik Qiwen Dong and 4 more

RTA, the viral Replication and Transcription Activator, is essential for rhadinovirus lytic gene expression upon de novo infection and reactivation from latency. Lipopolysaccharide (LPS)/toll-like receptor (TLR)4 engagement enhances rhadinovirus reactivation. We developed two new systems to examine the interaction of RTA with host NF-kappaB (NF-κB) signaling during murine gammaherpesvirus 68 (MHV68) infection: a latent B cell line (HE-RIT) inducible for RTA-Flag expression and virus reactivation; and a recombinant virus (MHV68-RTA-Bio) that...

10.3390/pathogens6010009 article EN cc-by Pathogens 2017-02-16

Combinatorial Loss of the Enzymatic Activities of Viral Uracil-DNA Glycosylase and Viral dUTPase Impairs Murine Gammaherpesvirus Pathogenesis and Leads to Increased Recombination-Based Deletion in the Viral Genome

Qiwen Dong Kyle R. Smith Darby G. Oldenburg Maxwell Shapiro William R. Schutt and 7 more

Misincorporation of uracil or spontaneous cytidine deamination is a common mutagenic insult to DNA. Herpesviruses encode a viral uracil-DNA glycosylase (vUNG) and a viral dUTPase (vDUT), each with enzymatic and nonenzymatic functions. However, the coordinated roles of these enzymatic activities in gammaherpesvirus pathogenesis and viral genomic stability have not been defined. In addition, potential compensation by the host UNG has not been examined in vivo. The genetic tractability of the murine gammaherpesvirus 68 (MHV68) system enabled us to delineate the contribution...

10.1128/mbio.01831-18 article EN cc-by mBio 2018-10-29

Towards Selective-Alignment

Hirak Sarkar Mohsen Zakeri Laraib Malik Rob Patro

We introduce an algorithm for selectively aligning high-throughput sequencing reads to a transcriptome, with the goal of improving transcript-level quantification in difficult or adversarial scenarios. This algorithm attempts to bridge the gap between fast non-alignment-based algorithms and more traditional alignment procedures. We adopt a hybrid approach that is able to produce accurate alignments while still retaining much of the efficiency of non-alignment-based algorithms. To achieve this, we combine edit-distance-based verification...

10.1145/3233547.3233589 article EN 2018-08-15

Accurate, Fast and Lightweight Clustering of de novo Transcriptomes using Fragment Equivalence Classes

Avi Srivastava Hirak Sarkar Laraib Malik Robert Patro

Motivation: De novo transcriptome assembly of non-model organisms is the first major step for many RNA-seq analysis tasks. Current methods for de novo assembly often report a large number of contiguous sequences (contigs), which may be fractured and incomplete instead of full-length transcripts. Dealing with such a large number of contigs can slow and complicate downstream analysis. Results: We present a method for clustering contigs from de novo transcriptome assemblies based upon the relationships exposed by multi-mapping sequencing fragments. Specifically, we cast the problem...

10.48550/arxiv.1604.03250 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Rich chromatin structure prediction from Hi-C data

Laraib Malik Rob Patro

Recent studies involving the 3-dimensional conformation of chromatin have revealed the important role it has to play in different processes within the cell. These studies have also led to the discovery of densely interacting segments of the chromosome, called topologically associating domains. The accurate identification of these domains from Hi-C interaction data is an interesting and important computational problem for which numerous methods have been proposed. Unfortunately, most existing algorithms designed to identify these domains assume that they...

10.1101/032953 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2015-11-26

Rich Chromatin Structure Prediction from Hi-C Data

Laraib Malik Rob Patro

Recent studies involving the 3-dimensional conformation of chromatin have revealed the important role it has to play in different processes within the cell. These studies have also led to the discovery of densely interacting segments of the chromosome, called topologically associating domains. The accurate identification of these domains from Hi-C interaction data is an interesting and important computational problem for which numerous methods have been proposed. Unfortunately, most existing algorithms designed to identify these domains assume that they are...

10.1145/3107411.3107448 article EN 2017-08-20

Alevin efficiently estimates accurate gene abundances from dscRNA-seq data

Avi Srivastava Laraib Malik Tom Smith Ian Sudbery Rob Patro

We introduce alevin, a fast end-to-end pipeline to process droplet-based single cell RNA sequencing data, which performs cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin’s approach to UMI deduplication accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools which discard gene-ambiguous reads, and improves the accuracy of gene abundance estimates.

10.1101/335000 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-06-01

An Image-based Human Physical Activities Recognition in an Indoor Environment

Farman Ullah Asif Iqbal Ajmal Khan Rida Khan Laraib Malik and 1 more

In this paper, we propose real-time image-based recognition of human activities from a series of images considering different human actions performed in an indoor environment. The proposed image-based human activity recognition (IHAR) system can be utilized for assisting the life of disabled persons, surveillance and human tracking, human computer interaction, and efficient resource utilization. The IHAR system consists of closed-circuit television (CCTV) camera based image acquisitioning, various filtering based image enhancement, principal component...

10.1109/ictc49870.2020.9289314 article EN 2021 International Conference on Information and Communication Technology Convergence (ICTC) 2020-10-21

Towards selective-alignment: Bridging the accuracy gap between alignment-based and alignment-free transcript quantification

Hirak Sarkar Mohsen Zakeri Laraib Malik Rob Patro

Motivation: We introduce an algorithm for selectively aligning high-throughput sequencing reads to a transcriptome, with the goal of improving transcript-level quantification. This algorithm attempts to bridge the gap between fast “mapping” algorithms and more traditional alignment procedures. Results: We adopt a hybrid approach that is able to increase mapping accuracy while still retaining much of the efficiency of fast mapping algorithms. To achieve this, we introduce a new approach that explores the candidate search space with high sensitivity as well as a collection...

10.1101/138800 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2017-05-17

Graph regularized, semi-supervised learning improves annotation of de novo transcriptomes

Laraib Malik Shravya Thatipally Nikhil Junneti Rob Patro

We present a new method, GRASS, for improving an initial annotation of de novo transcriptomes. GRASS makes the shared-sequence relationships between assembled contigs explicit in the form of a graph, and applies an algorithm that performs label propagation to transfer annotations between related contigs and modifies the graph topology iteratively. We demonstrate that GRASS increases the completeness and accuracy of the initial annotation, allows for improved differential analysis, and is very efficient, typically taking 10s of minutes.

10.1101/089417 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-11-25

A Bayesian framework for inter-cellular information sharing improves dscRNA-seq quantification

Avi Srivastava Laraib Malik Hirak Sarkar Rob Patro

Motivation: Droplet based single cell RNA-seq (dscRNA-seq) data is being generated at an unprecedented pace, and the accurate estimation of gene level abundances for each cell is a crucial first step in most dscRNA-seq analyses. When preprocessing the raw dscRNA-seq data to generate a count matrix, care must be taken to account for the potentially large number of multi-mapping locations per read. The sparsity of dscRNA-seq data, and the strong 3’ sampling bias, makes it difficult to disambiguate cases where there is no uniquely mapping read to any of the candidate...

10.1101/2020.04.10.035899 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-04-13