NFDI4DS | UHH-SEMS - Publication Details

Discovery of target genes and pathways at GWAS loci by pooled single-cell CRISPR screens

OPENALEX - Publications

John Morris Christina M. Caragine Zharko Daniloski Júlia Domingo Timothy Barry and 11 more

Most variants associated with complex traits and diseases identified by genome-wide association studies (GWAS) map to noncoding regions of the genome with unknown effects. Using ancestrally diverse, biobank-scale GWAS data, massively parallel CRISPR screens, and single-cell transcriptomic and proteomic sequencing, we discovered 124...

10.1126/science.adh7699 article EN Science 2023-05-04

Multi-resolution localization of causal variants across the genome

Matteo Sesia Eugene Katsevich Stephen Bates Emmanuel J. Candès Chiara Sabatti

In the statistical analysis of genome-wide association data, it is challenging to precisely localize the variants that affect complex traits, due to linkage disequilibrium, and to maximize power while limiting spurious findings. Here we report on KnockoffZoom: a flexible method that localizes causal variants at multiple resolutions by testing the conditional associations of genetic segments of decreasing width, while provably controlling the false discovery rate. Our method utilizes artificial genotypes as negative controls and is equally...

10.1038/s41467-020-14791-2 article EN cc-by Nature Communications 2020-02-27

Covariance Matrix Estimation for the Cryo-EM Heterogeneity Problem

Eugene Katsevich Alexander Katsevich Amit Singer

In cryo-electron microscopy (cryo-EM), a microscope generates a top view of a sample of randomly oriented copies of a molecule. The problem of single particle reconstruction (SPR) from cryo-EM is to use the resulting set of noisy two-dimensional projection images taken at unknown directions to reconstruct the three-dimensional (3D) structure of the molecule. In some situations, the molecule under examination exhibits structural variability, which poses a fundamental challenge in SPR. The heterogeneity problem is the task of mapping the space of conformational states of a molecule. It...

10.1137/130935434 article EN SIAM Journal on Imaging Sciences 2015-01-01

Robust differential expression testing for single-cell CRISPR screens at low multiplicity of infection

Timothy Barry Kaishu Mason Kathryn Roeder Eugene Katsevich

Single-cell CRISPR screens (perturb-seq) link genetic perturbations to phenotypic changes in individual cells. The most fundamental task in perturb-seq analysis is to test for association between a perturbation and a count outcome, such as gene expression. We conduct the first-ever comprehensive benchmarking study of association testing methods for low multiplicity-of-infection (MOI) perturb-seq data, finding that existing methods produce excess false positives. We conduct an extensive empirical investigation of the data, identifying three core...

10.1186/s13059-024-03254-2 article EN cc-by Genome Biology 2024-05-17

Multilayer knockoff filter: Controlled variable selection at multiple resolutions

Eugene Katsevich Chiara Sabatti

We tackle the problem of selecting from among a large number of variables those that are "important" for an outcome. We consider situations where groups of variables are also of interest. For example, each variable might be a genetic polymorphism, and we want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms. In this context, to discover that a variable is relevant for the outcome implies discovering that the larger entity it represents is also important. To guarantee meaningful results with high...

10.1214/18-aoas1185 article EN other-oa The Annals of Applied Statistics 2019-03-01

GWAS-informed data integration and non-coding CRISPRi screen illuminate genetic etiology of bone mineral density

Mitchell Conery James A. Pippin Yadav Wagley Bao Khanh Trang Matthew C. Pahl and 11 more

Over 1,100 independent signals have been identified with genome-wide association studies (GWAS) for bone mineral density (BMD), a key risk factor for mortality-increasing fragility fractures; however, the effector gene(s) for most remain unknown. Informed by a variant-to-gene mapping strategy implicating 89 non-coding elements predicted to regulate osteoblast gene expression at BMD GWAS loci, we executed a single-cell CRISPRi screen in human fetal osteoblast 1.19 cells (hFOBs). The BMD relevance of hFOBs was supported...

10.1101/2024.03.19.585778 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-03-20

Stability of the interior problem with polynomial attenuation in the region of interest

Eugene Katsevich Alexander Katsevich Ge Wang

In many practical applications, it is desirable to solve the interior problem of tomography without requiring knowledge of the attenuation function f_a on an open set within the region of interest (ROI). It was proved recently that the interior problem has a unique solution if f_a is assumed to be piecewise polynomial on the ROI. In this paper, we tackle the related question of stability. It is well known that lambda tomography allows one to stably recover the locations and values of the jumps of f_a inside the ROI from only the local data. Hence, we consider here the case of a polynomial, rather than piecewise polynomial, f_a on the ROI. Assuming...

10.1088/0266-5611/28/6/065022 article EN Inverse Problems 2012-05-31

Fast and powerful conditional randomization testing via distillation

Molei Liu Eugene Katsevich Lucas Janson Aaditya Ramdas

We consider the problem of conditional independence testing: given a response Y and covariates (X,Z), we test the null hypothesis that Y⫫X∣Z. The conditional randomization test was recently proposed as a way to use distributional information about X∣Z to exactly and nonasymptotically control Type-I error using any test statistic in any dimensionality without assuming anything about Y∣(X,Z). This flexibility, in principle, allows one to derive powerful test statistics from complex prediction algorithms while maintaining statistical validity. Yet...

10.1093/biomet/asab039 article EN Biometrika 2021-07-02
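
The abstract above describes the conditional randomization test only in words. A minimal sketch of the basic resampling idea is given below; it is not the distillation procedure of the paper. Everything named here is an assumption made for illustration: the Gaussian model X | Z ~ Normal(0.5 Z, 1), the helper names `crt_p_value`, `sample_x_given_z`, `abs_cross_product` and `simulate`, and the choice of test statistic.

```python
import random


def crt_p_value(x, y, z, sample_x_given_z, statistic, num_resamples=200, seed=0):
    """Resampling p-value for H0: Y is independent of X given Z.

    Assumes the conditional law of X given Z is known and can be sampled
    through `sample_x_given_z` (a hypothetical helper for this sketch).
    """
    rng = random.Random(seed)
    observed = statistic(x, y, z)
    exceed = 0
    for _ in range(num_resamples):
        # Redraw X from its conditional law given Z, holding Y and Z fixed.
        x_tilde = [sample_x_given_z(zi, rng) for zi in z]
        if statistic(x_tilde, y, z) >= observed:
            exceed += 1
    # Counting the observed statistic among the draws keeps the p-value valid
    # in finite samples when the law of X given Z is correctly specified.
    return (1 + exceed) / (1 + num_resamples)


def sample_x_given_z(zi, rng):
    # Assumed model for illustration: X | Z ~ Normal(0.5 * Z, 1).
    return rng.gauss(0.5 * zi, 1.0)


def abs_cross_product(x, y, z):
    # Illustrative statistic: |sum of (X - E[X | Z]) * Y|; any statistic may be used.
    return abs(sum((xi - 0.5 * zi) * yi for xi, yi, zi in zip(x, y, z)))


def simulate(n, effect, seed):
    # Toy data: Y depends on X only through `effect`.
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [sample_x_given_z(zi, rng) for zi in z]
    y = [effect * xi + zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return x, y, z


x_alt, y_alt, z_alt = simulate(200, effect=2.0, seed=1)
p_alt = crt_p_value(x_alt, y_alt, z_alt, sample_x_given_z, abs_cross_product)

x_null, y_null, z_null = simulate(200, effect=0.0, seed=2)
p_null = crt_p_value(x_null, y_null, z_null, sample_x_given_z, abs_cross_product)
```

In this toy setup the strong effect should push `p_alt` to the smallest attainable value, 1 / (1 + num_resamples), while `p_null` is a draw from an approximately uniform distribution on the grid of attainable values.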

Simultaneous high-probability bounds on the false discovery proportion in structured, regression and online settings

Eugene Katsevich Aaditya Ramdas

While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (2011) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing bounds are based on closed testing using global null tests based on sorted p-values, we additionally consider the setting...

10.1214/19-aos1938 article EN The Annals of Statistics 2020-12-01

Discovery of target genes and pathways of blood trait loci using pooled CRISPR screens and single cell RNA sequencing

John Morris Zharko Daniloski Júlia Domingo Timothy Barry Marcello Ziosi and 8 more

The majority of variants associated with complex traits and common diseases identified by genome-wide association studies (GWAS) map to noncoding regions of the genome with unknown regulatory effects in cis and trans. By leveraging biobank-scale GWAS data, massively parallel CRISPR screens and single cell transcriptome sequencing, we discovered target genes of noncoding variants for blood trait loci. The closest gene was often the target gene, but this was not always the case. We also identified trans-effects networks of noncoding variants when cis target genes encoded transcription factors,...

10.1101/2021.04.07.438882 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-04-08

Robust differential expression testing for single-cell CRISPR screens at low multiplicity of infection

Timothy Barry Kaishu Mason Kathryn Roeder Eugene Katsevich

Single-cell CRISPR screens (perturb-seq) link genetic perturbations to phenotypic changes in individual cells. The most fundamental task in perturb-seq analysis is to test for association between a perturbation and a count outcome, such as gene expression. We conduct the first-ever comprehensive benchmarking study of association testing methods for low multiplicity-of-infection (MOI) perturb-seq data, finding that existing methods produce excess false positives. We conduct an extensive empirical investigation of the data, identifying three core challenges:...

10.1101/2023.05.15.540875 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2023-05-15

Exploratory Gene Ontology Analysis with Interactive Visualization

Junjie Zhu Qian Zhao Eugene Katsevich Chiara Sabatti

The Gene Ontology (GO) is a central resource for functional-genomics research. Scientists rely on the functional annotations in the GO for hypothesis generation and couple it with high-throughput biological data to enhance interpretation of results. At the same time, the sheer number of concepts (>30,000) and relationships (>70,000) presents a challenge: it can be difficult to draw a comprehensive picture of how certain concepts of interest might relate with the rest of the ontology structure. Here we present new visualization strategies...

10.1038/s41598-019-42178-x article EN cc-by Scientific Reports 2019-05-24

Filtering the Rejection Set While Preserving False Discovery Rate Control

Eugene Katsevich Chiara Sabatti Marina Bogomolov

Scientific hypotheses in a variety of applications have domain-specific structures, such as the tree structure of the international classification of diseases (ICD), the directed acyclic graph structure of the gene ontology (GO), or the spatial structure in genome-wide association studies. In the context of multiple testing, the resulting relationships among hypotheses can create redundancies among rejections that hinder interpretability. This leads to the practice of filtering rejection sets obtained from multiple testing procedures, which may in turn invalidate their inferential...

10.1080/01621459.2021.1920958 article EN cc-by-nc-nd Journal of the American Statistical Association 2021-05-05

Fast and Powerful Conditional Randomization Testing via Distillation

Molei Liu Eugene Katsevich Lucas Janson Aaditya Ramdas

We consider the problem of conditional independence testing: given a response Y and covariates (X,Z), we test the null hypothesis that Y is independent of X given Z. The conditional randomization test (CRT) was recently proposed as a way to use distributional information about X|Z to exactly (non-asymptotically) control Type-I error using any test statistic in any dimensionality without assuming anything about Y|(X,Z). This flexibility in principle allows one to derive powerful test statistics from complex prediction algorithms while maintaining...

10.48550/arxiv.2006.03980 preprint EN other-oa arXiv (Cornell University) 2020-01-01

On the power of conditional independence testing under model-X

Eugene Katsevich Aaditya Ramdas

For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative insights into the role of machine learning and providing evidence in favor of using likelihood-based statistics in practice. Focusing on the conditional randomization test (CRT), we find that its conditional mode of inference...

10.1214/22-ejs2085 article EN cc-by Electronic Journal of Statistics 2022-01-01

Conditional resampling improves calibration and sensitivity in single-cell CRISPR screen analysis

Timothy Barry Xuran Wang John Morris Kathryn Roeder Eugene Katsevich

Single-cell CRISPR screens are the most promising biotechnology for mapping regulatory elements to their target genes at genome-wide scale. However, the analysis of these screens presents significant statistical challenges. For example, technical factors like sequencing depth impact not only expression measurement but also perturbation detection, creating a confounding effect. We demonstrate on two recent high multiplicity of infection single-cell CRISPR screens how these challenges cause calibration issues among existing...

10.1101/2020.08.13.250092 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-08-14

Pooled CRISPR screens with joint single-nucleus chromatin accessibility and transcriptome profiling

Rachel Yan Alba Corman Lyla Katgara Wang Xiao Xinhe Xue and 11 more

10.1038/s41587-024-02475-x article EN Nature Biotechnology 2024-11-21

Large‐scale simultaneous inference under dependence

Jinjin Tian Xu Chen Eugene Katsevich Jelle J. Goeman Aaditya Ramdas

Simultaneous inference allows for the exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post hoc inference methods for the true discoveries must employ closed testing. In this paper, we investigate efficient closed testing with local tests of a special form: thresholding a function of sums of test scores for the individual hypotheses. Under this special design, we propose a new statistic that quantifies the cost of multiplicity adjustments, and we develop fast (mostly linear‐time) algorithms...

10.1111/sjos.12614 article EN Scandinavian Journal of Statistics 2022-09-06

Multi-resolution localization of causal variants across the genome

Matteo Sesia Eugene Katsevich Stephen Bates Emmanuel J. Candès Chiara Sabatti

We present KnockoffZoom, a flexible method for the genetic mapping of complex traits at multiple resolutions. KnockoffZoom localizes causal variants by testing the conditional associations of genetic segments of decreasing width while provably controlling the false discovery rate using artificial genotypes as negative controls. Our method is equally valid for quantitative and binary phenotypes, making no assumptions about their genetic architectures. Instead, we rely on well-established genetic models of linkage disequilibrium. We demonstrate that...

10.1101/631390 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-05-08

Exploratory Gene Ontology Analysis with Interactive Visualization

Junjie Zhu Qian Zhao Eugene Katsevich Chiara Sabatti

The Gene Ontology (GO) is a central resource for functional-genomics research. Scientists rely on the functional annotations in the GO for hypothesis generation and couple it with high-throughput biological data to enhance interpretation of results. At the same time, the sheer number of concepts (>30,000) and relationships (>70,000) presents a challenge: it can be difficult to draw a comprehensive picture of how certain concepts of interest might relate with the rest of the ontology structure. Here we present new visualization strategies...

10.1101/436741 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2018-10-05

Multilayer Knockoff Filter: Controlled variable selection at multiple resolutions

Eugene Katsevich Chiara Sabatti

We tackle the problem of selecting from among a large number of variables those that are 'important' for an outcome. We consider situations where groups of variables are also of interest in their own right. For example, each variable might be a genetic polymorphism and we want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms. Or, variables might quantify various aspects of the functioning of individual internet servers owned by a company, and we might be interested in assessing the importance of each server as a whole...

10.48550/arxiv.1706.09375 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The saddlepoint approximation for averages of conditionally independent random variables

Ziang Niu Jyotishka Ray Choudhury Eugene Katsevich

Motivated by the application of saddlepoint approximations to resampling-based statistical tests, we prove that a Lugannani-Rice style approximation for conditional tail probabilities of averages of conditionally independent random variables has vanishing relative error. We also provide a general condition on the existence and uniqueness of the solution to the corresponding saddlepoint equation. The results are valid under a broad class of distributions involving no restrictions on the smoothness of the distribution function. The derived saddlepoint approximation formula can...

10.48550/arxiv.2407.08915 preprint EN arXiv (Cornell University) 2024-07-11