NFDI4DS | UHH-SEMS - Publication Details

Richard J. Samworth

ORCID: 0000-0003-2426-4679

About

Contact & Profiles

A5003721823

Research Areas

Statistical Methods and Inference
Bayesian Methods and Mixture Models
Advanced Statistical Methods and Models
Statistical Methods and Bayesian Inference
Sparse and Compressive Sensing Techniques
Machine Learning and Algorithms
Markov Chains and Monte Carlo Methods
Blind Source Separation Techniques
Random Matrices and Applications
Neural Networks and Applications
Face and Expression Recognition
Advanced Statistical Process Monitoring
Bayesian Modeling and Causal Inference
Statistical Methods in Clinical Trials
Advanced Causal Inference Techniques
Control Systems and Identification
SARS-CoV-2 and COVID-19 Research
Gaussian Processes and Bayesian Inference
Gene expression and cancer classification
Domain Adaptation and Few-Shot Learning
Complex Systems and Time Series Analysis
Statistical and numerical algorithms
Machine Learning and Data Classification
Multiple Myeloma Research and Treatments
Sensory Analysis and Statistical Methods

University of Cambridge
2015-2024

University of Sheffield
2024

University of Edinburgh
2020

University of Chicago
2019

University of Wisconsin–Madison
2018

University of Michigan
2018

Columbia University
2018

Sungshin Women's University
2018

Statistical Service
2018

California Institute of Technology
2016

A useful variant of the Davis–Kahan theorem for statisticians

OPENALEX - Publications

Yi Yu, Tengyao Wang, Richard J. Samworth

Journal article. Y. Yu, T. Wang and R. J. Samworth, Statistical Laboratory, University of Cambridge, Wilberforce Road, Cambridge CB3 0WB, U.K. (y.yu@statslab.cam.ac.uk, t.wang@statslab.cam.ac.uk, r.samworth@statslab.cam.ac.uk). Biometrika, Volume 102, Issue 2, June 2015, Pages 315–323, https://doi.org/10.1093/biomet/asv008. Published: 28 April 2015.

10.1093/biomet/asv008 article EN Biometrika 2015-04-28

Optimal weighted nearest neighbour classifiers

Richard J. Samworth

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends only on the dimension d of the feature vectors, and not on the underlying populations. The improvement is greatest when d=4, but thereafter decreases as $d\rightarrow\infty$. The popular bagged nearest neighbour classifier can also be regarded...

10.1214/12-aos1049 article EN other-oa The Annals of Statistics 2012-10-01
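
As a rough illustration of the classifier this abstract describes, the sketch below applies a weighted nearest-neighbour vote to a two-class problem. The function names and both weight schemes are assumptions made for illustration; in particular, `decreasing_weights` is not the asymptotically optimal weight vector derived in the paper.

```python
import math


def weighted_nn_classify(train_x, train_y, query, weights):
    """Classify `query` as 1 or 0 with a weighted nearest-neighbour vote.

    weights[i] is the nonnegative weight given to the (i+1)-th nearest
    training point; the weights are assumed to sum to 1.
    """
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    # Weighted proportion of class-1 labels among the ranked neighbours.
    vote = sum(w * train_y[i] for w, i in zip(weights, order))
    return 1 if vote >= 0.5 else 0


def uniform_weights(k):
    """Unweighted k-nearest-neighbour rule as a special case: 1/k each."""
    return [1.0 / k] * k


def decreasing_weights(k):
    """Illustrative weights, linearly decreasing in rank (hypothetical)."""
    raw = [k - i for i in range(k)]
    total = sum(raw)
    return [r / total for r in raw]


# Toy data: three class-0 points near the origin, three class-1 points near (1, 1).
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]
near_origin = weighted_nn_classify(train_x, train_y, (0.05, 0.05), uniform_weights(3))
near_one = weighted_nn_classify(train_x, train_y, (0.95, 1.0), decreasing_weights(3))
```

Passing `uniform_weights(k)` recovers the ordinary k-nearest-neighbour rule, so the two schemes can be compared on the same data.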

Variable Selection with Error Control: Another Look at Stability Selection

Rajen D. Shah, Richard J. Samworth

Stability Selection was recently introduced by Meinshausen and Bühlmann (2010) as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data. We introduce a variant, called Complementary Pairs Stability Selection (CPSS), and derive bounds both on the expected number of variables included by CPSS that have low selection probability under the original procedure, and on the expected number of high selection probability variables that are excluded. These results require no (e.g. exchangeability) assumptions on the underlying...

10.1111/j.1467-9868.2011.01034.x article EN Journal of the Royal Statistical Society Series B (Statistical Methodology) 2012-06-21
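
The subsampling scheme in this abstract can be sketched as follows: the data are repeatedly split into two disjoint halves, a base selection procedure is run on each half, and variables are kept when their selection proportion reaches a threshold. This is a minimal sketch under stated assumptions. The base selector `top_q_by_correlation`, the default of 50 pairs and the threshold 0.8 are hypothetical choices, and the error bounds derived in the paper are not computed here.

```python
import random


def top_q_by_correlation(rows, y, q):
    """Hypothetical base selector: the q variables with largest |correlation| with y."""
    n, p = len(rows), len(rows[0])
    y_mean = sum(y) / n
    var_y = sum((b - y_mean) ** 2 for b in y)
    scores = []
    for j in range(p):
        col = [r[j] for r in rows]
        x_mean = sum(col) / n
        cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(col, y))
        var_x = sum((a - x_mean) ** 2 for a in col)
        denom = (var_x * var_y) ** 0.5
        scores.append(abs(cov) / denom if denom > 0 else 0.0)
    return set(sorted(range(p), key=lambda j: -scores[j])[:q])


def cpss(rows, y, select, n_pairs=50, threshold=0.8, seed=0):
    """Complementary-pairs subsampling: return (selected set, selection proportions)."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    half = n // 2
    counts = [0] * p
    for _ in range(n_pairs):
        idx = list(range(n))
        rng.shuffle(idx)
        # Two disjoint subsamples of size floor(n/2) form one complementary pair.
        for part in (idx[:half], idx[half:2 * half]):
            chosen = select([rows[i] for i in part], [y[i] for i in part])
            for j in chosen:
                counts[j] += 1
    # Proportion of the 2 * n_pairs subsamples in which each variable was selected.
    props = [c / (2 * n_pairs) for c in counts]
    return {j for j in range(p) if props[j] >= threshold}, props


# Synthetic example: only variable 0 influences the response.
data_rng = random.Random(1)
rows = [[data_rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
y = [2.0 * r[0] + data_rng.gauss(0, 1) for r in rows]
selected, props = cpss(rows, y, lambda a, b: top_q_by_correlation(a, b, q=2))
```

Any procedure that maps a subsample to a set of variable indices can be passed as `select`, which mirrors the abstract's point that the technique wraps an arbitrary selection algorithm.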

High Dimensional Change Point Estimation via Sparse Projection

Tengyao Wang, Richard J. Samworth

Summary. Change points are a very common feature of ‘big data’ that arrive in the form of a data stream. We study high dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the co-ordinates. The challenge is to borrow strength across the co-ordinates to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called inspect for estimation of the change points: first, we argue that a good projection direction can be obtained as the leading left singular...

10.1111/rssb.12243 article EN cc-by Journal of the Royal Statistical Society Series B (Statistical Methodology) 2017-08-11

Choice of neighbor order in nearest-neighbor classification

Peter Hall, Byeong U. Park, Richard J. Samworth

The kth-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of k; and by the absence of techniques for empirical choice of k. In the present paper we detail the way in which the value of k determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are “assigned” to...

10.1214/07-aos537 article EN The Annals of Statistics 2008-10-01

Maximum Likelihood Estimation of a Multi-Dimensional Log-Concave Density

Madeleine Cule, Richard J. Samworth, Michael Stewart

Summary. Let $X_{1},\ldots,X_{n}$ be independent and identically distributed random vectors with a (Lebesgue) density $f$. We first prove that, with probability 1, there is a unique log-concave maximum likelihood estimator $\hat{f}_{n}$ of $f$. The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we can reformulate the issue of computing $\hat{f}_{n}$ in terms of a non-differentiable convex optimization problem, and thus combine techniques...

10.1111/j.1467-9868.2010.00753.x article EN Journal of the Royal Statistical Society Series B (Statistical Methodology) 2010-10-12

Neogene overflow of Northern Component Water at the Greenland‐Scotland Ridge

H. Poore, Richard J. Samworth, Nicky White, Stephen Jones, I. Nick McCave

In the North Atlantic Ocean, flow of North Atlantic Deep Water (NADW), and of its ancient counterpart Northern Component Water (NCW), across the Greenland‐Scotland Ridge (GSR) is thought to have played an important role in ocean circulation. Over the last 60 Ma, the Iceland Plume has dynamically supported an area which encompasses the GSR. Consequently, bathymetry of the GSR has varied with time due to a combination of lithospheric plate cooling and fluctuations in the temperature and buoyancy within the underlying convecting mantle. Here, we reassess the importance...

10.1029/2005gc001085 article EN Geochemistry Geophysics Geosystems 2006-06-01

Statistical and computational trade-offs in estimation of sparse principal components

Tengyao Wang, Quentin Berthet, Richard J. Samworth

In recent years, sparse principal component analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or sub-Gaussian classes. In this paper, we show that,...

10.1214/15-aos1369 article EN other-oa The Annals of Statistics 2016-09-12

Random-projection Ensemble Classification

Timothy I. Cannings, Richard J. Samworth

Summary. We introduce a very general method for high dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower dimensional space. In one special case that we study in detail, the random projections are divided into disjoint groups, and within each group we select the projection yielding the smallest estimate of the test error. Our random-projection ensemble classifier then aggregates the results of applying the base classifier on the selected projections, with a data-driven voting threshold to determine the final...

10.1111/rssb.12228 article EN cc-by Journal of the Royal Statistical Society Series B (Statistical Methodology) 2017-06-30

Efficient multivariate entropy estimation via $k$-nearest neighbour distances

Thomas B. Berrett, Richard J. Samworth, Ming Yuan

Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. To this end, we study weighted averages of the estimators originally proposed by Kozachenko and Leonenko [Probl. Inform. Transm. 23 (1987), 95–101], based on the $k$-nearest neighbour distances of a sample of $n$ independent and identically distributed...

10.1214/18-aos1688 article EN The Annals of Statistics 2018-11-30
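
For orientation, the sketch below computes the classical, unweighted Kozachenko–Leonenko entropy estimate from $k$-nearest neighbour distances. It is not the weighted average studied in the paper. The function names, the default `k=3` and the brute-force distance search are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function at a positive integer k: -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def kl_entropy(points, k=3):
    """Unweighted Kozachenko-Leonenko entropy estimate (in nats).

    Averages log(rho_i^d * V_d * (n - 1)) - digamma(k) over the sample,
    where rho_i is the distance from point i to its k-th nearest neighbour
    and V_d is the volume of the unit ball in R^d.
    """
    n, d = len(points), len(points[0])
    unit_ball = math.pi ** (d / 2) / math.gamma(1 + d / 2)
    total = 0.0
    for i, x in enumerate(points):
        # Brute-force search: distances from x to every other sample point.
        dists = sorted(math.dist(x, z) for j, z in enumerate(points) if j != i)
        rho = dists[k - 1]
        total += d * math.log(rho) + math.log(unit_ball * (n - 1)) - digamma_int(k)
    return total / n


# Example: a standard normal sample in one dimension, whose true
# entropy is 0.5 * log(2 * pi * e).
rng = random.Random(0)
sample = [(rng.gauss(0, 1),) for _ in range(500)]
estimate = kl_entropy(sample, k=3)
true_entropy = 0.5 * math.log(2 * math.pi * math.e)
```

The brute-force search costs on the order of $n^{2}$ distance evaluations, which is fine for a small demonstration; a spatial index would be the natural replacement for larger samples.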

Diffuse Large B-Cell Lymphoma Classification System That Associates Normal B-Cell Subset Phenotypes With Prognosis

Karen Dybkær, Martin Bøgsted, Steffen Falgreen, Julie Støve Bødker, Malene Krag Kjeldsen and 18 more

Purpose: Current diagnostic tests for diffuse large B-cell lymphoma use the updated WHO criteria based on biologic, morphologic, and clinical heterogeneity. We propose a refined classification system based on subset-specific B-cell–associated gene signatures (BAGS) in the normal B-cell hierarchy, hypothesizing that it can provide new biologic insight and prognostic value. Patients and Methods: We combined fluorescence-activated cell sorting, gene expression profiling, and statistical modeling to generate BAGS for naive, centrocyte,...

10.1200/jco.2014.57.7080 article EN Journal of Clinical Oncology 2015-03-24

Robust inference with knockoffs

Rina Foygel Barber, Emmanuel J. Candès, Richard J. Samworth

We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_{1},\ldots ,X_{p}$. We wish to do so while offering finite-sample guarantees about the fraction of false positives, that is, selected variables $X_{j}$ that in fact have no effect on $Y$ after the other features are known. When the number of features $p$ is large (perhaps even larger than the sample size $n$), and we have no prior knowledge regarding the type of dependence between $Y$ and $X$, the model-X knockoffs framework nonetheless...

10.1214/19-aos1852 article EN The Annals of Statistics 2020-06-01

Nonparametric independence testing via mutual information

Thomas B. Berrett, Richard J. Samworth

Summary. We propose a test of independence of two multivariate random vectors, given a sample from the underlying population. Our approach is based on the estimation of mutual information, whose decomposition into joint and marginal entropies facilitates the use of recently developed efficient entropy estimators derived from nearest neighbour distances. The proposed critical values may be obtained by simulation in the case where an approximation to one marginal is available or by permuting the data otherwise. This facilitates size guarantees, and we...

10.1093/biomet/asz024 article EN Biometrika 2019-04-15

The Conditional Permutation Test for Independence While Controlling for Confounders

Thomas B. Berrett, Yi Wang, Rina Foygel Barber, Richard J. Samworth

Summary. We propose a general new method, the conditional permutation test, for testing the conditional independence of variables X and Y given a potentially high dimensional random vector Z that may contain confounding factors. The test permutes entries of X non-uniformly, to respect the existing dependence between X and Z and thus to account for the presence of these confounders. Like the conditional randomization test of Candès and co-workers in 2018, our test relies on the availability of an approximation to the distribution of X|Z; whereas their test uses this estimate to draw new X-values, we use...

10.1111/rssb.12340 article EN cc-by Journal of the Royal Statistical Society Series B (Statistical Methodology) 2019-10-21

Single-dose BNT162b2 vaccine protects against asymptomatic SARS-CoV-2 infection

Nick K Jones, Lucy Rivett, Shaun R. Seaman, Richard J. Samworth, Ben Warne and 69 more

The BNT162b2 mRNA COVID-19 vaccine (Pfizer-BioNTech) is being utilised internationally for mass vaccination. Evidence of single-dose protection against symptomatic disease has encouraged some countries to opt for delayed booster doses of BNT162b2, but the effect of this strategy on rates of asymptomatic SARS-CoV-2 infection remains unknown. We previously demonstrated frequent pauci- and asymptomatic SARS-CoV-2 infection amongst healthcare workers (HCWs) during the UK’s first wave of the pandemic, using a comprehensive PCR-based HCW screening...

10.7554/elife.68808 article EN cc-by eLife 2021-04-08

Genomic epidemiology of SARS-CoV-2 in a UK university identifies dynamics of transmission

Dinesh Aggarwal, Ben Warne, Aminu S. Jahun, William L. Hamilton, Thomas Fieldman and 95 more

Understanding SARS-CoV-2 transmission in higher education settings is important to limit spread between students, and into at-risk populations. In this study, we sequenced 482 SARS-CoV-2 isolates from the University of Cambridge from 5 October to 6 December 2020. We perform a detailed phylogenetic comparison with 972 isolates from the surrounding community, complemented with epidemiological and contact tracing data, to determine transmission dynamics. We observe limited viral introductions into the university; the majority of student cases were linked to a single genetic...

10.1038/s41467-021-27942-w article EN cc-by Nature Communications 2022-02-08

Career intentions of medical students in the UK: a national, cross-sectional study (AIMS study)

Tomás Ferreira, Alexander M Collins, Oliver Y. Feng, Richard J. Samworth, Rita Horváth

Objective: To determine current UK medical students’ career intentions after graduation and on completing the Foundation Programme (FP), and to ascertain the motivations behind these intentions. Design: Cross-sectional, mixed-methods survey of UK medical students, using a non-random sampling method. Setting: All 44 UK medical schools recognised by the General Medical Council. Participants: All UK medical students were eligible to participate. The study sample consisted of 10 486 participants, approximately 25.50% of the medical student population. Outcome measures...

10.1136/bmjopen-2023-075598 article EN cc-by-nc BMJ Open 2023-08-01

Theoretical properties of the log-concave maximum likelihood estimator of a multidimensional density

Madeleine Cule, Richard J. Samworth

We present theoretical properties of the log-concave maximum likelihood estimator of a density based on an independent and identically distributed sample in $\mathbb{R}^{d}$. Our study covers both the case where the true underlying density is log-concave, and where this model is misspecified. We begin by showing that for a sequence of log-concave densities, convergence in distribution implies much stronger types of convergence; in particular, it implies convergence in Hellinger distance and even in certain exponentially weighted total variation norms. In our main result, we prove the existence and uniqueness...

10.1214/09-ejs505 article EN cc-by Electronic Journal of Statistics 2010-01-01

Approximation by log-concave distributions, with applications to regression

Lutz Dümbgen, Richard J. Samworth, Dominic Schuhmacher

We study the approximation of arbitrary distributions $P$ on $d$-dimensional space by distributions with log-concave density. Approximation means minimizing a Kullback–Leibler-type functional. We show that such an approximation exists if and only if $P$ has finite first moments and is not supported by some hyperplane. Furthermore we show that this approximation depends continuously on $P$ with respect to Mallows distance $D_1(\cdot,\cdot)$. This result implies consistency of the maximum likelihood estimator of a log-concave density under fairly general conditions. It also allows us to prove...

10.1214/10-aos853 article EN The Annals of Statistics 2011-03-09

Global rates of convergence in log-concave density estimation

Arlene K. H. Kim, Richard J. Samworth

The estimation of a log-concave density on $\mathbb{R}^{d}$ represents a central problem in the area of nonparametric inference under shape constraints. In this paper, we study the performance of log-concave density estimators with respect to global loss functions, and adopt a minimax approach. We first show that no statistical procedure based on a sample of size $n$ can estimate a log-concave density with respect to the squared Hellinger loss function with supremum risk smaller than order $n^{-4/5}$, when $d=1$, and order $n^{-2/(d+1)}$ when $d\geq2$. In particular, this reveals a sense in which, when $d\geq3$, log-concave density estimation is...

10.1214/16-aos1480 article EN other-oa The Annals of Statistics 2016-11-23

Generalized Additive and Index Models with Shape Constraints

Yining Chen, Richard J. Samworth

Summary. We study generalized additive models, with shape restrictions (e.g. monotonicity, convexity and concavity) imposed on each component of the additive prediction function. We show that this framework facilitates a non-parametric estimator of each additive component, obtained by maximizing the likelihood. The procedure is free of tuning parameters and under mild conditions is proved to be uniformly consistent on compact intervals. More generally, our methodology can be applied to generalized additive index models. Here again, the procedure can be justified on theoretical grounds...

10.1111/rssb.12137 article EN cc-by Journal of the Royal Statistical Society Series B (Statistical Methodology) 2015-10-26

Isotonic regression in general dimensions

Qiyang Han, Tengyao Wang, Sabyasachi Chatterjee, Richard J. Samworth

We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^{d}$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),1/d\}}$ in the empirical $L_{2}$ loss, up to polylogarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence...

10.1214/18-aos1753 article EN The Annals of Statistics 2019-08-03

Recent Progress in Log-Concave Density Estimation

Richard J. Samworth

In recent years, log-concave density estimation via maximum likelihood estimation has emerged as a fascinating alternative to traditional nonparametric smoothing techniques, such as kernel density estimation, which require the choice of one or more bandwidths. The purpose of this article is to describe some of the properties of the class of log-concave densities on $\mathbb{R}^{d}$ which make it so attractive from a statistical perspective, and to outline the latest methodological, theoretical and computational advances in the area.

10.1214/18-sts666 article EN Statistical Science 2018-11-01

A Unifying Tutorial on Approximate Message Passing

Oliver Y. Feng, Ramji Venkataramanan, Cynthia Rush, Richard J. Samworth

10.1561/2200000092 article EN Foundations and Trends® in Machine Learning 2022-01-01
