- Statistical Methods and Inference
- Bayesian Methods and Mixture Models
- Advanced Statistical Methods and Models
- Statistical Methods and Bayesian Inference
- Sparse and Compressive Sensing Techniques
- Machine Learning and Algorithms
- Markov Chains and Monte Carlo Methods
- Blind Source Separation Techniques
- Random Matrices and Applications
- Neural Networks and Applications
- Face and Expression Recognition
- Advanced Statistical Process Monitoring
- Bayesian Modeling and Causal Inference
- Statistical Methods in Clinical Trials
- Advanced Causal Inference Techniques
- Control Systems and Identification
- SARS-CoV-2 and COVID-19 Research
- Gaussian Processes and Bayesian Inference
- Gene expression and cancer classification
- Domain Adaptation and Few-Shot Learning
- Complex Systems and Time Series Analysis
- Statistical and numerical algorithms
- Machine Learning and Data Classification
- Multiple Myeloma Research and Treatments
- Sensory Analysis and Statistical Methods
University of Cambridge
2015-2024
University of Sheffield
2024
University of Edinburgh
2020
University of Chicago
2019
University of Wisconsin–Madison
2018
University of Michigan
2018
Columbia University
2018
Sungshin Women's University
2018
Statistical Service
2018
California Institute of Technology
2016
Journal Article: A useful variant of the Davis–Kahan theorem for statisticians. Y. Yu, T. Wang and R. J. Samworth, Statistical Laboratory, University of Cambridge, Wilberforce Road, Cambridge CB3 0WB, U.K. (y.yu@statslab.cam.ac.uk, t.wang@statslab.cam.ac.uk, r.samworth@statslab.cam.ac.uk). Biometrika, Volume 102, Issue 2, June 2015, Pages 315–323, https://doi.org/10.1093/biomet/asv008. Published: 28 April 2014...
We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends only on the dimension d of the feature vectors, and not on the underlying populations. The improvement is greatest when d=4, but thereafter decreases as $d\rightarrow\infty$. The popular bagged nearest neighbour classifier can also be regarded...
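The weighted nearest-neighbour rule described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names `weighted_nn_classify` and `uniform_weights`, the {0, 1} label coding, and the tie-breaking convention (a score of exactly 0.5 goes to class 1) are all assumptions made here. The weight vector is supplied by the caller; the asymptotically optimal weights from the paper are not reproduced.

```python
import numpy as np

def weighted_nn_classify(X_train, y_train, x, weights):
    """Classify x with a weighted nearest-neighbour rule for labels in {0, 1}.

    weights[i] is the weight given to the (i+1)-th nearest neighbour of x.
    The weights are assumed to be nonnegative and to sum to 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    weights = np.asarray(weights, dtype=float)
    # Euclidean distance from x to every training point.
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    # Indices of the nearest neighbours, closest first.
    order = np.argsort(dists, kind="stable")[: len(weights)]
    # Total weight carried by the neighbours labelled 1.
    score = np.sum(weights * (y_train[order] == 1))
    return int(score >= 0.5)

def uniform_weights(k):
    """Weights that recover the unweighted k-nearest neighbour classifier."""
    return np.full(k, 1.0 / k)
```

With `uniform_weights(k)` the rule reduces to ordinary majority voting among the k nearest neighbours, so the weighted and unweighted classifiers compared in the abstract differ only in the weight vector passed in.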
Stability Selection was recently introduced by Meinshausen and Bühlmann (2010) as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data. We introduce a variant, called Complementary Pairs Stability Selection (CPSS), and derive bounds both on the expected number of variables included by CPSS that have low selection probability under the original procedure, and on the expected number of high selection probability variables that are excluded. These results require no (e.g. exchangeability) assumptions on the underlying...
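The subsampling scheme behind complementary pairs can be sketched as below. This is an illustrative sketch under stated assumptions: each pair consists of two disjoint random halves of the data, the base selection procedure is applied to both, and variables are kept when their selection frequency reaches a threshold. The base selector `top_q_by_correlation`, the function name `cpss`, and the default values of `n_pairs` and `threshold` are hypothetical choices made for the example, not taken from the paper.

```python
import numpy as np

def top_q_by_correlation(X, y, q):
    """Hypothetical base selector: the q variables most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, 1.0)
    return np.argsort(-corr)[:q]

def cpss(X, y, selector, n_pairs=50, threshold=0.6, seed=0):
    """Selection frequencies over complementary pairs of half-samples.

    selector(X_sub, y_sub) must return an array of selected column indices.
    Returns (indices with frequency >= threshold, all frequencies).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2
    counts = np.zeros(p)
    for _ in range(n_pairs):
        perm = rng.permutation(n)
        # The two halves of one permutation are disjoint by construction.
        for idx in (perm[:half], perm[half : 2 * half]):
            counts[selector(X[idx], y[idx])] += 1
    freq = counts / (2 * n_pairs)
    return np.flatnonzero(freq >= threshold), freq
```

A call such as `cpss(X, y, lambda A, b: top_q_by_correlation(A, b, 2))` wraps any base procedure; the error bounds mentioned in the abstract concern how the threshold and the base procedure's selection probabilities interact, which this sketch does not compute.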
Summary. Change points are a very common feature of ‘big data’ that arrive in the form of a data stream. We study high dimensional time series in which, at certain time points, the mean structure changes in a sparse subset of the co-ordinates. The challenge is to borrow strength across the co-ordinates to detect smaller changes than could be observed in any individual component series. We propose a two-stage procedure called inspect for estimation of the change points: first, we argue that a good projection direction can be obtained as the leading left singular...
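A rough sketch of the two-stage idea for a single change point is given below. Because the abstract is truncated, several details are assumptions made here rather than the paper's method: the projection direction is taken as the leading left singular vector of an entrywise soft-thresholded CUSUM matrix, the threshold `lam` is a user-supplied tuning parameter, and the change point is located by maximising the absolute projected CUSUM statistic. All function names are hypothetical.

```python
import numpy as np

def cusum_transform(X):
    """CUSUM transformation of a p x n data matrix; returns p x (n - 1)."""
    p, n = X.shape
    t = np.arange(1, n)
    left = np.cumsum(X, axis=1)[:, :-1]        # sums of the first t columns
    right = X.sum(axis=1, keepdims=True) - left  # sums of the last n - t columns
    scale = np.sqrt(t * (n - t) / n)
    return scale * (right / (n - t) - left / t)

def soft_threshold(M, lam):
    """Entrywise soft-thresholding, used here to encourage sparsity."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)

def single_change_point(X, lam):
    """Estimate one change point in the mean of the columns of X.

    Returns (t_hat, v): t_hat is the number of observations before the
    estimated change, and v is the projection direction that was used.
    """
    T = cusum_transform(X)
    M = soft_threshold(T, lam)
    if not M.any():
        M = T  # fall back to the raw CUSUM matrix if everything was zeroed
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    v = U[:, 0]
    projected = v @ T  # univariate CUSUM series along the direction v
    return int(np.argmax(np.abs(projected))) + 1, v
```

Projecting before locating the change is what lets weak shifts in several co-ordinates reinforce one another; a multiple change point version would need an additional recursive or binary segmentation step, which is omitted here.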
The kth-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of k; and by the absence of techniques for empirical choice of k. In the present paper we detail the way in which the value of k determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are “assigned” to...
Summary. Let $X_1,\ldots,X_n$ be independent and identically distributed random vectors with a (Lebesgue) density $f$. We first prove that, with probability 1, there is a unique log-concave maximum likelihood estimator $\hat{f}_n$ of $f$. The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we can reformulate the issue of computing $\hat{f}_n$ in terms of a non-differentiable convex optimization problem, and thus combine techniques...
In the North Atlantic Ocean, the flow of North Atlantic Deep Water (NADW), and of its ancient counterpart Northern Component Water (NCW), across the Greenland‐Scotland Ridge (GSR) is thought to have played an important role in ocean circulation. Over the last 60 Ma, the Iceland Plume has dynamically supported an area which encompasses the GSR. Consequently, bathymetry of the GSR has varied with time due to a combination of lithospheric plate cooling and fluctuations in the temperature and buoyancy within the underlying convecting mantle. Here, we reassess the importance...
In recent years, sparse principal component analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or sub-Gaussian classes. In this paper, we show that,...
Summary. We introduce a very general method for high dimensional classification, based on careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower dimensional space. In one special case that we study in detail, the random projections are divided into disjoint groups, and within each group we select the projection yielding the smallest estimate of the test error. Our random-projection ensemble classifier then aggregates the results of applying the base classifier on the selected projections, with a data-driven voting threshold to determine the final...
Many statistical procedures, including goodness-of-fit tests and methods for independent component analysis, rely critically on the estimation of the entropy of a distribution. In this paper, we seek entropy estimators that are efficient and achieve the local asymptotic minimax lower bound with respect to squared error loss. To this end, we study weighted averages of the estimators originally proposed by Kozachenko and Leonenko [Probl. Inform. Transm. 23 (1987), 95–101], based on the $k$-nearest neighbour distances of a sample of $n$ independent and identically distributed...
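For orientation, the unweighted Kozachenko–Leonenko estimator that the abstract builds on can be sketched as below; the weighted averages studied in the paper are not implemented. The sketch assumes Euclidean distances, a continuous distribution (so no two sample points coincide), and the standard form of the estimator, the sample mean of $\log\{\rho_{(k),i}^d V_d (n-1)\} - \psi(k)$, where $\rho_{(k),i}$ is the distance from the $i$th point to its $k$th nearest neighbour, $V_d$ is the volume of the unit ball in $\mathbb{R}^d$ and $\psi$ is the digamma function. The names `kl_entropy` and `digamma_int` are hypothetical.

```python
import math
import numpy as np

def digamma_int(k):
    """Digamma function at a positive integer k: -gamma + sum_{j<k} 1/j."""
    return -np.euler_gamma + sum(1.0 / j for j in range(1, k))

def kl_entropy(X, k=1):
    """Unweighted Kozachenko-Leonenko entropy estimate (in nats).

    X has shape (n, d), or (n,) for univariate data. Builds the full
    n x n distance matrix, so it is only suitable for moderate n.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the k-th nearest neighbour distance.
    rho = np.sort(dist, axis=1)[:, k]
    # Log-volume of the unit ball in d dimensions.
    log_vd = (d / 2) * math.log(math.pi) - math.lgamma(1 + d / 2)
    return float(np.mean(d * np.log(rho) + log_vd + math.log(n - 1) - digamma_int(k)))
```

For a standard normal sample in one dimension the true entropy is $\tfrac{1}{2}\log(2\pi e)$, which gives a simple sanity check for the estimator.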
Purpose: Current diagnostic tests for diffuse large B-cell lymphoma use the updated WHO criteria based on biologic, morphologic, and clinical heterogeneity. We propose a refined classification system based on subset-specific B-cell–associated gene signatures (BAGS) in the normal B-cell hierarchy, hypothesizing that it can provide new biologic insight and prognostic value. Patients and Methods: We combined fluorescence-activated cell sorting, gene expression profiling, and statistical modeling to generate BAGS for naive, centrocyte,...
We consider the variable selection problem, which seeks to identify important variables influencing a response $Y$ out of many candidate features $X_{1},\ldots ,X_{p}$. We wish to do so while offering finite-sample guarantees about the fraction of false positives, that is, selected variables $X_{j}$ that in fact have no effect on $Y$ after the other features are known. When the number of features $p$ is large (perhaps even larger than the sample size $n$), and we have no prior knowledge regarding the type of dependence between $Y$ and $X$, the model-X knockoffs framework nonetheless...
Summary. We propose a test of independence of two multivariate random vectors, given a sample from the underlying population. Our approach is based on the estimation of mutual information, whose decomposition into joint and marginal entropies facilitates the use of recently developed efficient entropy estimators derived from nearest neighbour distances. The proposed critical values may be obtained by simulation in the case where an approximation to one marginal is available or by permuting the data otherwise. This facilitates size guarantees, and we...
Summary. We propose a general new method, the conditional permutation test, for testing the conditional independence of variables X and Y given a potentially high dimensional random vector Z that may contain confounding factors. The test permutes the entries of X non-uniformly, to respect the existing dependence between X and Z and thus to account for the presence of these confounders. Like the conditional randomization test of Candès and co-workers in 2018, our test relies on the availability of an approximation to the distribution of X|Z; whereas their test uses this estimate to draw new X-values, we use...
The BNT162b2 mRNA COVID-19 vaccine (Pfizer-BioNTech) is being utilised internationally for mass vaccination. Evidence of single-dose protection against symptomatic disease has encouraged some countries to opt for delayed booster doses of BNT162b2, but the effect of this strategy on rates of asymptomatic SARS-CoV-2 infection remains unknown. We previously demonstrated frequent pauci- and asymptomatic SARS-CoV-2 infection amongst healthcare workers (HCWs) during the UK’s first wave of the COVID-19 pandemic, using a comprehensive PCR-based HCW screening...
Understanding SARS-CoV-2 transmission in higher education settings is important to limit spread between students, and into at-risk populations. In this study, we sequenced 482 SARS-CoV-2 isolates from the University of Cambridge from 5 October to 6 December 2020. We perform a detailed phylogenetic comparison with 972 isolates from the surrounding community, complemented with epidemiological and contact tracing data, to determine transmission dynamics. We observe limited viral introductions into the university; the majority of student cases were linked to a single genetic...
Objective: To determine current UK medical students’ career intentions after graduation and on completing the Foundation Programme (FP), and to ascertain the motivations behind these intentions. Design: Cross-sectional, mixed-methods survey of UK medical students, using a non-random sampling method. Setting: All 44 UK medical schools recognised by the General Medical Council. Participants: All UK medical students were eligible to participate. The study sample consisted of 10 486 participants, approximately 25.50% of the medical student population. Outcome measures...
We present theoretical properties of the log-concave maximum likelihood estimator of a density based on an independent and identically distributed sample in $\mathbb{R}^{d}$. Our study covers both the case where the true underlying density is log-concave, and where this model is misspecified. We begin by showing that for a sequence of log-concave densities, convergence in distribution implies much stronger types of convergence; in particular, it implies convergence in Hellinger distance and even in certain exponentially weighted total variation norms. In our main result, we prove the existence and uniqueness...
We study the approximation of arbitrary distributions $P$ on $d$-dimensional space by distributions with log-concave density. Approximation means minimizing a Kullback–Leibler-type functional. We show that such an approximation exists if and only if $P$ has finite first moments and is not supported by some hyperplane. Furthermore we show that this approximation depends continuously on $P$ with respect to Mallows distance $D_1(\cdot,\cdot)$. This result implies consistency of the maximum likelihood estimator of a log-concave density under fairly general conditions. It also allows us to prove...
The estimation of a log-concave density on $\mathbb{R}^{d}$ represents a central problem in the area of nonparametric inference under shape constraints. In this paper, we study the performance of log-concave density estimators with respect to global loss functions, and adopt a minimax approach. We first show that no statistical procedure based on a sample of size $n$ can estimate a log-concave density with respect to the squared Hellinger loss function with supremum risk smaller than order $n^{-4/5}$, when $d=1$, and order $n^{-2/(d+1)}$ when $d\geq2$. In particular, this reveals a sense in which, when $d\geq3$, log-concave density estimation is...
Summary. We study generalized additive models, with shape restrictions (e.g. monotonicity, convexity and concavity) imposed on each component of the additive prediction function. We show that this framework facilitates a non-parametric estimator of each additive component, obtained by maximizing the likelihood. The procedure is free of tuning parameters and under mild conditions is proved to be uniformly consistent on compact intervals. More generally, our methodology can be applied to generalized additive index models. Here again, the procedure can be justified on theoretical grounds...
We study the least squares regression function estimator over the class of real-valued functions on $[0,1]^{d}$ that are increasing in each coordinate. For uniformly bounded signals and with a fixed, cubic lattice design, we establish that the estimator achieves the minimax rate of order $n^{-\min\{2/(d+2),1/d\}}$ in the empirical $L_{2}$ loss, up to polylogarithmic factors. Further, we prove a sharp oracle inequality, which reveals in particular that when the true regression function is piecewise constant on $k$ hyperrectangles, the least squares estimator enjoys a faster, adaptive rate of convergence...
In recent years, log-concave density estimation via maximum likelihood estimation has emerged as a fascinating alternative to traditional nonparametric smoothing techniques, such as kernel density estimation, which require the choice of one or more bandwidths. The purpose of this article is to describe some of the properties of the class of log-concave densities on $\mathbb{R}^{d}$ which make it so attractive from a statistical perspective, and to outline the latest methodological, theoretical and computational advances in the area.