- Statistical Methods and Bayesian Inference
- Statistical Methods and Inference
- Bayesian Methods and Mixture Models
- Biochemical Acid Research Studies
- Privacy-Preserving Technologies in Data
- Microbial Metabolic Engineering and Bioproduction
- Advanced Causal Inference Techniques
- Data Quality and Management
- Bayesian Modeling and Causal Inference
- Statistical Methods in Clinical Trials
- Optimal Experimental Design Methods
- Complex Network Analysis Techniques
- Metabolism and Genetic Disorders
- Advanced Clustering Algorithms Research
- Census and Population Estimation
- Advanced Statistical Methods and Models
- Enzyme Catalysis and Immobilization
- Marine and coastal ecosystems
- Quantum chaos and dynamical systems
- Chemical Synthesis and Reactions
- Survey Methodology and Nonresponse
- Advanced Multi-Objective Optimization Algorithms
- Mitochondrial Function and Pathology
- ATP Synthase and ATPases Research
- Machine Learning and Data Classification
University College London
1996-2024
Council for Scientific and Industrial Research
2024
Azotic Technologies (United Kingdom)
2024
Cardiff University
2022-2023
The Alan Turing Institute
2023
Lancaster University
2019-2020
University of Southampton
2009-2017
Croda (United Kingdom)
2012
Duke University
2006
Biochemical Society
1996-1999
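In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect....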
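The two approaches described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses sub-classification on the scores rather than matching, a hand-written Newton logistic regression, simulated data, and a deliberately crude imputation step (draws from the observed marginal distribution) standing in for a real imputation model. All function and variable names are hypothetical.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25, ridge=1e-6):
    """Estimate propensity scores with Newton-Raphson logistic regression."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        grad = Xd.T @ (t - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def subclass_effect(score, t, y, n_strata=5):
    """Sub-classify on score quantiles; size-weighted mean of stratum differences."""
    edges = np.quantile(score, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, score, side="right") - 1, 0, n_strata - 1)
    total, weight = 0.0, 0
    for s in range(n_strata):
        idx = strata == s
        treated, control = idx & (t == 1), idx & (t == 0)
        if treated.any() and control.any():
            total += idx.sum() * (y[treated].mean() - y[control].mean())
            weight += idx.sum()
    return total / weight

def within_estimate(completed, t, y):
    """First approach: estimate the effect in each completed dataset, then average."""
    return np.mean([subclass_effect(fit_logistic(X, t), t, y) for X in completed])

def across_estimate(completed, t, y):
    """Second approach: average each record's scores across datasets, then estimate once."""
    avg_score = np.mean([fit_logistic(X, t) for X in completed], axis=0)
    return subclass_effect(avg_score, t, y)

# Simulated data (assumption): two confounders, true treatment effect of 2.
rng = np.random.default_rng(0)
n, m, true_effect = 2000, 5, 2.0
X = rng.normal(size=(n, 2))
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] + 0.5 * X[:, 1]))))
y = true_effect * t + X[:, 0] + X[:, 1] + rng.normal(size=n)

# Crude stand-in for multiple imputation (assumption): 30% of the second
# covariate is missing and is redrawn from its observed marginal distribution.
miss = rng.random(n) < 0.3
obs = X[~miss, 1]
completed = []
for _ in range(m):
    Xc = X.copy()
    Xc[miss, 1] = rng.normal(obs.mean(), obs.std(), miss.sum())
    completed.append(Xc)

within = within_estimate(completed, t, y)
across = across_estimate(completed, t, y)
print(f"within: {within:.3f}  across: {across:.3f}")
```

The only difference between the two functions is where the averaging happens: over m effect estimates in the first, over m scores per record in the second.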
In logistic regression, separation occurs when a linear combination of the predictors can perfectly classify part or all of the observations in the sample, and as a result, finite maximum likelihood estimates of the regression coefficients do not exist. Gelman et al. (2008) recommended independent Cauchy distributions as default priors for the regression coefficients, even in the case of separation, and reported posterior modes in their analyses. As the mean does not exist for the Cauchy prior, a natural question is whether the posterior means of the regression coefficients exist under separation. We prove theorems that provide...
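To make the separation problem concrete, the sketch below evaluates the logistic log-likelihood on a small, completely separated toy sample. The data and the no-intercept model are assumptions chosen for illustration; they are not taken from the paper. Because the log-likelihood keeps increasing as the slope grows, no finite maximiser exists.

```python
import numpy as np

# Toy, completely separated sample (hypothetical): y = 1 exactly when x > 0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def log_likelihood(beta):
    """Logistic log-likelihood for a no-intercept model with slope beta."""
    signed = (2 * y - 1) * x * beta            # positive for correctly classified points
    return -np.logaddexp(0.0, -signed).sum()   # sum of log(sigmoid(signed))

betas = [1.0, 10.0, 100.0]
values = [log_likelihood(b) for b in betas]
for b, v in zip(betas, values):
    print(f"beta = {b:6.1f}  log-likelihood = {v:.3e}")
```

The printed values rise towards zero without ever reaching it, which is the behaviour that motivates placing a proper prior on the coefficients.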
To limit disclosures, statistical agencies and other data disseminators can release partially synthetic, public use microdata sets. These comprise the units originally surveyed; but some collected values, for example, sensitive values at high risk of disclosure or values of key identifiers, are replaced with multiple draws from statistical models. Because the original records are on the file, there remain risks of identifications. In this paper, we describe how to evaluate risks of identification disclosure in partially synthetic data, accounting for released...
A computer-oriented algorithm for evaluating the moments of the impulse response is proposed. This is particularly useful for simplifying the dynamics of large-order systems using the technique of moment matching.
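The abstract does not say how its algorithm computes the moments, so the following is only a sketch under assumptions. It takes a stable state-space system with impulse response h(t) = c exp(At) b and uses the standard identity M_k = ∫ t^k h(t) dt = k! · c (−A)^(−(k+1)) b, obtained by repeated linear solves. The function name and the state-space representation are hypothetical choices, not the paper's.

```python
import math
import numpy as np

def impulse_response_moments(A, b, c, n_moments):
    """Time moments M_k of h(t) = c exp(At) b for a stable matrix A.

    Uses M_k = k! * c (-A)^{-(k+1)} b, applying one linear solve per moment.
    """
    A = np.asarray(A, dtype=float)
    v = np.asarray(b, dtype=float).copy()
    c = np.asarray(c, dtype=float)
    moments = []
    for k in range(n_moments):
        v = np.linalg.solve(-A, v)                 # v now equals (-A)^{-(k+1)} b
        moments.append(math.factorial(k) * float(c @ v))
    return moments

# First-order check: h(t) = exp(-t), whose k-th moment is k!.
first_order = impulse_response_moments([[-1.0]], [1.0], [1.0], 4)

# Second-order check: h(t) = exp(-t) + exp(-2t).
second_order = impulse_response_moments(
    [[-1.0, 0.0], [0.0, -2.0]], [1.0, 1.0], [1.0, 1.0], 3
)
print(first_order, second_order)
```

In moment matching, a reduced-order model would then be chosen so that its leading moments agree with those of the full system.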
There is increasing appetite for analysing populations of network data due to the fast-growing body of applications demanding such methods. While methods exist to provide readily interpretable summaries of heterogeneous network populations, these are often descriptive or ad hoc, lacking any formal justification. In contrast, principled analysis methods often provide results that are difficult to relate back to the applied problem of interest. Motivated by two complementary applied examples, we develop a Bayesian framework to appropriately model complex heterogeneous network populations, while...
In many observational studies, analysts estimate causal effects using propensity scores, e.g. by matching, sub-classifying, or inverse probability weighting based on the scores. Estimation of propensity scores is complicated when some values of the covariates are missing. Analysts can use multiple imputation to create completed data sets from which propensity scores can be estimated. We propose a general location mixture model for imputations that assumes that the control units are a latent mixture of (i) units whose covariates are drawn from the same distributions as the treated...
Use of administrative data for research and for planning services has increased over recent decades due to the value of the large, rich information available. However, concerns about the release of sensitive or personal data and the associated disclosure risk can lead to lengthy approval processes and restricted data access. This can delay or prevent the production of timely evidence. A promising solution to facilitate more efficient data access is to create synthetic versions of the original datasets which are less likely to hold confidential information and minimise disclosure risk. Such data may...
Stochastic search variable selection (SSVS) algorithms provide an appealing and widely used approach for searching for good subsets of predictors while simultaneously estimating posterior model probabilities and model-averaged predictive distributions. This article proposes a two-level generalization of SSVS to account for missing predictors while accommodating uncertainty in the relationships between these predictors. Bayesian approaches for allowing predictors that are missing at random require a model on the joint distribution of the predictors. We show that performance can be...
Over the past three decades, synthetic data methods for statistical disclosure control have continually evolved, but mainly within the domain of survey data sets. There are certain characteristics of administrative databases, such as their size, which present challenges from a synthesis perspective and require special attention. This paper, through the fitting of saturated count models, presents a synthesis method that is suitable for administrative databases. It is tuned by two parameters, σ and α. The method allows large categorical data sets to be...
In typical implementations of multiple imputation for missing data, analysts create m completed datasets based on approximately independent draws of imputation model parameters. We use theoretical arguments and simulations to show that, provided m is large, the use of independent draws is not necessary. In fact, appropriate use of dependent draws can improve precision relative to the use of independent draws. It also eliminates the sometimes difficult task of obtaining independent draws; for example, in fully Bayesian imputation models based on MCMC, analysts can avoid the search for a subsampling interval that ensures approximately independent draws for all parameters. We illustrate with...
The paper develops a class of priors that leads to equivalent posterior inference for odds ratio parameters based on prospective and retrospective models for categorical response data. The results are applicable to both unmatched and matched case-control studies. The results hold for general link functions for the response. The proposed method can accommodate multiple and possibly ordered disease states. The method is applied to the analysis of discrete subtypes in an ongoing study of colorectal cancer. A simulation study illustrates the need for carefully considering prior...
Missing responses occur in many industrial or medical experiments, for example in clinical trials where slow acting treatments are assessed. Finding efficient designs for such experiments can be problematic since it is not known at the design stage which observations will be missing. The design literature mainly focuses on assessing robustness of designs for missing data scenarios, rather than finding optimal designs in this situation. Imhof, Song and Wong (2002) propose a framework for design search, based on the expected information matrix. We...