Jerome P. Reiter

ORCID: 0000-0002-8374-3832
Research Areas
  • Statistical Methods and Bayesian Inference
  • Privacy-Preserving Technologies in Data
  • Statistical Methods and Inference
  • Advanced Causal Inference Techniques
  • Firm Innovation and Growth
  • Survey Methodology and Nonresponse
  • Bayesian Methods and Mixture Models
  • Manufacturing Process and Optimization
  • Census and Population Estimation
  • Advanced Statistical Process Monitoring
  • Data Quality and Management
  • Global Trade and Economics
  • Data-Driven Disease Surveillance
  • Bayesian Modeling and Causal Inference
  • Scheduling and Optimization Algorithms
  • Industrial Vision Systems and Defect Detection
  • Cryptography and Data Security
  • Healthcare Policy and Management
  • Privacy, Security, and Data Protection
  • Advanced Statistical Methods and Models
  • Data Analysis with R
  • Survey Sampling and Estimation Techniques
  • Computational Physics and Python Applications
  • Scientific Computing and Data Management
  • Demographic Modeling and Climate Adaptation

Duke University
2015-2024

United States Census Bureau
2014-2023

Statistical and Applied Mathematical Sciences Institute
2018-2023

Office of the National Coordinator for Health Information Technology
2018

Emory University
2018

Social Science Research Council
2017-2018

National Bureau of Economic Research
2011-2016

University of Minnesota
2011-2016

Colorado State University
2016

University of South Carolina
2012

Multiple imputation is particularly well suited to deal with missing data in large epidemiologic studies, because typically these studies support a wide range of analyses by many users. Some may involve complex modeling, including interactions and nonlinear relations. Identifying such relations and encoding them in imputation models, for example, in the conditional regressions for multiple imputation via chained equations, can be daunting tasks with large numbers of categorical and continuous variables. The authors present a nonparametric...

10.1093/aje/kwq260 article EN American Journal of Epidemiology 2010-09-14

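In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect....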

10.1177/0962280212445945 article EN Statistical Methods in Medical Research 2012-06-11
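
The two approaches compared in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that the treatment indicator and outcome are fully observed, that a propensity score has already been estimated for every record in each of the m completed datasets, and that matching is 1-nearest-neighbor with replacement targeting the effect on the treated. The function names `att_by_matching`, `within_estimate`, and `across_estimate` are hypothetical.

```python
import numpy as np

def att_by_matching(scores, treated, outcome):
    """Effect on the treated from 1-nearest-neighbor matching (with replacement)."""
    scores = np.asarray(scores, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)
    # Distance between every treated score and every control score.
    dist = np.abs(scores[t_idx][:, None] - scores[c_idx][None, :])
    # Closest control for each treated record.
    matches = c_idx[np.argmin(dist, axis=1)]
    return float(np.mean(outcome[t_idx] - outcome[matches]))

def within_estimate(score_sets, treated, outcome):
    """First approach: match inside each completed dataset, then average the m effects."""
    effects = [att_by_matching(s, treated, outcome) for s in score_sets]
    return float(np.mean(effects))

def across_estimate(score_sets, treated, outcome):
    """Second approach: average each record's m scores, then match once."""
    mean_scores = np.mean(np.asarray(score_sets, dtype=float), axis=0)
    return att_by_matching(mean_scores, treated, outcome)

# Toy inputs (made up for illustration): two completed datasets, four records.
treated = [1, 1, 0, 0]
outcome = [10.0, 4.0, 7.0, 3.0]
score_sets = [[0.9, 0.2, 0.85, 0.25],
              [0.9, 0.8, 0.85, 0.10]]
print(within_estimate(score_sets, treated, outcome))
print(across_estimate(score_sets, treated, outcome))
```

The two functions differ only in where the averaging happens: over effect estimates in the first, over scores in the second. With identical score sets they return the same value.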

Abstract. Multiple imputation was first conceived as a tool that statistical agencies could use to handle nonresponse in large-sample public use surveys. In the last two decades, the multiple-imputation framework has been adapted for other contexts. For example, individual researchers use multiple imputation to handle missing data in small samples, statistical agencies disseminate multiply-imputed data sets for purposes of protecting confidentiality, and survey methodologists and epidemiologists use multiple imputation to correct for measurement errors. In some of these settings, Rubin's original...

10.1198/016214507000000932 article EN Journal of the American Statistical Association 2007-12-01
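
For orientation, the standard combining rules for multiply-imputed missing data (commonly called Rubin's rules) can be sketched as below. This is a minimal sketch for scalar estimands in the missing-data setting only; the other settings mentioned in the abstract above, such as data released for confidentiality protection, generally call for different variance formulas that are not shown here. The function name `combine_rubin` is hypothetical.

```python
import math

def combine_rubin(estimates, variances):
    """Combine m point estimates and their variances from m completed datasets.

    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two estimates and matching variances")
    q_bar = sum(estimates) / m                                # pooled point estimate
    u_bar = sum(variances) / m                                # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)    # between-imputation variance
    t = u_bar + (1 + 1 / m) * b                               # total variance
    if b == 0:
        df = math.inf                                         # no between-imputation variability
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2   # t reference distribution
    return q_bar, t, df

# Made-up inputs for illustration.
q_bar, t, df = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, t, df)
```

Interval estimates then use the pooled estimate plus or minus a t quantile with `df` degrees of freedom times the square root of the total variance.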

Summary. The paper presents an illustration and empirical study of releasing multiply imputed, fully synthetic public use microdata. Simulations based on data from the US Current Population Survey are used to evaluate the potential validity of inferences for a variety of descriptive and analytic estimands, to assess the degree of protection of confidentiality that is afforded by fully synthetic data, and to illustrate the specification of imputation models. Benefits and limitations of fully synthetic data sets are discussed.

10.1111/j.1467-985x.2004.00343.x article EN Journal of the Royal Statistical Society Series A (Statistics in Society) 2004-12-15

When releasing data to the public, statistical agencies and survey organizations typically alter data values in order to protect the confidentiality of respondents' identities and attribute values. To select among the wide variety of data alteration methods, agencies require tools for evaluating the utility of proposed data releases. Such utility measures can be combined with disclosure risk measures to gauge risk-utility tradeoffs of competing methods. This article presents utility measures focused on differences in inferences obtained from the altered data and corresponding inferences obtained from the original data....

10.1198/000313006x124640 article EN The American Statistician 2006-07-18

"Data Quality and Record Linkage Techniques." Journal of the American Statistical Association, 103(482), p. 881

10.1198/jasa.2008.s229 article EN Journal of the American Statistical Association 2008-06-01

Abstract. Regularly occurring flood events do have a history in Santiago de Chile, the capital city of Chile and study area for this research. The analysis of flood events, the resulting damage and its causes are crucial prerequisites for the development of risk prevention measures. The goal of this research is to empirically investigate the vulnerability towards floods as one component of flood risk. The assessment is based on the application of a multi-scale (individual, household, municipal level) set of indicators and the use of a broad range of data. The case-specific...

10.5194/nhess-11-2107-2011 article EN cc-by Natural Hazards and Earth System Sciences 2011-08-04

Objective: To focus on the relationship between pregnancy-related anxiety and spontaneous preterm birth. Psychosocial factors have been the subject of inquiries about the etiology of preterm birth; a factor of recent interest is maternal prenatal anxiety (worries and concerns related to the pregnancy). Methods: From 1991 to 1993, a total of 1820 women completed the study questionnaire during their first visit to clinics in Baltimore, Maryland. Pregnancy-related anxiety was assessed using six questions from the Prenatal Social Environment Inventory; scores...

10.1097/psy.0b013e3180cac25d article EN Psychosomatic Medicine 2007-07-01

Objectives: Depressive symptoms are common among women, especially those who are of childbearing age or pregnant. Prior studies have suggested that an increased burden of depressive symptoms is associated with diminished health and functional status, but these studies were primarily of middle-aged or older adults. In the current study, we investigated the relationship between depressive symptoms and health and functional status in pregnant women. Methods: Women enrolled in the study at their first prenatal visit to hospital-based clinics were administered an interview that contained the Center...

10.1089/jwh.2006.0116 article EN Journal of Women's Health 2007-05-01

When releasing microdata to the public, data disseminators typically alter the original data to protect the confidentiality of database subjects' identities and sensitive attributes. However, such alteration negatively impacts the utility (quality) of the released data. In this paper, we present quantitative measures of data utility for masked microdata, with the aim of improving disseminators' evaluations of competing masking strategies. The measures, which are global in that they reflect similarities between the entire distributions of the original and released data,...

10.29012/jpc.v1i1.568 article EN cc-by-nc-nd Journal of Privacy and Confidentiality 2009-04-01

In most countries, national statistical institutes do not release microdata on businesses. Releasing them poses too high a risk of breaching confidentiality. This risk can be avoided by using synthetic data, that is, data simulated from statistical models that reproduce the distribution of the real microdata. In this article, we describe an application of this strategy to the creation of such a database from the results of the annual economic census of U.S. businesses....

10.1111/j.1751-5823.2011.00153.x article FR International Statistical Review 2011-11-21

In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian, joint modeling approach to multiple imputation based on Dirichlet process mixtures of multinomial distributions. The approach automatically models complex dependencies while being computationally expedient. The prior...

10.3102/1076998613480394 article EN Journal of Educational and Behavioral Statistics 2013-03-30
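
To make the model class named in the abstract above more concrete, here is a minimal generative sketch of a truncated Dirichlet process mixture of multinomials. It is not the article's imputation procedure, which works with the posterior distribution; it only draws categorical records from the prior form of such a mixture. It assumes that variables are independent within a mixture component and that the stick-breaking representation is truncated at `H` components. The function name `sample_dp_mixture` and the parameters `alpha`, `H`, and `seed` are hypothetical.

```python
import numpy as np

def sample_dp_mixture(n, levels, alpha=1.0, H=20, seed=0):
    """Draw n categorical records from a truncated DP mixture of multinomials.

    levels[j] is the number of categories of variable j.
    Returns (data, component labels, mixture weights).
    """
    rng = np.random.default_rng(seed)
    # Truncated stick-breaking construction of the mixture weights.
    v = rng.beta(1.0, alpha, size=H)
    v[-1] = 1.0  # last stick takes the remaining mass so weights sum to one
    weights = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    # One probability vector per component and variable, shape (H, levels[j]).
    probs = [rng.dirichlet(np.ones(d), size=H) for d in levels]
    # Assign each record to a component, then draw each variable independently.
    z = rng.choice(H, size=n, p=weights)
    data = np.empty((n, len(levels)), dtype=int)
    for j, d in enumerate(levels):
        for h in np.unique(z):
            rows = np.flatnonzero(z == h)
            data[rows, j] = rng.choice(d, size=rows.size, p=probs[j][h])
    return data, z, weights

data, z, weights = sample_dp_mixture(n=50, levels=[2, 3, 4])
print(data[:5])
```

Dependence between variables arises only through the shared component label: records in the same component share category probabilities, so mixing over components induces associations without specifying interaction terms by hand.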

In causal studies without random assignment of treatment, causal effects can be estimated using matched treated and control samples, where matches are obtained using estimated propensity scores. Propensity score matching can reduce bias in treatment effect estimators in cases where the matched samples have overlapping covariate distributions. Despite its application in many applied problems, there is no universally employed approach to interval estimation when using propensity score matching. In this article, we present and evaluate approaches

10.1002/sim.2277 article EN Statistics in Medicine 2005-10-11

Reluctance of data owners to share their possibly confidential or proprietary data with others who own related databases is a serious impediment to conducting a mutually beneficial data mining analysis. We address the case of vertically partitioned data, in which multiple data owners/agencies each possess a few attributes of every record. We focus on agencies wanting to conduct a linear regression analysis with complete records without disclosing the values of their own attributes. This paper describes an algorithm that enables such agencies to compute the exact regression coefficients...

10.1145/1014052.1014139 article EN 2004-08-22

This article presents several methods for performing linear regression on the union of distributed databases that preserve, to varying degrees, the confidentiality of those databases. Such methods can be used by federal or state statistical agencies to share information from their individual databases, or to make such information available to others. Secure data integration, which provides the lowest level of protection, actually integrates the databases, but in a manner that no database owner can determine the origin of any records other than its own. Regression,...

10.1198/106186005x47714 article EN Journal of Computational and Graphical Statistics 2005-05-21
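
One well-known reason that regression on a union of databases is possible without pooling records is that least-squares coefficients depend on the data only through sums of cross-products. The sketch below illustrates that fact; it is not taken from the article, whose truncated abstract does not spell out its protocols. It assumes that every owner holds the same columns for different records, and it adds the summaries in the clear, whereas a confidentiality-preserving version would replace the plain sums with a secure summation protocol. The function names `local_summaries` and `pooled_ols` are hypothetical.

```python
import numpy as np

def local_summaries(X, y):
    """Computed privately by each owner: cross-product matrices of its own records."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return X.T @ X, X.T @ y

def pooled_ols(summaries):
    """Combine the owners' summaries and solve the normal equations."""
    xtx = sum(s[0] for s in summaries)
    xty = sum(s[1] for s in summaries)
    return np.linalg.solve(xtx, xty)

# Simulated data (made up for illustration), split among three owners.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + 0.1 * rng.normal(size=30)
parts = [(X[i:i + 10], y[i:i + 10]) for i in (0, 10, 20)]

beta = pooled_ols([local_summaries(Xk, yk) for Xk, yk in parts])
print(beta)
```

The coefficients match those from a regression on the stacked data, even though no owner's individual records enter the final computation directly.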

When statistical agencies release microdata to the public, malicious users (intruders) may be able to link records in the released data to records in external databases. Releasing data in ways that fail to prevent such identifications may discredit the agency or, for some data, constitute a breach of law. To limit disclosures, agencies often release altered versions of the data; however, there usually remain risks of identification. This article applies and extends the framework developed by Duncan and Lambert for computing probabilities of identification for sampled units....

10.1198/016214505000000619 article EN Journal of the American Statistical Association 2005-11-06

Abstract. This article is aimed at practitioners who plan to use Bayesian inference on multiply-imputed datasets in settings where posterior distributions of the parameters of interest are not approximately Gaussian. We seek to steer practitioners away from a naive approach to Bayesian inference, namely estimating the posterior distribution in each completed dataset and averaging functionals of these distributions. We demonstrate that this approach results in unreliable inferences. A better approach is to mix draws from the posterior distributions from each completed dataset, and use the mixed draws to summarize the posterior distribution. Using simulations,...

10.1198/tast.2010.09109 article EN The American Statistician 2010-05-01
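
The contrast drawn in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's code: the posterior draws for each completed dataset are taken as given, each dataset contributes the same number of draws so that mixing weights the datasets equally, and quantiles serve as the example functional. The function names `pooled_summary` and `naive_summary` are hypothetical.

```python
import numpy as np

def pooled_summary(draws_per_dataset, probs=(0.025, 0.5, 0.975)):
    """Mix the posterior draws from all completed datasets, then summarize once."""
    mixed = np.concatenate([np.asarray(d, dtype=float) for d in draws_per_dataset])
    return np.quantile(mixed, probs)

def naive_summary(draws_per_dataset, probs=(0.025, 0.5, 0.975)):
    """Summarize each dataset's posterior separately, then average the summaries."""
    per_dataset = [np.quantile(np.asarray(d, dtype=float), probs)
                   for d in draws_per_dataset]
    return np.mean(per_dataset, axis=0)

# Made-up draws: two completed datasets whose posteriors sit in different places.
draws = [np.linspace(0.0, 1.0, 101), np.linspace(10.0, 11.0, 101)]
print(pooled_summary(draws))
print(naive_summary(draws))
```

In this toy case the averaged interval is far narrower than the interval from the mixed draws, because averaging endpoints ignores the variability between completed datasets.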

To limit disclosures, statistical agencies and other data disseminators can release partially synthetic, public use microdata sets. These comprise the units originally surveyed; but some collected values, for example, sensitive values at high risk of disclosure or values of key identifiers, are replaced with multiple draws from statistical models. Because the original records are on the file, there remain risks of identifications. In this paper, we describe how to evaluate identification risks in partially synthetic data, accounting for released...

10.29012/jpc.v1i1.567 article EN cc-by-nc-nd Journal of Privacy and Confidentiality 2009-04-01

Several national statistical agencies are now releasing partially synthetic, public use microdata. These comprise the units in the original database with sensitive or identifying values replaced with values simulated from statistical models. Specifying synthesis models can be daunting in databases that include many variables of diverse types. These variables may be related in ways that are difficult to capture with standard parametric tools. In this article, we describe how random forests can be adapted to generate partially synthetic data for categorical variables....

10.5555/1747335.1747337 article EN Transactions on data privacy 2010-04-01