- Statistical Methods and Bayesian Inference
- Privacy-Preserving Technologies in Data
- Statistical Methods and Inference
- Advanced Causal Inference Techniques
- Firm Innovation and Growth
- Survey Methodology and Nonresponse
- Bayesian Methods and Mixture Models
- Manufacturing Process and Optimization
- Census and Population Estimation
- Advanced Statistical Process Monitoring
- Data Quality and Management
- Global Trade and Economics
- Data-Driven Disease Surveillance
- Bayesian Modeling and Causal Inference
- Scheduling and Optimization Algorithms
- Industrial Vision Systems and Defect Detection
- Cryptography and Data Security
- Healthcare Policy and Management
- Privacy, Security, and Data Protection
- Advanced Statistical Methods and Models
- Data Analysis with R
- Survey Sampling and Estimation Techniques
- Computational Physics and Python Applications
- Scientific Computing and Data Management
- Demographic Modeling and Climate Adaptation
Duke University
2015-2024
United States Census Bureau
2014-2023
Statistical and Applied Mathematical Sciences Institute
2018-2023
Office of the National Coordinator for Health Information Technology
2018
Emory University
2018
Social Science Research Council
2017-2018
National Bureau of Economic Research
2011-2016
University of Minnesota
2011-2016
Colorado State University
2016
University of South Carolina
2012
Multiple imputation is particularly well suited to deal with missing data in large epidemiologic studies, because typically these studies support a wide range of analyses by many users. Some of these analyses may involve complex modeling, including interactions and nonlinear relations. Identifying such relations and encoding them in imputation models, for example, in the conditional regressions for multiple imputation via chained equations, can be daunting tasks with large numbers of categorical and continuous variables. The authors present a nonparametric...
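As a toy illustration of the chained-equations setup this abstract refers to (a single linear conditional model on simulated data, not the authors' nonparametric approach), one imputation pass regresses the incomplete variable on the others and draws imputations from the fitted conditional model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two correlated continuous variables with ~30% of y missing at random.
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)
miss = rng.random(n) < 0.3
y_obs = y.copy()
y_obs[miss] = np.nan

def impute_once(x, y_obs, rng):
    """One chained-equations pass: fit y ~ x on complete cases,
    then draw imputations from the fitted conditional distribution."""
    obs = ~np.isnan(y_obs)
    X = np.column_stack([np.ones(obs.sum()), x[obs]])
    beta, res, *_ = np.linalg.lstsq(X, y_obs[obs], rcond=None)
    sigma = np.sqrt(res[0] / (obs.sum() - 2))
    y_imp = y_obs.copy()
    X_mis = np.column_stack([np.ones((~obs).sum()), x[~obs]])
    y_imp[~obs] = X_mis @ beta + rng.normal(scale=sigma, size=(~obs).sum())
    return y_imp

# m completed datasets, as in multiple imputation; pool the slope estimates.
m = 5
completed = [impute_once(x, y_obs, rng) for _ in range(m)]
estimates = [np.polyfit(x, yc, 1)[0] for yc in completed]
pooled_slope = float(np.mean(estimates))
```

In realistic surveys each variable with missingness gets its own conditional model, and those models must be cycled over repeatedly, which is exactly where hand-specifying regressions becomes daunting.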
In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use those scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the propensity scores for each record across the completed datasets, and performs matching with these averaged scores to estimate the treatment effect....
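The two pooling orders compared in this abstract can be sketched on simulated data. This sketch uses subclassification on score quintiles and a plug-in score (the assignment model's known form) rather than a fitted logistic regression and matching, purely to contrast "estimate within each dataset, then average" against "average the scores, then estimate once":

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 500, 5

# Simulated complete data: one confounder x, treatment probability expit(x),
# constant treatment effect 2.
expit = lambda v: 1.0 / (1.0 + np.exp(-v))
x_true = rng.normal(size=n)
t = (rng.random(n) < expit(x_true)).astype(int)
y = 2.0 * t + x_true + rng.normal(size=n)

# Pretend x had missing values: m "completed" versions of x.
completed_x = [x_true + rng.normal(scale=0.3, size=n) for _ in range(m)]

def subclass_effect(score, t, y, k=5):
    """Treatment effect via subclassification on score quintiles."""
    edges = np.quantile(score, np.linspace(0, 1, k + 1))
    strata = np.clip(np.searchsorted(edges, score, side="right") - 1, 0, k - 1)
    effs = []
    for s in range(k):
        in_s = strata == s
        if t[in_s].sum() > 0 and (1 - t[in_s]).sum() > 0:
            effs.append(y[in_s & (t == 1)].mean() - y[in_s & (t == 0)].mean())
    return float(np.mean(effs))

# Approach 1: estimate the effect within each completed dataset, then average.
scores = [expit(xc) for xc in completed_x]   # plug-in score, not a fitted model
est1 = float(np.mean([subclass_effect(s, t, y) for s in scores]))

# Approach 2: average each record's score across datasets, then estimate once.
avg_score = np.mean(np.column_stack(scores), axis=1)
est2 = subclass_effect(avg_score, t, y)
```

Both estimates should land near the true effect of 2 here; the paper's interest is in how the two orders behave more generally.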
Abstract. Multiple imputation was first conceived as a tool that statistical agencies could use to handle nonresponse in large-sample public surveys. In the last two decades, the multiple-imputation framework has been adapted for other contexts. For example, individual researchers use multiple imputation to handle missing data in small samples, statistical agencies disseminate multiply-imputed data sets for purposes of protecting confidentiality, and survey methodologists and epidemiologists use multiple imputation to correct for measurement errors. In some of these settings, Rubin's original...
Summary. The paper presents an illustration and empirical study of releasing multiply imputed, fully synthetic public use microdata. Simulations based on data from the US Current Population Survey are used to evaluate the potential validity of inferences for a variety of descriptive and analytic estimands, to assess the degree of protection of confidentiality that is afforded by fully synthetic data, and to illustrate the specification of imputation models. Benefits and limitations of releasing fully synthetic data sets are discussed.
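Inferences from fully synthetic datasets are obtained with their own combining rules. A minimal sketch of one standard version (point estimate q̄_m and variance T = (1 + 1/m) b_m − ū_m, where b_m is the between-dataset variance and ū_m the average within-dataset variance; the truncation of T at zero is a simplification):

```python
import numpy as np

def combine_fully_synthetic(q, u):
    """Combine point estimates q[l] and variance estimates u[l] from m
    fully synthetic datasets: q_bar = mean(q), T = (1 + 1/m) * b_m - u_bar.
    T can be negative in small samples; it is truncated at zero here."""
    q = np.asarray(q, dtype=float)
    u = np.asarray(u, dtype=float)
    m = len(q)
    q_bar = q.mean()
    b_m = q.var(ddof=1)          # between-dataset variance
    u_bar = u.mean()             # average within-dataset variance
    T = (1.0 + 1.0 / m) * b_m - u_bar
    return float(q_bar), float(max(T, 0.0))

q_bar, T = combine_fully_synthetic([10.1, 9.8, 10.4, 10.0, 9.9], [0.04] * 5)
```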
When releasing data to the public, statistical agencies and survey organizations typically alter data values in order to protect the confidentiality of respondents' identities and attribute values. To select among the wide variety of alteration methods, agencies require tools for evaluating the utility of proposed data releases. Such utility measures can be combined with disclosure risk measures to gauge the risk-utility tradeoffs of competing methods. This article presents utility measures focused on differences in inferences obtained from the altered data and corresponding inferences obtained from the original data....
"Data Quality and Record Linkage Techniques." Journal of the American Statistical Association, 103(482), p. 881
Abstract. Regularly occurring flood events have a history in Santiago de Chile, the capital city of Chile and the study area for this research. The analysis of flood events, the resulting damage, and its causes are crucial prerequisites for the development of risk prevention measures. The goal of this research is to empirically investigate vulnerability towards floods as one component of flood risk. The assessment is based on the application of a multi-scale (individual, household, and municipal level) set of indicators and the use of a broad range of data. A case-specific...
Objective: To focus on the relationship between pregnancy-related anxiety and spontaneous preterm birth. Psychosocial factors have been the subject of inquiries about the etiology of preterm birth; a factor of recent interest is maternal prenatal anxiety (worries and concerns related to the pregnancy). Methods: From 1991 to 1993, a total of 1820 women completed a study questionnaire during their first visit to prenatal clinics in Baltimore, Maryland. Pregnancy-related anxiety was assessed using six questions from the Prenatal Social Environment Inventory; scores...
Objectives: Depressive symptoms are common among women, especially those who are of childbearing age or pregnant. Prior studies have suggested that an increased burden of depressive symptoms is associated with diminished health and functional status, but these studies were conducted primarily among middle-aged and older adults. In the current study, we investigated the relationship between depressive symptoms and health and functional status among pregnant women. Methods: Women enrolled in the study at their first prenatal visit to hospital-based clinics and were administered an interview that contained the Center...
When releasing microdata to the public, data disseminators typically alter the original data to protect the confidentiality of database subjects' identities and sensitive attributes. However, such alteration negatively impacts the utility (quality) of the released data. In this paper, we present quantitative measures of data utility for masked microdata, with the aim of improving disseminators' evaluations of competing masking strategies. The measures, which are global in that they reflect similarities between the entire distributions of the original and released data,...
In most countries, national statistical institutes do not release business microdata, because doing so would pose too high a risk of breaching confidentiality. This risk can be avoided by resorting to synthetic data---data simulated from statistical models that reproduce the distribution of the true microdata. In this article, we describe an application of this strategy to the creation of such a database from the results of the annual economic census of American businesses....
In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian, joint modeling approach to multiple imputation based on Dirichlet process mixtures of multinomial distributions. The approach automatically captures complex dependencies while being computationally expedient. The prior...
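The generative side of such a mixture model can be sketched with a truncated stick-breaking mixture of product-multinomials. The parameters below are fixed at arbitrary illustrative values (in the actual method they would be sampled within an MCMC); given an observed entry, a record's missing entry is drawn by first sampling a latent class and then sampling from that class's multinomial:

```python
import numpy as np

rng = np.random.default_rng(2)

# Truncated stick-breaking weights for K mixture components.
K = 3
v = rng.beta(1.0, 2.0, size=K)
v[-1] = 1.0                                    # close the stick at truncation
sticks = np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
weights = v * sticks                           # sums to 1

# Two categorical variables (3 and 4 levels); class-specific probabilities.
phi1 = rng.dirichlet(np.ones(3), size=K)
phi2 = rng.dirichlet(np.ones(4), size=K)

def impute_record(x1, x2, rng):
    """Fill in whichever of (x1, x2) is None: sample a latent class from its
    posterior given the observed entry, then sample the missing entry from
    that class's multinomial."""
    post = weights.copy()
    if x1 is not None:
        post *= phi1[:, x1]
    if x2 is not None:
        post *= phi2[:, x2]
    post /= post.sum()
    k = rng.choice(K, p=post)
    if x1 is None:
        x1 = int(rng.choice(3, p=phi1[k]))
    if x2 is None:
        x2 = int(rng.choice(4, p=phi2[k]))
    return x1, x2

filled = [impute_record(x1, x2, rng) for x1, x2 in [(0, None), (None, 3), (2, None)]]
```

The appeal in high dimensions is that dependence among many categorical variables is induced through the shared latent class rather than through explicitly specified interactions.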
In causal studies without random assignment of treatment, causal effects can be estimated using matched treated and control samples, where matches are obtained using propensity scores. Propensity score matching can reduce bias in treatment effect estimators in cases where the matched samples have overlapping covariate distributions. Despite its application in many applied problems, there is no universally employed approach to interval estimation when using propensity score matching. In this article, we present and evaluate approaches
Reluctance of data owners to share their possibly confidential or proprietary data with others who own related databases is a serious impediment to conducting mutually beneficial data mining analysis. We address the case of vertically partitioned data -- multiple data owners/agencies each possess a few attributes of every data record. We focus on agencies wanting to conduct linear regression analysis on the complete records without disclosing the values of their own attributes. This paper describes an algorithm that enables such agencies to compute the exact regression coefficients...
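Why exact coefficients are possible at all can be seen from the normal equations: the full-data cross-products X'X and X'y decompose into blocks that involve each agency's columns. The sketch below forms those blocks directly (the paper's contribution is computing the cross-agency blocks *securely*, which is omitted here) and verifies that the blockwise solution matches an ordinary least-squares fit on the never-actually-shared full matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100

# Vertically partitioned data: agency A holds x_a, agency B holds x_b and y.
x_a = rng.normal(size=(n, 2))
x_b = rng.normal(size=(n, 1))
y = x_a @ np.array([1.0, -2.0]) + 0.5 * x_b[:, 0] + rng.normal(scale=0.1, size=n)

# Blockwise normal equations; off-diagonal blocks are what the secure
# protocol would compute without revealing raw attribute values.
XtX = np.block([[x_a.T @ x_a, x_a.T @ x_b],
                [x_b.T @ x_a, x_b.T @ x_b]])
Xty = np.concatenate([x_a.T @ y, x_b.T @ y])
beta_blocks = np.linalg.solve(XtX, Xty)

# Direct fit on the pooled matrix, for comparison only.
X = np.hstack([x_a, x_b])
beta_direct, *_ = np.linalg.lstsq(X, y, rcond=None)
```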
This article presents several methods for performing linear regression on the union of distributed databases that preserve, to varying degrees, the confidentiality of those databases. Such methods can be used by federal or state statistical agencies to share information from their individual databases, or to make such information available to others. Secure data integration, which provides the lowest level of protection, actually integrates the databases, but in a manner such that no database owner can determine the origin of any records other than its own. Regression,...
When statistical agencies release microdata to the public, malicious users (intruders) may be able to link records in the released data to records in external databases. Releasing data in ways that fail to prevent such identifications can discredit the agency or, for some data, constitute a breach of law. To limit disclosures, agencies often release altered versions of the data; however, there usually remain risks of identification. This article applies and extends the framework developed by Duncan and Lambert for computing probabilities of identification of sampled units....
Abstract. This article is aimed at practitioners who plan to use Bayesian inference on multiply-imputed datasets in settings where the posterior distributions of the parameters of interest are not approximately Gaussian. We seek to steer them away from a naive approach to inference, namely estimating the posterior distribution from each completed dataset and averaging functionals of these distributions. We demonstrate that this approach results in unreliable inferences. A better approach is to mix the posterior draws from each completed dataset, and use the mixed draws to summarize the posterior distribution. Using simulations,...
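The contrast between the naive and the recommended approach fits in a few lines. In this illustrative simulation (skewed lognormal "posteriors" standing in for non-Gaussian completed-data posteriors), averaging a per-dataset quantile differs from taking the quantile of the mixed draws, because only the mixture respects between-dataset variability:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n_draws = 5, 2000

# Pretend each of m completed datasets yields non-Gaussian posterior draws
# for a parameter (different lognormal posteriors per dataset).
per_dataset_draws = [rng.lognormal(mean=mu, sigma=0.5, size=n_draws)
                     for mu in [0.0, 0.2, 0.4, 0.6, 0.8]]

# Naive approach: compute a functional (the 97.5% quantile) per dataset,
# then average the functionals.
naive_upper = float(np.mean([np.quantile(d, 0.975) for d in per_dataset_draws]))

# Recommended approach: pool ("mix") all draws across datasets, then take
# the quantile of the mixture.
mixed = np.concatenate(per_dataset_draws)
mixed_upper = float(np.quantile(mixed, 0.975))
```

Here the mixture's upper quantile exceeds the averaged per-dataset quantiles, so the naive interval would be too short.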
To limit disclosures, statistical agencies and other data disseminators can release partially synthetic, public use microdata sets. These comprise the units originally surveyed, but some collected values, for example, sensitive values at high risk of disclosure or values of key identifiers, are replaced with multiple draws from statistical models. Because the original records remain on the file, there remain risks of identifications. In this paper, we describe how to evaluate identification risks in partially synthetic data, accounting for the released...
Several national statistical agencies are now releasing partially synthetic, public use microdata. These comprise the units in the original database with sensitive or identifying values replaced with values simulated from statistical models. Specifying synthesis models can be daunting for databases that include many variables of diverse types. The variables may be related in ways that are difficult to capture with standard parametric tools. In this article, we describe how random forests can be adapted to generate partially synthetic data for categorical variables....