Sarah C. Lotspeich

ORCID: 0000-0001-5380-2427
Research Areas
  • Statistical Methods and Inference
  • Statistical Methods and Bayesian Inference
  • Advanced Causal Inference Techniques
  • HIV/AIDS Research and Interventions
  • Survey Methodology and Nonresponse
  • HIV Research and Treatment
  • Health Systems, Economic Evaluations, Quality of Life
  • HIV-related health complications and treatments
  • HIV/AIDS drug development and treatment
  • Genetic Neurodegenerative Diseases
  • Data-Driven Disease Surveillance
  • Data Quality and Management
  • Fibromyalgia and Chronic Fatigue Syndrome Research
  • Child Abuse and Trauma
  • Customer churn and segmentation
  • Pregnancy and Medication Impact
  • Child Welfare and Adoption
  • Healthcare Systems and Reforms
  • Food Security and Health in Diverse Populations
  • Emergency and Acute Care Studies
  • Health, Environment, Cognitive Aging
  • Cardiovascular Health and Risk Factors
  • Survey Sampling and Estimation Techniques
  • Machine Learning in Healthcare
  • Consumer Attitudes and Food Labeling

Wake Forest University
2023-2025

University of North Carolina at Chapel Hill
2021-2025

Walter and Eliza Hall Institute of Medical Research
2023-2024

Vanderbilt University Medical Center
2020-2023

Kaiser Permanente Washington Health Research Institute
2021

University of Pennsylvania
2021

Vanderbilt University
2019-2020

Modeling symptom progression to identify ideal subjects for a Huntington's disease clinical trial is problematic since time to diagnosis, a key covariate, can be heavily censored. Imputation is an appealing strategy that replaces the censored covariate with its conditional mean, but existing methods saw over 200% bias under heavy censoring. Calculating these conditional means well requires estimating and then integrating the survival function of the censored covariate from the censored value to infinity. To estimate the survival function flexibly, we use a semiparametric Cox model...
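For orientation, the conditional mean being imputed has a standard form (textbook algebra, not quoted from the paper): for a covariate $X$ right-censored at the value $c$, with survival function $S(t) = P(X > t)$,

  E[X \mid X > c] = c + \frac{\int_c^{\infty} S(t)\,dt}{S(c)},

so computing it requires an estimate of $S$, here obtained flexibly from a semiparametric Cox model, which is then integrated from the censored value to infinity.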

10.1080/10618600.2024.2444323 article EN Journal of Computational and Graphical Statistics 2025-01-06

The allostatic load index (ALI) is a composite measure of whole-person health. Data from electronic health records (EHR) present a huge opportunity to operationalize the ALI in a learning health system, except that they are prone to missingness and errors. Validation of EHR data (e.g., through chart reviews) can provide better-quality data, but realistically, only a subset of patients' data can be validated, and most protocols do not recover missing data. Using a representative sample of 1000 patients at an extensive health system (100 of whom...

10.48550/arxiv.2502.05380 preprint EN arXiv (Cornell University) 2025-02-07

Disparities in healthy eating relate to disparities in well-being, leading to disproportionate rates of diseases like type-2 diabetes in communities that face more challenges accessing nutritious food. These disparities can be driven by individual- and neighborhood-level factors, such as a person’s distance from home to the nearest grocery store or the socioeconomic status of their community, respectively. Quantifying these factors is key to developing targeted interventions, but there are limitations with currently...

10.21203/rs.3.rs-6348313/v1 preprint EN Research Square (Research Square) 2025-04-03

Missing data is a common challenge when analyzing epidemiological data, and imputation is often used to address this issue. Here, we investigate the scenario where a covariate in an analysis has missingness and will be imputed. There are recommendations to include the outcome from the analysis model in the imputation model for missing covariates, but it is not necessarily clear if this recommendation always holds or why it is sometimes true. We examine deterministic (i.e., single imputation with fixed values) and stochastic or multiple (random) imputation methods and their implications...
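A minimal simulation sketch of that comparison (illustrative only: the data-generating model, variable names, and missingness mechanism are hypothetical, not taken from the paper), contrasting deterministic imputation of a missing covariate with and without the outcome in the imputation model:

import numpy as np

rng = np.random.default_rng(2024)
n = 5000
Z = rng.normal(size=n)
X = 0.5 * Z + rng.normal(size=n)                   # covariate that will have missingness
Y = 1.0 + 1.0 * X + 0.5 * Z + rng.normal(size=n)   # analysis model; true beta_X = 1.0
miss = rng.random(n) < 0.5                         # missing completely at random

def ols(y, design):
    # Least-squares coefficients for a design matrix whose first column is the intercept.
    return np.linalg.lstsq(design, y, rcond=None)[0]

def beta_x_after_imputation(include_outcome):
    # Fit the imputation model on complete cases, deterministically fill in the
    # conditional mean for missing X, then refit the analysis model Y ~ X + Z.
    cols = [np.ones(n), Z, Y] if include_outcome else [np.ones(n), Z]
    preds = np.column_stack(cols)
    gamma = ols(X[~miss], preds[~miss])
    X_imp = np.where(miss, preds @ gamma, X)
    return ols(Y, np.column_stack([np.ones(n), X_imp, Z]))[1]

print("beta_X, outcome included in imputation model:", beta_x_after_imputation(True))
print("beta_X, outcome excluded from imputation model:", beta_x_after_imputation(False))

Comparing each printed estimate to the true value of 1.0 mirrors the kind of question the paper investigates; which choice behaves better depends on the imputation method and setting.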

10.1177/09622802241244608 article EN Statistical Methods in Medical Research 2024-04-16

Measurement errors are present in many data collection procedures and can harm analyses by biasing estimates. To correct for measurement error, researchers often validate a subsample of records and then incorporate the information learned from this validation sample into estimation. In practice, the validation sample is selected using simple random sampling (SRS). However, SRS leads to inefficient estimates because it ignores information on the error-prone variables, which can be highly correlated with the unknown truth. Applying...
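One generic targeted alternative to SRS, sketched here for illustration (stratification on the error-prone variable with Neyman allocation; this is not the specific design proposed in the paper):

import numpy as np

rng = np.random.default_rng(1)
x_star = rng.normal(size=10_000)    # error-prone covariate, observed for everyone in phase one
budget = 500                        # number of records we can afford to validate

# Stratify on the error-prone variable and allocate the validation budget by
# Neyman allocation (n_h proportional to N_h * sd_h) instead of simple random sampling.
strata = np.digitize(x_star, np.quantile(x_star, [0.25, 0.5, 0.75]))
sizes = np.array([np.sum(strata == h) for h in range(4)])
sds = np.array([x_star[strata == h].std() for h in range(4)])
alloc = np.round(budget * sizes * sds / np.sum(sizes * sds)).astype(int)

validate_idx = np.concatenate([
    rng.choice(np.where(strata == h)[0], size=alloc[h], replace=False)
    for h in range(4)
])
print("per-stratum validation sizes:", alloc, "| total validated:", validate_idx.size)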

10.1111/rssa.12689 article EN Journal of the Royal Statistical Society Series A (Statistics in Society) 2021-04-15

To develop and validate algorithms for predicting 30-day fatal and nonfatal opioid-related overdose using statewide data sources, including prescription drug monitoring program data, the Hospital Discharge Data System, and Tennessee (TN) vital records. Current prevention efforts in TN rely on descriptive retrospective analyses without prognostication. The study included 3 041 668 patients with 71 479 191 controlled substance prescriptions from 2012 to 2017. Statewide socioeconomic indicators were used to train,...

10.1093/jamia/ocab218 article EN cc-by-nc-nd Journal of the American Medical Informatics Association 2021-10-09

Barriers continue to limit access to viral load (VL) monitoring across sub-Saharan Africa, adversely impacting control of the HIV epidemic. The objective of this study was to determine whether the systems and processes required to realize the potential of rapid molecular technology are available at a prototypical lower-level (i.e., level III) health center in rural Uganda. In this open-label pilot study, participants underwent parallel VL testing at both the central laboratory (standard of care) and on-site using the GeneXpert HIV-1...

10.1371/journal.pgph.0001678 article EN cc-by PLOS Global Public Health 2023-03-27

In modern observational studies using electronic health records or other routinely collected data, both the outcome and covariates of interest can be error-prone, and their errors are often correlated. A cost-effective solution is the two-phase design, under which the error-prone variables are observed for all subjects during the first phase, and that information is used to select a validation subsample for accurate measurements of these variables in the second phase. Previous research on measurement error problems has largely focused on scenarios where there...

10.1002/sim.8799 article EN Statistics in Medicine 2020-11-03

Observational databases provide unprecedented opportunities for secondary use in biomedical research. However, these data can be error-prone and must be validated before use. It is usually unrealistic to validate the whole database because of resource constraints. A cost-effective alternative is a two-phase design that validates a subset of records enriched with information about the particular research question. We consider odds ratio estimation under differential outcome and exposure misclassification...

10.1002/cjs.11772 article EN cc-by Canadian Journal of Statistics 2023-03-31

Persons living with HIV engage in routine clinical care, generating large amounts of data in observational cohorts. These data are often error-prone, and directly using them in biomedical research could bias estimation and give misleading results. A cost-effective solution is the two-phase design, under which error-prone variables are observed for all patients during Phase I, and that information is used to select patients for auditing in Phase II. For example, the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) selected a...

10.1111/biom.13512 article EN Biometrics 2021-07-02

The landscape of survival analysis is constantly being revolutionized to answer biomedical challenges, most recently the statistical challenge of censored covariates rather than outcomes. There are many promising strategies to tackle censored covariates, including weighting, imputation, maximum likelihood, and Bayesian methods. Still, this is a relatively fresh area of research, different from the areas of censored outcomes (i.e., survival analysis) or missing covariates. In this review, we discuss the unique challenges encountered when handling...

10.1146/annurev-statistics-040522-095944 article EN Annual Review of Statistics and Its Application 2023-09-08

Audits play a critical role in maintaining the integrity of observational cohort data. While previous work has validated the audit process, sending trained auditors to sites ("travel-audits") can be costly. We investigate the efficacy of training sites to conduct "self-audits." In 2017, eight research groups in the Caribbean, Central, and South America network for HIV Epidemiology each audited a subset of their patient records randomly selected by the data coordinating center at Vanderbilt. Designated investigators at each site...

10.1017/cts.2019.442 article EN cc-by-nc-nd Journal of Clinical and Translational Science 2019-11-29

As earthquakes occur frequently in Latin America and can cause significant disruptions to HIV care, we sought to analyze patterns of care for adults at Latin American clinical sites experiencing an earthquake within the past two decades. Retrospective cohort study. Adults receiving care at sites that experienced at least a “moderate intensity” earthquake (Modified Mercalli scale) in the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) contributed data from 2003 to 2017. Interrupted Time Series models were fit with discontinuities at site-specific earthquake dates...
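For reference, a generic interrupted time series (segmented regression) specification of this kind, written as a sketch rather than the paper's exact model, is

  E[Y_t] = \beta_0 + \beta_1 t + \beta_2\,\mathrm{I}(t \ge t_0) + \beta_3 (t - t_0)\,\mathrm{I}(t \ge t_0),

where $t_0$ is the site-specific earthquake date, $\beta_2$ captures the immediate level change in the care outcome, and $\beta_3$ the post-earthquake change in slope.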

10.1016/j.puhip.2024.100479 article EN cc-by-nc-nd Public Health in Practice 2024-02-14

People living with HIV on antiretroviral therapy often have undetectable virus levels by standard assays, but “latent” HIV still persists in viral reservoirs. Eliminating these reservoirs is the goal of HIV cure research. The quantitative viral outgrowth assay (QVOA) is commonly used to estimate the reservoir size, that is, the infectious units per million (IUPM) HIV-persistent resting CD4+ T cells. A new variation of the QVOA, the ultra deep sequencing assay (UDSA), was recently developed that further quantifies the number of viral lineages...
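As background, the classical limiting dilution model underlying IUPM estimation (stated generically, not as this paper's extended method) assumes a well seeded with $n$ resting CD4+ T cells tests negative with probability

  P(\text{well negative}) = \exp(-\lambda n / 10^6),

where $\lambda$ is the IUPM; $\lambda$ is estimated by maximizing the product of these Bernoulli probabilities across wells and dilution levels, and the UDSA contributes additional information on how many distinct viral lineages grow out of the positive wells.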

10.1093/biomtc/ujad018 article EN Biometrics 2024-01-29

Healthy foods are essential for a healthy life, but accessing healthy food can be more challenging for some people than others. This disparity in access may lead to disparities in well-being, potentially with disproportionate rates of diseases in communities that face more access challenges (i.e., low-access communities). Identifying low-access, high-risk communities for targeted interventions is a public health priority, but current methods to quantify access rely on distance measures that are either computationally simple (like the length of the shortest...
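A toy sketch of the computationally simple end of that spectrum (hypothetical graph and node names, not the paper's data or method): measuring access as the shortest-path road distance from a home to its nearest grocery store.

import networkx as nx

# Toy road network: edge weights are distances in miles.
roads = nx.Graph()
roads.add_weighted_edges_from([
    ("home", "a", 1.2), ("a", "b", 0.8),
    ("b", "store_1", 2.5), ("a", "store_2", 3.9),
])

stores = ["store_1", "store_2"]
dist_to = {s: nx.shortest_path_length(roads, "home", s, weight="weight") for s in stores}
nearest = min(dist_to, key=dist_to.get)
print(f"nearest store: {nearest}, shortest-path distance: {dist_to[nearest]:.1f} miles")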

10.48550/arxiv.2405.16385 preprint EN arXiv (Cornell University) 2024-05-25

In longitudinal studies, the devices used to measure exposures can change from visit to visit. Calibration studies, wherein a subset of participants is measured using both devices at follow-up, may be used to assess between-device differences (i.e., errors). Then, statistical methods are needed to adjust for these errors and the missing measurement data that often appear in calibration studies. Regression calibration and multiple imputation are two possible methods. We compared these approaches, implemented with linear regression, in a simulation study considering various real-world scenarios...
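In broad strokes (generic definitions for orientation, not the paper's exact estimators), regression calibration replaces the old-device measurement $W$ with a calibrated value

  \hat{X} = \hat{E}[X \mid W, Z] = \hat{\gamma}_0 + \hat{\gamma}_1 W + \hat{\gamma}_2^\top Z,

estimated from the calibration subset measured on both devices, whereas multiple imputation repeatedly draws the unobserved measurements from an imputation model and pools the analyses across the completed datasets.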

10.1093/aje/kwae169 article EN American Journal of Epidemiology 2024-07-02

Censored, missing, and error-prone covariates are all coarsened data types for which the true values are unknown. Many methods to handle unobserved values, including imputation, are shared between these data types, with nuances based on the coarsening mechanism dominating any other available information. For example, in prospective studies, the time to a specific disease diagnosis will be incompletely observed if only some patients have been diagnosed by the end of follow-up. Specifically, these times are randomly right-censored, and patients'...

10.48550/arxiv.2410.10723 preprint EN arXiv (Cornell University) 2024-10-14

Validation studies are often used to obtain more reliable information in settings with error-prone data. Validated data on a subsample of subjects can be used together with the error-prone data on all subjects to improve estimation. In practice, more than one round of validation may be required, and direct application of standard approaches for combining validation data into analyses can lead to inefficient estimators since the information available from intermediate steps is only partially considered or even completely ignored. In this paper, we present two novel extensions of multiple...

10.1002/sim.9967 article EN Statistics in Medicine 2023-11-21

Analysts are often confronted with censoring, wherein some variables are not observed at their true value, but rather at a value that is known to fall above or below the truth. While much attention has been given to the analysis of censored outcomes, contemporary focus has shifted to censored covariates, as well. Missing data are commonly overcome using multiple imputation, which leverages the entire dataset by replacing missing values with informed placeholders, and this method can be modified for censored data by also incorporating partial information from...
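Stated generically (a sketch of the idea, not the paper's exact algorithm): for a covariate $X$ known only to exceed its censored value $c$, each imputation draws

  X^* \sim f(x \mid x > c, \text{other observed variables}),

that is, from the imputation model truncated to respect the partial information $X > c$, after which the analysis is repeated and pooled across imputed datasets in the usual multiple-imputation way.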

10.1002/bimj.202100250 article EN Biometrical Journal 2022-02-24

To select outcomes for clinical trials testing experimental therapies for Huntington disease, a fatal neurodegenerative disorder, analysts model how potential outcomes change over time. Yet, subjects with the disease are often observed at different levels of progression. To account for these differences, analysts include time to diagnosis as a covariate when modeling outcomes, but this covariate is censored. One popular solution is imputation, whereby we impute the censored values using predictions from a model given the other data, and then analyze...

10.48550/arxiv.2303.01602 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Objective: In longitudinal studies, the devices used to measure exposures, like pulse wave velocity (PWV), can change from visit to visit. Calibration studies, where a subset of participants receive measurements from both devices at follow-up, are often used to assess differences in the device measurements. Regression calibration and multiple imputation are common statistical methods to correct for those differences, but no study yet exists to compare the two when the quantity of interest is the exposure over time. We compared hypothetical PWV and its...

10.1161/circ.147.suppl_1.p599 article EN Circulation 2023-02-28

Missing data is a common challenge when analyzing epidemiological data, and imputation is often used to address this issue. Here, we investigate the scenario where a covariate in an analysis has missingness and will be imputed. There are recommendations to include the outcome from the analysis model in the imputation model for missing covariates, but it is not necessarily clear if this recommendation always holds or why it is sometimes true. We examine deterministic (i.e., single imputation with fixed values) and stochastic or multiple (random) imputation methods and their implications...

10.48550/arxiv.2310.17434 preprint EN other-oa arXiv (Cornell University) 2023-01-01