- Advanced Causal Inference Techniques
- Statistical Methods and Inference
- Statistical Methods in Clinical Trials
- COVID-19 Clinical Research Studies
- SARS-CoV-2 and COVID-19 Research
- Long-Term Effects of COVID-19
- Epigenetics and DNA Methylation
- Health Systems, Economic Evaluations, Quality of Life
- Statistical Methods and Bayesian Inference
- Bayesian Modeling and Causal Inference
- Advanced Chemical Sensor Technologies
- Health, Environment, Cognitive Aging
- Vaccine Coverage and Hesitancy
- Chronic Obstructive Pulmonary Disease (COPD) Research
- Artificial Intelligence in Healthcare
- Data Stream Mining Techniques
- Online and Blended Learning
- Health disparities and outcomes
- Sepsis Diagnosis and Treatment
- Intensive Care Unit Cognitive Disorders
- Machine Learning in Healthcare
- Psychotherapy Techniques and Applications
- Gaussian Processes and Bayesian Inference
- Single-cell and spatial transcriptomics
- Carcinogens and Genotoxicity Assessment
National Institute of Arthritis and Musculoskeletal and Skin Diseases
2024
National Institutes of Health
2024
University of California, Berkeley
2018-2024
Berkeley Public Health Division
2024
Government of the United States of America
2022
Leicester Royal Infirmary
2020
NIHR Leicester Biomedical Research Centre
2019
Common tasks encountered in epidemiology, including disease incidence estimation and causal inference, rely on predictive modelling. Constructing a predictive model can be thought of as learning a prediction function (a function that takes covariate data as input and outputs a predicted value). Many strategies for learning prediction functions from data (learners) are available, from parametric regressions to machine learning algorithms. It can be challenging to choose a learner, as it is impossible to know in advance which one is the most suitable for a particular dataset and prediction task. The super...
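The idea of choosing among learners by cross-validation can be sketched as follows. This is a minimal illustration of a discrete super learner (a cross-validated selector), not the authors' implementation: the two candidate learners, the function names (`fit_mean`, `fit_linear`, `cv_risk`, `discrete_super_learner`) and the toy data are all assumptions made for the example.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Candidate learner 1: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate learner 2: simple least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def cv_risk(fit, xs, ys, k=5):
    """K-fold cross-validated mean squared error of one learner."""
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        valid = [i for i in range(n) if i % k == fold]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_errors += [(ys[i] - predict(xs[i])) ** 2 for i in valid]
    return mean(sq_errors)

def discrete_super_learner(library, xs, ys, k=5):
    """Pick the learner with the lowest cross-validated risk, then refit it on all data."""
    risks = {name: cv_risk(fit, xs, ys, k) for name, fit in library.items()}
    best = min(risks, key=risks.get)
    return best, risks, library[best](xs, ys)

# Toy data (hypothetical): an exactly linear relationship.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
library = {"mean": fit_mean, "linear": fit_linear}
best, risks, predictor = discrete_super_learner(library, xs, ys)
print(best, risks)
```

A full super learner would typically combine the candidates with cross-validated weights rather than select a single one; the selector above is the simplest variant of the same idea.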
Epigenetic aging biomarkers are associated with increased morbidity and mortality. We evaluated if occupational exposure to three established chemical carcinogens is associated with acceleration of epigenetic aging. We studied workers in China occupationally exposed to benzene, trichloroethylene (TCE) or formaldehyde by measuring personal air exposures prior to blood collection. Unexposed controls matched by age and sex were selected from nearby factories. We measured leukocyte DNA methylation (DNAm) in peripheral white blood cells...
Susan Gruber (a, *), Hana Lee (b), Rachael Phillips (c), Martin Ho (d) & Mark van der Laan (c). (a) Putnam Data Sciences, LLC, Cambridge, MA; (b) Office of Biostatistics, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, MD; (c) Department of Biostatistics, University of California at Berkeley, CA; (d) Global Affairs, Google, Mountain View, CA
Inverse probability weighting (IPW) and targeted maximum likelihood estimation (TMLE) are methodologies that can adjust for confounding and selection bias and are often used for causal inference. Both estimators rely on the positivity assumption that within strata of confounders there is a positive probability of receiving treatment at all levels under consideration. Practical applications of IPW require finite inverse probability (IP) weights. TMLE requires that propensity scores (PS) be bounded away from 0 and 1. Although truncation can improve...
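To make the role of truncation concrete, here is a minimal sketch of an IPW estimate of the average treatment effect with propensity scores bounded away from 0 and 1. The truncation bounds (0.05 and 0.95), the normalized (Hajek-style) weighting, the function names and the toy data are illustrative assumptions, not choices taken from the paper.

```python
def truncate(ps, lower=0.05, upper=0.95):
    """Bound propensity scores away from 0 and 1 so inverse weights stay finite."""
    return [min(max(p, lower), upper) for p in ps]

def ipw_ate(treatment, outcome, ps, lower=0.05, upper=0.95):
    """Normalized IPW estimate of E[Y(1)] - E[Y(0)] using truncated propensity scores."""
    ps_t = truncate(ps, lower, upper)
    w1 = [a / p for a, p in zip(treatment, ps_t)]              # weights for treated
    w0 = [(1 - a) / (1 - p) for a, p in zip(treatment, ps_t)]  # weights for untreated
    mean1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mean0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mean1 - mean0

# Hypothetical toy data: binary treatment, outcome, and estimated propensity scores.
treatment = [1, 1, 0, 0]
outcome = [3.0, 5.0, 1.0, 2.0]
ps = [0.5, 0.5, 0.5, 0.5]
ate = ipw_ate(treatment, outcome, ps)
print(ate)

# An extreme score such as 0.001 would give a weight of 1000; truncation caps it at 1 / 0.05 = 20.
print(truncate([0.001, 0.5, 0.999]))
```

Truncation trades variance for bias: capped weights are more stable, but the estimator no longer targets exactly the same quantity when true propensities fall outside the bounds.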
Abstract: Judea Pearl, quoted in Pearl and Mackenzie (2008), stated that “once we have understood why [randomized controlled trials] RCTs work, there is no need to put them on a pedestal and treat them as the gold standard of causal analysis, which all other methods should emulate.” In Aronow et al. (2024), this claim is refuted, drawing on results from Robins and Ritov (1997). The argument is made that statistical estimation and inference tend to be fundamentally more difficult in observational studies than in randomized trials, even...
Objectives: Long COVID is a debilitating condition that impacts millions of Americans, but patients and clinicians have little information on how to prevent this disorder. Vaccination is a vital tool in preventing acute COVID-19 and may confer additional protection against Long COVID. There is limited evidence regarding the optimal timing of COVID-19 vaccination (i.e., vaccination schedule) to minimize the risk of Long COVID. Methods: We applied Longitudinal Targeted Maximum Likelihood Estimation to electronic health record (EHR) data from a retrospective...
Introduction: Patients presenting with acute undifferentiated breathlessness are commonly encountered in admissions units across the UK. Existing blood biomarkers have clinical utility in distinguishing patients with single organ pathologies but have poor discriminatory power in multifactorial presentations. Evaluation of volatile organic compounds (VOCs) in exhaled breath offers the potential to develop biomarkers of disease states that underpin acute cardiorespiratory breathlessness, owing to their proximity to the cardiorespiratory system. To date, there has...
The 21st Century Cures Act of 2016 includes a provision for the U.S. Food and Drug Administration (FDA) to evaluate the potential use of Real-World Evidence (RWE) to support new indications for previously approved drugs, and to satisfy post-approval study requirements. Extracting reliable evidence from Real-World Data (RWD) is often complicated by a lack of treatment randomization, intercurrent events, and informative loss to follow-up. Targeted Learning (TL) is a sub-field of statistics that provides a rigorous framework to help...
Human exposure to trichloroethylene (TCE) is linked to kidney cancer, autoimmune diseases, and probably non-Hodgkin lymphoma. Additionally, TCE-exposed mice and cell cultures show altered DNA methylation. To evaluate associations between TCE exposure and DNA methylation in humans, we conducted an epigenome-wide association study (EWAS) in TCE-exposed workers using the HumanMethylation450 BeadChip. Across individual CpG probes, genomic regions, and globally (i.e., the 450K methylome), we investigated differences in the mean and variability of DNA methylation between 73 control (<...
Sufficient evidence supports a relationship between certain myeloid neoplasms and exposure to benzene or formaldehyde. DNA methylation could underlie benzene- and formaldehyde-induced health outcomes, but data in exposed human populations are limited. We conducted two cross-sectional epigenome-wide association studies (EWAS), one in workers exposed to benzene and another in workers exposed to formaldehyde. Using HumanMethylation450 BeadChips, we investigated differences in blood cell DNA methylation among 50 benzene-exposed subjects and 48 controls, and among 31 formaldehyde-exposed subjects and 40...
Investigating acute multifactorial undifferentiated breathlessness and understanding the driving inflammatory processes can be technically challenging in both adults and children. Being able to validate noninvasive methods such as breath analysis would be a huge clinical advance. The ReCIVA® device allows breath samples to be collected directly onto sorbent tubes at the bedside for analysis of exhaled volatile organic compounds (eVOCs). We aimed to assess the feasibility of using this device in acutely breathless patients. Adults hospitalised...
Abstract. Background: The Targeted Learning roadmap provides a systematic guide for generating and evaluating real-world evidence (RWE). From a regulatory perspective, RWE arises from diverse sources such as randomized controlled trials that make use of real-world data, observational studies, and other study designs. This paper illustrates a principled approach to assessing the validity and interpretability of RWE. Methods: We applied the roadmap to a published observational study of the dose–response association between ritodrine hydrochloride and pulmonary edema...
The global prevalence of type 2 diabetes (T2D) has doubled since 1980. Human epidemiological studies support arsenic exposure as a risk factor for T2D, although the precise mechanism is unclear. We hypothesized that chronic arsenic ingestion alters glucose homeostasis by impairing adaptive thermogenesis, i.e., body heat production in cold environments. Arsenic is a pervasive environmental contaminant, with more than 200 million people worldwide currently exposed to arsenic-contaminated drinking water....
Several recently developed methods have the potential to harness machine learning in the pursuit of target quantities inspired by causal inference, including inverse weighting, doubly robust estimating equations and substitution estimators like targeted maximum likelihood estimation. There are even more recent augmentations of these procedures that can increase robustness, by adding a layer of cross-validation (cross-validated targeted maximum likelihood estimation and double machine learning, as applied to substitution and estimating equation approaches, respectively)....
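The "layer of cross-validation" can be illustrated with a small cross-fitted doubly robust (AIPW) estimating-equation sketch: nuisance functions are fit on training folds and the estimating function is evaluated only on held-out observations. Everything below is an assumption made for illustration, including the stratum-mean nuisance estimators (standing in for machine learning fits), the bound on the propensity score, the function names and the toy data; it is not the procedure evaluated in the paper.

```python
def stratum_mean(values, default):
    """Mean of a list, with a fallback when the stratum is empty."""
    return sum(values) / len(values) if values else default

def fit_nuisances(rows):
    """Fit propensity g(w) and outcome regression q(a, w) by stratum means.

    rows: list of (w, a, y) tuples with binary covariate w and binary treatment a.
    """
    g = {w: stratum_mean([a for (wi, a, _) in rows if wi == w], 0.5) for w in (0, 1)}
    q = {(a, w): stratum_mean([y for (wi, ai, y) in rows if wi == w and ai == a], 0.0)
         for a in (0, 1) for w in (0, 1)}
    return g, q

def cross_fitted_aipw(rows, k=5, bound=0.01):
    """Cross-fitted AIPW estimate of the average treatment effect."""
    n = len(rows)
    scores = []
    for fold in range(k):
        train = [rows[i] for i in range(n) if i % k != fold]
        valid = [rows[i] for i in range(n) if i % k == fold]
        g, q = fit_nuisances(train)  # nuisances never see the held-out fold
        for w, a, y in valid:
            p = min(max(g[w], bound), 1 - bound)  # keep propensity away from 0 and 1
            q1, q0 = q[(1, w)], q[(0, w)]
            scores.append(q1 - q0 + a / p * (y - q1) - (1 - a) / (1 - p) * (y - q0))
    return sum(scores) / n

# Hypothetical toy data: y = 2 * a + w exactly, so the true effect is 2.
cells = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (w, a) combinations
rows = []
for i in range(40):
    w, a = cells[i % 4]
    rows.append((w, a, 2.0 * a + w))

estimate = cross_fitted_aipw(rows)
print(estimate)
```

Evaluating each observation with nuisance fits that did not use it is what lets flexible learners be plugged in without relying on restrictive complexity conditions; here the stratum means merely stand in for such learners.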
In this work we introduce the personalized online super learner (POSL), an online personalizable ensemble machine learning algorithm for streaming data. POSL optimizes predictions with respect to baseline covariates, so personalization can vary from completely individualized, that is, optimization with respect to subject ID, to many individuals, that is, optimization with respect to common baseline covariates. As an online algorithm, POSL learns in real time. As a super learner, POSL is grounded in statistical optimality theory and can leverage a diversity of candidate algorithms, including...
Targeted Learning is a subfield of statistics that unifies advances in causal inference, machine learning and statistical theory to help answer scientifically impactful questions with statistical confidence. Targeted Learning is driven by complex problems in data science and has been implemented in a diversity of real-world scenarios: observational studies with missing treatments and outcomes, personalized interventions, longitudinal settings with time-varying treatment regimes, survival analysis, adaptive randomized trials, mediation analysis, and networks...
Postacute sequelae of COVID-19 (PASC), also known as long COVID, is a broad grouping of a range of long-term symptoms following acute COVID-19. These symptoms can occur across a range of biological systems, leading to challenges in determining risk factors for PASC and the causal etiology of this disorder. An understanding of characteristics that are predictive of future PASC is valuable, as this can inform the identification of high-risk individuals and future preventative efforts. However, current knowledge regarding PASC predictors is limited.