- Statistical Methods in Clinical Trials
- Health Systems, Economic Evaluations, Quality of Life
- Genetic Associations and Epidemiology
- Advanced Causal Inference Techniques
- Genetic Mapping and Diversity in Plants and Animals
- Cancer Genomics and Diagnostics
- Meta-analysis and systematic reviews
- Gene expression and cancer classification
- Genetic and phenotypic traits in livestock
- Respiratory viral infections research
- Bioinformatics and Genomic Networks
- Cancer Immunotherapy and Biomarkers
- Lung Cancer Treatments and Mutations
- Biomedical Ethics and Regulation
- Bayesian Methods and Mixture Models
- Chronic Obstructive Pulmonary Disease (COPD) Research
- RNA modifications and cancer
- Colorectal Cancer Treatments and Studies
- Neonatal Respiratory Health Research
- Cell Image Analysis Techniques
- Cancer-related molecular mechanisms research
- Genetic Syndromes and Imprinting
- Economic and Financial Impacts of Cancer
- Cardiovascular Function and Risk Factors
- AI in cancer detection
University of North Carolina at Chapel Hill
2013-2025
Instituto Superior Politécnico Metropolitano de Angola
2022-2025
Harvard University
2016-2024
Google (United States)
2020-2024
Biostat (United States)
2024
The University of Texas MD Anderson Cancer Center
2022-2023
Johns Hopkins University
2023
Dana-Farber Cancer Institute
2023
Stanford University
2023
Harvard University Press
2022
Quantitative traits analyzed in Genome-Wide Association Studies (GWAS) are often nonnormally distributed. For such traits, association tests based on standard linear regression are subject to reduced power and inflated type I error in finite samples. Applying the rank-based inverse normal transformation (INT) to nonnormally distributed traits has become common practice in GWAS. However, the different variations on INT-based association testing have not been formally defined, and guidance is lacking on when to use which approach. In this paper, we...
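As a rough illustration of the rank-based INT mentioned in the abstract above, the following minimal sketch maps each observation's rank to a standard normal quantile. The function name `rank_inverse_normal`, the default offset of 3/8 (the conventional Blom offset), and the use of average ranks for ties are assumptions made for this example; the abstract does not specify them.

```python
from statistics import NormalDist


def rank_inverse_normal(values, offset=0.375):
    """Rank-based inverse normal transformation (illustrative sketch).

    Each value is replaced by the standard normal quantile of
    (rank - offset) / (n - 2 * offset + 1). Ties receive the average rank.
    The offset of 3/8 is an assumed default, not taken from the source.
    """
    n = len(values)
    if n == 0:
        return []
    # Indices of the observations in ascending order of value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


z = rank_inverse_normal([10.0, 1.0, 100.0])
```

Only the ordering of the input matters here: the skewed values 1, 10, and 100 map to three symmetric normal scores, with the middle observation at zero.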
Genome-wide association studies (GWASs) require accurate cohort phenotyping, but expert labeling can be costly, time intensive, and variable. Here, we develop a machine learning (ML) model to predict glaucomatous optic nerve head features from color fundus photographs. We used the model to predict the vertical cup-to-disc ratio (VCDR), a diagnostic parameter and cardinal endophenotype for glaucoma, in 65,680 Europeans in the UK Biobank (UKB). A GWAS of ML-based VCDR identified 299 independent genome-wide significant (GWS; p...
Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics. The majority of these variants have individually weak effects and lie in non-coding gene-regulatory elements, where we lack a complete understanding of how single nucleotide alterations modulate transcriptional processes to affect human phenotypes. To address this, we measured the activity of 221,412 trait-associated variants that had been statistically fine-mapped using a Massively Parallel Reporter Assay (MPRA) in 5 diverse...
Although high-dimensional clinical data (HDCD) are increasingly available in biobank-scale datasets, their use for genetic discovery remains challenging. Here we introduce an unsupervised deep learning model, Representation Learning for Genetic Discovery on Low-Dimensional Embeddings (REGLE), for discovering associations between genetic variants and HDCD. REGLE leverages variational autoencoders to compute nonlinear disentangled embeddings of HDCD, which become the inputs to genome-wide association...
Cardiovascular trials often use the time to a clinical event as the primary end point when evaluating a new treatment, versus a control, via clinical and statistical significance criteria. For the past 50 years, the hazard ratio (HR) has been routinely used for quantifying the treatment effect. However, it is difficult to interpret the treatment effect using a measure such as the HR when there is no reference available from the control arm. Moreover, a valid HR analysis requires the proportional hazards (PH) assumption: that the ratio of the hazard curves is constant over time. This assumption...
Significance: Understanding dendritic cell (DC) migration during an immune response is fundamental to defining the rules that govern T cell-mediated immunity. We recently described mice deficient in the pattern recognition receptor NLRP10 (NLR family, pyrin domain containing 10) with a severe DC migration defect. Using whole-exome sequencing, we discovered that this defect was due to a mutation of the guanine nucleotide exchange factor Dock8 (dedicator of cytokinesis 8). DOCK8 regulates cytoskeleton dynamics in leukocytes,...
Genome-wide association studies (GWASs) examine the association between genotype and phenotype while adjusting for a set of covariates. Although the covariates may have non-linear or interactive effects, due to the challenge of specifying the model, GWAS often neglect such terms. Here we introduce DeepNull, a method that identifies and adjusts for non-linear and interactive covariate effects using a deep neural network. In analyses of simulated and real data, we demonstrate that DeepNull maintains tight control of the type I error while increasing statistical power by up to 20% in...
This study sought to demonstrate the statistical and utilitarian properties of restricted mean survival time (RMST) and restricted mean time lost (RMTL) for assessing treatments for heart failure (HF) with reduced ejection fraction. Although the hazard ratio (HR) is the most commonly used measure to quantify treatment effects in HF clinical trials, HRs may be difficult to interpret and require the proportional hazards assumption to be valid. RMST and RMTL are intuitive summaries of groupwise survival that measure treatment effects without model assumptions. Patient time-to-event data were...
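To make the RMST and RMTL summaries in the abstract above concrete, here is a minimal sketch that computes RMST as the area under a Kaplan-Meier curve up to a truncation time `tau`, with RMTL obtained as `tau` minus RMST. The function name `km_rmst`, the toy data, and the choice to carry the last survival estimate forward to `tau` when follow-up ends earlier are assumptions made for this example, not details taken from the study.

```python
def km_rmst(times, events, tau):
    """Restricted mean survival time: area under the Kaplan-Meier curve on [0, tau].

    times  -- non-negative follow-up times
    events -- 1 (or True) if the event was observed, 0 (or False) if censored
    tau    -- truncation time

    Assumption of this sketch: if follow-up ends before tau, the last
    survival estimate is carried forward to tau.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0    # current Kaplan-Meier survival estimate
    area = 0.0    # accumulated area under the step function
    prev_t = 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Process every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        area += surv * (t - prev_t)  # the curve is flat between event times
        if deaths:
            surv *= 1 - deaths / n_at_risk
        n_at_risk -= leaving
        prev_t = t
    area += surv * (tau - prev_t)  # final flat segment up to tau
    return area


# Toy data: events at times 1 and 3, one censored observation at time 2.
rmst = km_rmst([1, 2, 3], [1, 0, 1], tau=3)
rmtl = 3 - rmst  # restricted mean time lost over the same window
```

Comparing `km_rmst` between two arms at a common `tau` gives a difference in mean event-free time over the window, which does not rely on proportional hazards.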
In the Dapagliflozin Evaluation to Improve the Lives of Patients With Preserved Ejection Fraction Heart Failure (DELIVER) trial, dapagliflozin reduced the risk of time to first worsening heart failure (HF) event or cardiovascular death in patients with HF with mildly reduced or preserved ejection fraction (EF).
Survival analyses of novel agents with long-term responders often exhibit differential hazard rates over time. Such proportional hazards violations (PHV) may reduce the power of the log-rank test and lead to misinterpretation of trial results. We aimed to characterize the incidence and study attributes associated with PHVs in phase III oncology trials and to assess the utility of restricted mean survival time and the maximum combination test as additional analyses.
This cohort study assesses the relative stability of median and mean survival time estimates reported in cancer clinical trials.
Estimations of the treatment effect on overall survival (OS) may be influenced by post-progression therapies (PPTs). It is unclear how often OS analyses account for PPT effects. The purpose of this cross-sectional analysis was to determine the prevalence of OS analyses accounting for PPT effects in phase III oncology trials. We screened two-arm, superiority-design, phase III, randomised oncology trials reporting OS from ClinicalTrials.gov. The primary outcome was the frequency of OS analyses adjusting for PPT confounding. Logistic regressions computed ORs for the association...
In comparative studies, treatment effect is often assessed using a binary outcome that indicates response to the therapy. Commonly used summary measures for the response include the cumulative and current response rates at a specific time point. The current response rate is sometimes called the probability of being in response (PBIR), which regards a patient as a responder only if they have achieved and remain in response at present. The methods used in practice for estimating these rates, however, may not be appropriate. Moreover, whereas an effective treatment is expected to achieve a rapid and sustained...
Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be...
Allelic series are of candidate therapeutic interest because of the existence of a dose-response relationship between the functionality of a gene and the degree or severity of a phenotype. We define an allelic series as a collection of variants in which increasingly deleterious mutations lead to increasingly large phenotypic effects, and we have developed a gene-based rare-variant association test specifically targeted to identifying genes containing allelic series. Building on the well-known burden test and sequence kernel association test (SKAT), we specify a variety of association models covering...
A previous study demonstrated that power against the (unobserved) true effect for the primary end point (PEP) of most phase III oncology trials is low, suggesting an increased risk of false-negative findings in the field of late-phase oncology. Fitting models with prognostic covariates is a potential solution to improve power; however, the extent to which trials leverage this approach, and its impact on trial interpretation at scale, is unknown. To that end, we hypothesized that phase III trials using multivariable PEP analyses are more likely...
Statistical significance currently defines superiority in phase III oncology trials. However, this practice is increasingly questioned. Here, we estimated the fragility of phase III oncology trials. Using Kaplan-Meier curves for the primary endpoints of 230 two-arm trials, we reconstructed data for individual patients. We estimated the survival-inferred fragility index (SIFI) by iteratively flipping the best responder from the experimental arm to the control arm (SIFI_B) until the interpretation was changed according to the significance threshold of each trial. Severe fragility was defined by SIFI ≤ 1%. This...
Genome-wide association studies (GWAS) are often performed on ratios composed of a numerator trait divided by a denominator trait. Examples include body mass index (BMI) and the waist-to-hip ratio, among many others. Explicitly or implicitly, the goal of forming the ratio is typically to adjust for an association between the numerator and denominator. While forming ratios may be clinically expedient, there are several important issues with performing GWAS on ratios. Forming a ratio does not "adjust" for the denominator in the sense of conditioning on it, and it is unclear whether associations...