- Biomedical Text Mining and Ontologies
- Health, Environment, Cognitive Aging
- Machine Learning in Healthcare
- Health disparities and outcomes
- Electronic Health Records Systems
- Artificial Intelligence in Healthcare
- Climate Change and Health Impacts
- Birth, Development, and Health
- Bioinformatics and Genomic Networks
- Pregnancy and preeclampsia studies
- Nutrition, Genetics, and Disease
- Health Systems, Economic Evaluations, Quality of Life
- Maternal and Perinatal Health Interventions
- Pregnancy and Medication Impact
- Data-Driven Disease Surveillance
- Healthcare Policy and Management
- Atmospheric and Environmental Gas Dynamics
- Genetic Associations and Epidemiology
- Statistical Methods and Bayesian Inference
- Chronic Obstructive Pulmonary Disease (COPD) Research
- Topic Modeling
- Maternal and fetal healthcare
- Statistical Methods in Clinical Trials
- Medical Coding and Health Information
- Scientific Computing and Data Management
Saint Vincent College
2023-2025
University of Pennsylvania
2017-2024
Children's Hospital of Philadelphia
2017-2023
Center for Excellence in Education
2023
California University of Pennsylvania
2021
University of Puerto Rico at Carolina
2021
Community Initiatives
2021
University of South Carolina
2020
Rathenau Instituut
2019
Columbia University
2011-2017
To develop a semantic representation for clinical research eligibility criteria to automate semistructured information extraction from text. An analysis pipeline called EliXR was developed that integrates syntactic parsing and tree pattern mining to discover common patterns in 1000 eligibility criteria randomly selected from http://ClinicalTrials.gov. The patterns were aggregated and enriched with Unified Medical Language System knowledge to form a representation for the criteria. The authors arrived at 175 patterns, which form 12 role labels connected by their...
Objective: An individual’s birth month has a significant impact on the diseases they develop during their lifetime. Previous studies reveal relationships between birth month and several diseases, including atherothrombosis, asthma, attention deficit hyperactivity disorder, and myopia, leaving most diseases completely unexplored. This retrospective population study systematically explores the relationship between seasonal effects at birth and lifetime disease risk for 1688 conditions. Methods: We developed a hypothesis-free method that...
Objectives: We propose a one-shot, privacy-preserving distributed algorithm to perform logistic regression (ODAL) across multiple clinical sites. Materials and Methods: ODAL effectively utilizes the information from the local site (where patient-level data are accessible) and incorporates the first-order (ODAL1) and second-order (ODAL2) gradients of the likelihood function from other sites to construct an estimator without requiring iterative communication or transferring patient-level data. ODAL was evaluated via extensive simulation...
The electronic health record (EHR) has become increasingly ubiquitous. At the same time, health professionals have been turning to this resource for access to data that is needed for the delivery of care and for clinical research. There is little doubt that the EHR has made both of these functions easier than in earlier days, when we relied on paper-based records. Coupled with modern database and warehouse systems, high-speed networks, and the ability to share data with others are a large number of challenges that arguably limit optimal use of the EHR. OBJECTIVES: Our goal was...
Objective: We developed and evaluated a privacy-preserving One-shot Distributed Algorithm to fit a multicenter Cox proportional hazards model (ODAC) without sharing patient-level information across sites. Materials and Methods: Using data from a single site combined with only aggregated information from other sites, we constructed a surrogate likelihood function, approximating the partial likelihood function obtained using data from all sites. By maximizing the surrogate likelihood function, each site obtained a local estimate of the parameter, and the ODAC estimator was constructed as a weighted average...
The burgeoning adoption of electronic health records (EHR) introduces a golden opportunity for studying individual manifestations of myriad diseases, which is called 'EHR phenotyping'. In this paper, we break down this concept by: relating it to phenotype definitions from Johannsen; comparing it to cohort identification and disease subtyping; introducing a new concept, 'verotype' (Latin: vere = true, actually), to represent the 'true' population of similar patients for treatment purposes through the integration of genotype,...
Phenotypes have gained increased notoriety in the clinical and biological domain owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics, and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges that lead to a translation of experimental findings into clinical applications and thereby support 'bench to bedside' efforts. However, to build this translational bridge, a common and universal understanding of phenotypes is required that goes...
Our study investigates the temporality of factors that modulate the risk for developing hypertension (HTN) among patients with obstructive sleep apnea (OSA) without preexisting HTN at baseline. Our cohort consisted of OSA cases (based on International Classification of Diseases, 9th/10th Revision) and 20 common comorbidities selected using a previously validated electronic health record (EHR)-based algorithm. We constructed a survival model to estimate time to first diagnosis of HTN (among those without preexisting HTN). We included those along...
To use linked electronic medical and dental records to discover associations between periodontitis and medical conditions independent of a priori hypotheses. This case-control study included 2475 patients who underwent treatment at the College of Dental Medicine at Columbia University and NewYork-Presbyterian Hospital. Our cases are patients who received periodontal treatment, and our controls are patients who received maintenance but no periodontal treatment. Chi-square analysis was performed for codes, and logistic regression was used to adjust for confounders. Our method replicated several...
Birth month and climate impact lifetime disease risk, while the underlying exposures remain largely elusive. We seek to uncover distal risk factors underlying these relationships by probing the relationship between global exposure variance and birth season. This study utilizes electronic health record data from 6 sites representing 10.5 million individuals in 3 countries (United States, South Korea, and Taiwan). We obtained birth month-disease curves from each site in a case-control manner. Next, we correlated each curve with exposure....
We introduce the spike-and-slab group lasso (SSGL) for Bayesian estimation and variable selection in linear regression with grouped variables. We further extend the SSGL to sparse generalized additive models (GAMs), thereby introducing the first nonparametric variant of the methodology. Our model simultaneously performs selection and estimation, while our fully Bayes treatment of the mixture proportion allows for complexity control and automatic self-adaptivity to different levels of sparsity. We develop theory to uniquely characterize the global...
Many drugs commonly prescribed during pregnancy lack a fetal safety recommendation; these are called FDA 'category C' drugs. This study aims to classify these drugs into harmful and safe categories using knowledge gained from chemoinformatics (i.e., pharmacological similarity with drugs of known effect) and empirical data (i.e., derived from Electronic Health Records). Our loss cohort contains 14,922 affected and 33,043 unaffected pregnancies, and our congenital anomalies cohort contains 5,658 affected and 31,240 unaffected infants. We trained a random forest to classify drugs of unknown class into harmful or...
Mutations in transcription factor (TF) genes are frequently observed in tumors, often leading to aberrant transcriptional activity. Unfortunately, TFs are considered undruggable due to the absence of targetable enzymatic activity. To address this problem, we developed CRAFTT, a computational drug-repositioning approach for targeting TF activity. CRAFTT combines ChIP-seq with drug-induced expression profiling to identify small molecules that can specifically perturb TF activity. Application of CRAFTT to ENCODE datasets revealed known drug-TF...