- Metabolomics and Mass Spectrometry Studies
- Electronic Health Records Systems
- Biomedical Text Mining and Ontologies
- Healthcare Technology and Patient Monitoring
- Patient Safety and Medication Errors
- Topic Modeling
- Genetic Associations and Epidemiology
- Meta-analysis and systematic reviews
- Genomics and Rare Diseases
- Pharmacovigilance and Adverse Drug Reactions
- Machine Learning in Healthcare
- Pharmaceutical Practices and Patient Outcomes
- Advanced Text Analysis Techniques
- Ethics in Clinical Research
- Cancer, Lipids, and Metabolism
- Cancer Genomics and Diagnostics
- Lipid metabolism and disorders
- Medical Coding and Health Information
- Appendicitis Diagnosis and Management
- BRCA gene mutations in cancer
- Lipid metabolism and biosynthesis
- Sepsis Diagnosis and Treatment
- Natural Language Processing Techniques
- Mobile Crowdsensing and Crowdsourcing
- Emergency and Acute Care Studies
Cincinnati Children's Hospital Medical Center
2013-2023
University of Cincinnati Medical Center
2014-2020
University of Cincinnati
2013-2017
Abstract Objective Health care generated data have become an important source for clinical and genomic research. Often, investigators create and iteratively refine phenotype algorithms to achieve high positive predictive values (PPVs) or sensitivity, thereby identifying valid cases and controls. These algorithms achieve the greatest utility when validated and shared by multiple health care systems. Materials and Methods We report the current status and impact of the Phenotype KnowledgeBase (PheKB, http://phekb.org), an online environment...
Non-alcoholic fatty liver disease (NAFLD) is a common chronic illness with a genetically heterogeneous background that can be accompanied by considerable morbidity and attendant health care costs. The pathogenesis and progression of NAFLD is complex, with many unanswered questions. We conducted genome-wide association studies (GWASs) using both adult and pediatric participants from the Electronic Medical Records and Genomics (eMERGE) Network to identify novel genetic contributors to this condition. First, natural...
Electronic health records (EHRs) are increasingly used for clinical and translational research through the creation of phenotype algorithms. Currently, phenotype algorithms are most commonly represented as noncomputable descriptive documents and knowledge artifacts that detail the protocols for querying diagnoses, symptoms, procedures, medications, and/or text-driven medical concepts, and are primarily meant for human comprehension. We present desiderata for developing a computable phenotype representation model (PheRM). A team of clinicians...
Manual eligibility screening (ES) for a clinical trial typically requires a labor-intensive review of patient records that utilizes many resources. Leveraging state-of-the-art natural language processing (NLP) and information extraction (IE) technologies, we sought to improve the efficiency of physician decision-making in clinical trial enrollment. In order to markedly reduce the pool of potential candidates for staff screening, we developed an automated ES algorithm to identify patients who meet core characteristics of an oncology...
Geocoding and characterizing geographic, community, and environmental characteristics of study participants is frequently done in epidemiological studies. However, participant addresses are identifiable protected health information (PHI), and geocoding must be conducted in a Health Insurance Portability and Accountability Act-compliant manner. Our objective was to create a software application for this process that addresses the limitations of current approaches. We used a containerization platform to create DeGAUSS (Decentralized...
Abstract Objectives (1) To develop an automated eligibility screening (ES) approach for clinical trials in an urban tertiary care pediatric emergency department (ED); (2) to assess the effectiveness of natural language processing (NLP), information extraction (IE), and machine learning (ML) techniques on real-world clinical data and trials. Data and methods We collected eligibility criteria for 13 randomly selected, disease-specific clinical trials actively enrolling patients between January 1, 2010 and August 31, 2012. In parallel, we...
(1) To evaluate a state-of-the-art natural language processing (NLP)-based approach to automatically de-identify a large set of diverse clinical notes. (2) To measure the impact of de-identification on the performance of information extraction algorithms on the de-identified documents. A cross-sectional study that included 3503 stratified, randomly selected clinical notes (over 22 note types) from five million documents produced at one of the largest US pediatric hospitals. Sensitivity, precision, F value of two automated de-identification systems...
Objective To present a series of experiments: (1) to evaluate the impact of pre-annotation on the speed of manual annotation of clinical trial announcements; and (2) to test for potential bias, if pre-annotation is utilized.
Despite significant advances in knowledge of the genetic architecture of asthma, specific contributors to the variability in burden between populations remain uncovered. To identify additional genetic susceptibility factors of asthma in European American and African American populations. A phenotyping algorithm mining electronic medical records was developed and validated to recruit cases with asthma and control subjects from the Electronic Medical Records and Genomics network. Genome-wide association analyses were performed in pediatric and adult ancestry...
We report the first pediatric specific Phenome-Wide Association Study (PheWAS) using electronic medical records (EMRs). Given the early success of PheWAS in adult populations, we investigated the feasibility of this approach in pediatric cohorts in which associations between a previously known genetic variant and a wide range of clinical or physiological traits were evaluated. Although computationally intensive, this approach has potential to reveal disease mechanistic relationships between a variant and a network of phenotypes. Data on 5049 samples of European...
Objective Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from EHR. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD. Methods...
Background: A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in...
Although electronic health records (EHRs) have the potential to provide a foundation for quality and safety algorithms, few studies have measured their impact on automated adverse event (AE) and medical error (ME) detection within the neonatal intensive care unit (NICU) environment. This paper presents two phenotyping AE and ME detection algorithms (ie, IV infiltrations, narcotic medication oversedation and dosing errors) and describes manual annotation of airway management and medication/fluid AEs from NICU EHRs. From 753 patient...
Summary The objective of this study is to develop an algorithm to accurately identify children with severe early onset childhood obesity (ages 1–5.99 years) using structured and unstructured data from the electronic health record (EHR). Childhood obesity increases risk factors for cardiovascular morbidity and vascular disease. Accurate definition of a high precision phenotype through a standardized tool is critical to the success of large-scale genomic studies and validating rare monogenic variants causing severe early onset obesity. Rule based...
Objective To evaluate a proposed natural language processing (NLP) and machine-learning based automated method to risk stratify abdominal pain patients by analyzing the content of the electronic health record (EHR).
Common variations at the loci harboring the fat mass and obesity gene (FTO), MC4R, and TMEM18 are consistently reported as being associated with body mass index (BMI), especially in the adult population. In order to confirm this effect in a pediatric population, five European ancestry cohorts from the eMERGE-II network (CCHMC-BCH) were evaluated. Data on 5049 samples of European ancestry were obtained from the Electronic Medical Records (EMRs) of two large academic centers in different genotyped cohorts. For all available samples, gender, age, height,...
Abstract Background/Objectives Melanocortin-4 receptor (MC4R) plays an essential role in food intake and energy homeostasis. More than 170 MC4R variants have been described over the past two decades, with conflicting reports regarding the prevalence and phenotypic effects of these variants in diverse cohorts. To determine the frequency of MC4R variants in large cohorts of different ancestries, we evaluated the MC4R coding region for 20,537 eMERGE participants with sequencing data plus additional 77,454 independent individuals with genome-wide genotyping data at...
In this study we implemented and developed state-of-the-art machine learning (ML) and natural language processing (NLP) technologies and built a computerized algorithm for medication reconciliation. Our specific aims are: (1) to develop a computerized algorithm for medication discrepancy detection between patients' discharge prescriptions (structured data) and medications documented in free-text clinical notes (unstructured data); and (2) to assess the performance of the algorithm on real-world medication reconciliation data. We collected prescription lists for all 271 patients...
The current study aims to fill the gap in available healthcare de-identification resources by creating a new sharable dataset with realistic Protected Health Information (PHI) without reducing the value of the data for de-identification research. By releasing the annotated gold standard corpus with a Data Use Agreement we would like to encourage other Computational Linguists to experiment with our data and develop new machine learning models for de-identification. This paper describes: (1) the modifications required by the Institutional Review Board before...
Abstract The electronic Medical Records and Genomics (eMERGE) Network assessed the feasibility of deploying portable phenotype rule-based algorithms with natural language processing (NLP) components added to improve performance of existing algorithms using electronic health records (EHRs). Based on scientific merit and predicted difficulty, eMERGE selected six existing phenotypes to enhance with NLP. We assessed performance, portability, and ease of use. We summarized lessons learned by: (1) challenges; (2) best practices to address challenges based on evidence...