- Biomedical Text Mining and Ontologies
- Machine Learning in Healthcare
- Topic Modeling
- Caching and Content Delivery
- Advanced Data Storage Technologies
- Health Systems, Economic Evaluations, Quality of Life
- Genetics, Bioinformatics, and Biomedical Research
- Scientific Computing and Data Management
- Peer-to-Peer Network Technologies
- Treatment of Major Depression
- Cloud Computing and Resource Management
- Data Quality and Management
- Pharmaceutical Practices and Patient Outcomes
- Distributed and Parallel Computing Systems
- Mental Health via Writing
- Human Pose and Action Recognition
- Computational Drug Discovery Methods
- Traditional Chinese Medicine Studies
- Bioinformatics and Genomic Networks
- Cardiovascular Health and Risk Factors
- Tuberculosis Research and Epidemiology
- Suicide and Self-Harm Studies
- Mental Health Research Topics
- Data Visualization and Analytics
- Genetic Associations and Epidemiology
Oak Ridge National Laboratory
2016-2025
Government of the United States of America
2023
University of Utah
2023
University of Louisville
2016-2018
The increasing availability of electronic health record (EHR) systems has created enormous potential for translational research. However, it is difficult to know all the relevant codes related to a phenotype due to the large number of codes available. Traditional data mining approaches often require the use of patient-level data, which hinders the ability to share data across institutions. In this project, we demonstrate that multi-center large-scale code embeddings can be used to efficiently identify features related to a disease of interest. We...
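As a rough illustration of how code embeddings might surface features related to a disease of interest, the sketch below ranks codes by cosine similarity to a target code. It is not the authors' method: the function names, the code labels, and the three-dimensional toy vectors are all hypothetical, and real embeddings would be learned from multi-center summary data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_related_codes(target, embeddings, top_k=3):
    """Return the top_k codes most similar to `target`, best first."""
    target_vec = embeddings[target]
    scores = [
        (code, cosine(target_vec, vec))
        for code, vec in embeddings.items()
        if code != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Hypothetical toy embeddings; the labels and values are made up for illustration.
toy_embeddings = {
    "DX:depression": [0.9, 0.1, 0.0],
    "RX:sertraline": [0.8, 0.2, 0.1],
    "PROC:psychotherapy": [0.7, 0.3, 0.0],
    "DX:ankle_fracture": [0.0, 0.1, 0.9],
}

related = rank_related_codes("DX:depression", toy_embeddings, top_k=2)
for code, score in related:
    print(f"{code}: {score:.3f}")
```

Because only the embedding vectors are compared, a ranking like this needs no patient-level records, which is the property the abstract highlights for sharing across institutions.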
The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping concept codes to enable translations between healthcare...
Motivation: Predicting molecule–disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule–molecule, molecule–disease and disease–disease semantic dependencies can potentially improve prediction performance. Methods: We introduce a Multi-Modal REpresentation Mapping Approach to molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP...
Objectives: To demonstrate an innovative method combining machine learning with comparative effectiveness research techniques and to investigate a hitherto unstudied question about the effectiveness of common prescribing patterns. Data Sources: United States Veterans Health Administration Corporate Data Warehouse. Study Design: For Operation Enduring Freedom/Operation Iraqi Freedom veterans with major depressive disorder, we generate pharmacotherapy pathways (of antidepressants) using process mining and machine learning. We...
Background: To discover pharmacotherapy prescription patterns and their statistical associations with outcomes through a clinical pathway inference framework applied to real-world data. Methods: We apply machine learning steps in our framework using a 2006 to 2020 cohort of veterans with major depressive disorder (MDD). Outpatient antidepressant pharmacy fills, dispensed inpatient medications, emergency department visits, self-harm, and all-cause mortality data were extracted from the Department of Veterans...
Disk I/O is a major bottleneck limiting the performance and scalability of data-intensive applications. A common way to address disk I/O bottlenecks is to use parallel storage systems that exploit the concurrent operation of independent storage components; however, achieving consistently high parallel I/O performance is challenging due to static configurations. Modern parallel storage systems, especially in the cloud, enterprise data centers, and scientific clusters, are commonly shared by various applications generating dynamic and coexisting data access patterns. Nonetheless, these...
The Department of Energy's (DOE) Atmospheric Radiation Measurement (ARM) Climate Research Facility is establishing an adaptive data services and operations architecture in support of the Next-Generation ARM Facility, as explained in its Decadal Vision. In this paper, we describe the capabilities of the ARM Data Center (ADC) and the upcoming high-performance computing infrastructure of the ARM Facility.
Background: Electronic health records (EHR) contain vast data in codified and narrative forms, covering thousands of clinical features available for research and clinical care. The complexity of EHR data presents challenges in feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale comprehensive knowledge graph (KG) of EHR features. Methods: ARCH first processes EHR data into...
As a result of the immense growth of digital data in the last decade, energy consumption has become an important issue in data storage systems. In the US alone, data centers were projected to consume $4 billion (40 TWh) of electricity yearly in 2005. This cost had reached $10 billion (100 TWh) in 2011, and is expected to be around $20 billion (200 TWh) in 2016 by doubling itself every 5 years. In addition to the economic burden on companies and research institutions, these large-scale data storage systems also have a negative impact on the environment. According to the EPA, generating 1 kWh results...
Storage performance bottlenecks are one of the major threats limiting the scalability of I/O-intensive applications. Parallel storage systems have the potential to alleviate these bottlenecks through the concurrent operation of independent storage components if a parallelism-aware data layout can be continuously guaranteed. Existing parallel storage systems use a one-layout-fits-all data placement strategy that frequently results in sub-optimal I/O parallelism. Guided by association rule mining, graph coloring, bin packing, and network flow techniques, this paper...
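To make the graph-coloring idea concrete, here is a minimal sketch, assuming that blocks frequently accessed together are joined by an edge and that each "color" corresponds to one disk. It covers only a greedy coloring step and is not the paper's algorithm; `greedy_layout`, the block names, and the edge list are hypothetical.

```python
def greedy_layout(co_access_edges, num_disks):
    """Assign blocks to disks so co-accessed blocks land on different disks when possible."""
    # Build an undirected co-access graph.
    neighbors = {}
    for a, b in co_access_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    placement = {}
    # Place the most-connected blocks first (ties broken by name for determinism).
    for block in sorted(neighbors, key=lambda n: (-len(neighbors[n]), n)):
        used = {placement[n] for n in neighbors[block] if n in placement}
        free = [d for d in range(num_disks) if d not in used]
        if free:
            placement[block] = free[0]
        else:
            # No conflict-free disk left: pick the disk shared with the fewest neighbors.
            conflicts = [
                sum(1 for n in neighbors[block] if placement.get(n) == d)
                for d in range(num_disks)
            ]
            placement[block] = conflicts.index(min(conflicts))
    return placement

# Hypothetical co-access pairs, e.g. as might be produced by mining an I/O trace.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
layout = greedy_layout(edges, num_disks=3)
print(layout)
```

With three disks, every co-accessed pair in this toy graph ends up on different disks, so the pair can be read concurrently. With fewer disks than the graph needs, the fallback branch accepts some conflicts instead of failing.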
The Atmospheric Radiation Measurement (ARM) Climate Research Facility (www.arm.gov) provides atmospheric observations from diverse climatic regimes around the world. Currently, ARM archives over 22 million user-accessible data files, primarily stored in NetCDF file format, with total volumes close to one petabyte. In this paper, we will discuss how ARM is currently storing, distributing, cataloging and visualizing such large volumes of multi-dimensional climate model data, and also describe their future plan.
Reducing suicide incidence among US veterans is one of the highest priorities for the Department of Veterans Affairs (VA). We are implementing a risk detection system, in collaboration with the VA, that would serve as a surveillance system for risk factors appearing in clinical text data. Primary requirements of this system are fast search capability, feature and information extraction, and delivery of data to up-stream natural language processing models. As such, we are evaluating scalable data storage solutions on the basis of performance, fault...
Objective: To discover pharmacotherapy prescription patterns and their statistical associations with outcomes through a clinical pathway inference framework applied to real-world data. Materials and Methods: We apply machine learning steps in our framework using a 2006 to 2020 cohort of veterans with major depressive disorder (MDD). Outpatient antidepressant pharmacy, emergency department visits, self-harm, and all-cause mortality data were extracted from the Department of Veterans Affairs Corporate Data...
Objective: The increasing availability of Electronic Health Record (EHR) systems has created enormous potential for translational research. Even with a working knowledge of EHR, it is difficult to know all the relevant codes related to a phenotype due to the large number of codes available. Traditional data mining approaches often require the use of patient-level data, which hinders the ability to share data across institutions and establish a cooperative and integrated data network. In this project, we demonstrate that multi-center...
To improve clinical care practice, it is important to understand the variability of clinical pathways executed in different contexts (e.g., geographical locations, demographics, and phenotypic groups). A common way of representing clinical pathways is through network-based representations that capture trajectories of treatment steps. However, first-order networks, which are based on the Markovian property and are the de facto standard model to represent transitions between steps, often fail to capture real trajectories. This paper introduces a visual...
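The limitation of first-order networks can be seen in a small sketch that counts transitions conditioned on the previous one step versus the previous two steps. This is an illustrative assumption about how such counts might be computed, not the paper's implementation; `transition_counts` and the two toy pathways are hypothetical.

```python
from collections import Counter, defaultdict

def transition_counts(pathways, order=1):
    """Count next-step occurrences conditioned on the previous `order` steps."""
    counts = defaultdict(Counter)
    for path in pathways:
        for i in range(order, len(path)):
            context = tuple(path[i - order:i])
            counts[context][path[i]] += 1
    return counts

# Hypothetical treatment pathways, for illustration only.
pathways = [
    ["SSRI", "SNRI", "TCA"],
    ["bupropion", "SNRI", "SSRI"],
]

first = transition_counts(pathways, order=1)
second = transition_counts(pathways, order=2)

# First-order: after SNRI, both TCA and SSRI look possible, regardless of history.
print(dict(first[("SNRI",)]))
# Second-order: patients who came from SSRI only ever moved on to TCA.
print(dict(second[("SSRI", "SNRI")]))
```

A first-order network built from these counts would allow the trajectory SSRI, SNRI, SSRI, which never occurs in the data. Conditioning on two steps rules it out, at the cost of a larger state space.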
Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to...
The process of identifying a cohort of interest is a very challenging task. It requires manually inspecting many patient records with complex structure that might include medical coding errors and missing data. This paper presents a computational pipeline for refining the cohort selection based on medical concepts recorded in electronic health records (EHRs). The pipeline extracts EHR data for a given cohort and normalizes this data using standard vocabularies. Then a stacked denoising autoencoder is used to embed the normalized patient vectors in a low-dimensional space, where...