- Biomedical Text Mining and Ontologies
- Long-Term Effects of COVID-19
- COVID-19 Clinical Research Studies
- Bioinformatics and Genomic Networks
- Genomics and Rare Diseases
- COVID-19 and Mental Health
- Software Engineering Research
- Gene expression and cancer classification
- Software System Performance and Reliability
- Topic Modeling
- Advanced Graph Neural Networks
- Semantic Web and Ontologies
- Computational Drug Discovery Methods
- Metabolism, Diabetes, and Cancer
- Psychosomatic Disorders and Their Treatments
- Software Testing and Debugging Techniques
- Genomics and Chromatin Dynamics
- Natural Language Processing Techniques
- Multiple Sclerosis Research Studies
- Inflammasome and immune disorders
- Imbalanced Data Classification Techniques
- Carbohydrate Chemistry and Synthesis
- Lysosomal Storage Disorders Research
- Teaching and Learning Programming
- Plant Micronutrient Interactions and Effects
Jackson Laboratory
2018-2025
University of Massachusetts Amherst
2000-2016
University of Rochester
1996
Stony Brook University
1987
The Human Phenotype Ontology (HPO), a standardized vocabulary of phenotypic abnormalities associated with 7000+ diseases, is used by thousands of researchers, clinicians, informaticians and electronic health record systems around the world. Its detailed descriptions of clinical abnormalities and computable disease definitions have made HPO the de facto standard for deep phenotyping in the field of rare disease. HPO's interoperability with other ontologies has enabled it to be used to improve diagnostic accuracy by incorporating model organism...
In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven’t been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on...
Stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, long COVID is incompletely understood and characterised by a wide range of manifestations that are difficult to analyse computationally. Additionally, the generalisability of machine learning classification of COVID-19 outcomes has rarely been tested.
The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English...
Integrated, up-to-date data about SARS-CoV-2 and COVID-19 is crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically for different tasks; the optimal data for a machine learning task, for example, is much different from the data used to populate a browsable user interface for clinicians. To address these...
Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use.
Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful for assessing associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases, whose removal may introduce severe bias. Several multiple imputation algorithms have been proposed to attempt to recover the missing information under an assumed missingness mechanism. Each algorithm presents strengths and weaknesses, and there is currently no consensus on...
Accurate stratification of patients with post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies. However, the natural history of long COVID is incompletely understood and characterized by an extremely wide range of manifestations that are difficult to analyze computationally. In addition, the generalizability of machine learning classification of COVID-19 outcomes has rarely been tested. We present a method for computationally modeling PASC...
We created an Eclipse plug-in called FrenchPress that partially automates the task of giving students feedback on their Java programs. It is designed not for novices but for students taking a second or third Java course: students who know enough to write a working program but lack the judgment to recognize bad code when they see it. FrenchPress does not diagnose compile-time or runtime errors, or logical errors that produce incorrect output. It targets silent flaws, flaws the student is unable to identify for himself because nothing in the programming environment alerts him....
Background: Defects in the glycosylphosphatidylinositol (GPI) biosynthesis pathway can result in a group of congenital disorders of glycosylation known as the inherited GPI deficiencies (IGDs). To date, defects in 22 of the 29 genes in the GPI biosynthesis pathway have been identified in IGDs. The early phase of the biosynthetic pathway assembles the GPI anchor (Synthesis stage) and the late phase transfers the GPI anchor to a nascent peptide in the endoplasmic reticulum (ER) (Transamidase stage), stabilizes the anchor in the ER membrane using fatty acid remodeling and then traffics the GPI-anchored protein to the cell...
Studies suggest that metformin is associated with reduced COVID-19 severity in individuals with diabetes compared to other antihyperglycemics. We assessed if metformin is associated with reduced incidence of severe COVID-19 for patients with prediabetes or polycystic ovary syndrome (PCOS), common diseases that increase the risk of severe COVID-19. This observational, retrospective study utilized EHR data from 52 hospitals for patients with prediabetes or PCOS treated with metformin or levothyroxine/ondansetron (controls). After balancing via inverse probability score weighting, associations were assessed by logistic...
Embeddings are semantically meaningful representations of words in a vector space, commonly used to enhance downstream machine learning applications. Traditional biomedical embedding techniques often replace all synonymous words representing biological or medical concepts with a unique token, ensuring consistent representation and improving embedding quality. However, the potential impact of replacing non-biomedical concept synonyms has received less attention. Embedding approaches employ replacement that span...
Background: Non-steroidal anti-inflammatory drugs (NSAIDs) are commonly used to reduce pain, fever, and inflammation but have been associated with complications in community-acquired pneumonia. Observations shortly after the start of the COVID-19 pandemic in 2020 suggested that ibuprofen was associated with an increased risk of adverse events in COVID-19 patients, but subsequent observational studies failed to demonstrate increased risk and in one case showed reduced risk associated with NSAID use. Methods: A 38-center retrospective cohort study was performed that leveraged...
Textual data mining of service center call records. Authors: Pang-Ning Tan (Department of Computer Science, University of Minnesota, Minneapolis, MN), Hannah Blau (University of Massachusetts, Amherst, MA), Steve Harp (Honeywell Technology Center, 3660 Technology Drive), Robert Goldman. KDD '00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, August 2000, pages 417–423. https://doi.org/10.1145/347090.347177 Published: 01...
Target enrichment combined with chromosome conformation capturing methodologies such as capture Hi-C (CHC) can be used to investigate spatial layouts of genomic regions with high resolution and at scalable costs. A common application of CHC is the investigation of regulatory elements that are in contact with promoters, but CHC can be used for a range of other applications. Therefore, probe design needs to be adapted to experimental needs, but no flexible tool is currently available for this purpose. We present a Java desktop application called GOPHER...
Navigating the clinical literature to determine the optimal clinical management for rare diseases presents significant challenges. We introduce the Medical Action Ontology (MAxO), an ontology specifically designed to organize medical procedures, therapies, and interventions.
Recent proposals to apply data mining systems to problems in law enforcement, national security, and fraud detection have attracted both media attention and technical critiques of their expected accuracy and impact on privacy. Unfortunately, the majority of technical critiques have been based on simplistic assumptions about data, classifiers, inference procedures, and the overall architecture of such systems. We consider these critiques in detail, and we construct a simulation model that more closely matches realistic systems. We show how the privacy impact of a hypothetical system could...
Integrated, up-to-date data about SARS-CoV-2 and coronavirus disease 2019 (COVID-19) is crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community varies drastically for different tasks: the optimal data for a machine learning task, for example, is much different from the data used to populate a browsable user...
Acute COVID-19 infection can be followed by diverse clinical manifestations referred to as Post Acute Sequelae of SARS-CoV2 Infection (PASC). Studies have shown an increased risk of being diagnosed with new-onset psychiatric disease following a diagnosis of acute COVID-19. However, it was unclear whether non-psychiatric PASC-associated manifestations (PASC-AMs) are associated with an increased risk of new-onset psychiatric disease. A retrospective electronic health record (EHR) cohort study of 2,391,006 individuals was performed to evaluate whether PASC-AMs are associated with new-onset psychiatric disease. Data were...
We identified a de novo heterozygous transient receptor potential cation channel subfamily M (melastatin) member 3 (
Background: COVID-19 has been shown to increase the risk of adverse mental health consequences. A recent electronic health record (EHR)-based observational study showed an almost two-fold increased risk of new-onset mental illness in the first 90 days following a diagnosis of acute COVID-19. Methods: We used the National COVID Cohort Collaborative, a harmonized EHR repository with 2,965,506 COVID-19 positive patients, and compared cohorts of COVID-19 patients with comparable controls. Patients were propensity score-matched to control for...
Inhibiting protein kinases (PKs) that cause cancers has been an important topic in cancer therapy for years. So far, almost 8% of >530 PKs have been targeted by FDA-approved medications, and around 150 protein kinase inhibitors (PKIs) have been tested in clinical trials. We present an approach based on natural language processing and machine learning to investigate the relations between PKs and cancers, predicting PKs whose inhibition would be efficacious to treat a certain cancer. Our approach represents PKs and cancers as semantically meaningful...
We created an Eclipse plug-in called FrenchPress that offers students feedback on their Java programming style. It is designed not for novices but for students taking a second or third Java course. Advanced beginners know enough to produce a program with the desired input/output behavior, but fail to understand that it could still be poorly written. Large class sizes in introductory programming courses make it difficult for instructors to give students individualized attention. FrenchPress automates a small subset of the feedback students might have received from educators. The system...