NFDI4DS | UHH-SEMS - Publication Details

Hong Yu

ORCID: 0000-0001-9263-5035

About

Contact & Profiles

A5017601806

Research Areas

Topic Modeling
Biomedical Text Mining and Ontologies
Natural Language Processing Techniques
Machine Learning in Healthcare
Advanced Text Analysis Techniques
Semantic Web and Ontologies
Multimodal Machine Learning Applications
Health Literacy and Information Accessibility
Electronic Health Records Systems
Health Sciences Research and Education
Text Readability and Simplification
Pharmacovigilance and Adverse Drug Reactions
Domain Adaptation and Few-Shot Learning
Genetics, Bioinformatics, and Biomedical Research
Artificial Intelligence in Healthcare
Sentiment Analysis and Opinion Mining
Data Quality and Management
Intelligent Tutoring Systems and Adaptive Learning
Mobile Health and mHealth Applications
Chronic Disease Management Strategies
Social Media in Health Education
Imbalanced Data Classification Techniques
Text and Document Classification Technologies
Genomics and Phylogenetic Studies
Computational Drug Discovery Methods

University of Massachusetts Lowell
2017-2025

Amherst College
2017-2025

University of Massachusetts Amherst
2016-2025

Shanghai Electric (China)
2024-2025

VA New England Healthcare System
2021-2025

University of Massachusetts Chan Medical School
2015-2024

Chongqing University of Posts and Telecommunications
2024

United States Department of Veterans Affairs
2022-2024

UMass Memorial Medical Center
2023-2024

Edith Nourse Rogers Memorial Veterans Hospital
2016-2023

Towards answering opinion questions

OPENALEX - Publications

Hong Yu Vasileios Hatzivassiloglou

Opinion question answering is a challenging task for natural language processing. In this paper, we discuss a necessary component for an opinion question answering system: separating opinions from fact, at both the document and sentence level. We present a Bayesian classifier for discriminating between documents with a preponderance of opinions, such as editorials, and regular news stories, and describe three unsupervised, statistical techniques for the significantly harder task of detecting opinions at the sentence level. We also present a first model for classifying opinion sentences as positive or negative in...

10.3115/1119355.1119372 article EN Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing 2003-01-01

Bidirectional RNN for Medical Event Detection in Electronic Health Records

Abhyuday Jagannatha Hong Yu

Sequence labeling for extraction of medical events and their attributes from unstructured text in Electronic Health Record (EHR) notes is a key step towards semantic understanding of EHRs. It has important applications in health informatics, including pharmacovigilance and drug surveillance. The state-of-the-art supervised machine learning models in this domain are based on Conditional Random Fields (CRFs) with features calculated from fixed context windows. In this application, we explored recurrent neural network...

10.18653/v1/n16-1056 article EN Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

AskHERMES: An online question answering system for complex clinical questions

Yong-gang Cao Feifan Liu Pippa Simpson Lamont Antieau Andrew Bennett and 3 more

10.1016/j.jbi.2011.01.004 article EN publisher-specific-oa Journal of Biomedical Informatics 2011-01-22

Structured prediction models for RNN based sequence labeling in clinical text

Abhyuday Jagannatha Hong Yu

Sequence labeling is a widely used method for named entity recognition and information extraction from unstructured natural language data. In the clinical domain, one major application of sequence labeling involves extraction of medical entities such as medication, indication, and side-effects from Electronic Health Record narratives. Sequence labeling in this domain presents its own set of challenges and objectives. In this work we experimented with various CRF based structured learning models with Recurrent Neural Networks. We extend the previously studied...

10.18653/v1/d16-1082 preprint EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

Fine-Tuning Bidirectional Encoder Representations From Transformers (BERT)–Based Models on Large-Scale Electronic Health Record Notes: An Empirical Study

Fei Li Yonghao Jin Weisong Liu Bhanu Pratap Singh Rawat Pengshan Cai and 1 more

Background: The bidirectional encoder representations from transformers (BERT) model has achieved great success in many natural language processing (NLP) tasks, such as named entity recognition and question answering. However, little prior work has explored this model for an important task in the biomedical and clinical domains, namely entity normalization. Objective: We aim to investigate the effectiveness of BERT-based models for biomedical or clinical entity normalization. In addition, our second objective is to investigate whether the domains of training data influence...

10.2196/14830 article EN cc-by JMIR Medical Informatics 2019-09-12

ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network

Fei Li Hong Yu

Automated ICD coding, which assigns the International Classification of Disease codes to patient visits, has attracted much research attention since it can save time and labor for billing. The previous state-of-the-art model utilized one convolutional layer to build document representations for predicting ICD codes. However, the lengths and grammar of text fragments, which are closely related to ICD coding, vary a lot in different documents. Therefore, a flat and fixed-length convolutional architecture may not be capable of learning good document representations....

10.1609/aaai.v34i05.6331 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Mental-LLM

Xuhai Xu Bingsheng Yao Yuanzhe Dong Saadia Gabriel Hong Yu and 4 more

Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research when it comes to understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present a comprehensive evaluation of multiple LLMs on various mental health prediction tasks via online text data, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4. We conduct a broad range of experiments, covering zero-shot prompting, few-shot prompting, and instruction fine-tuning. The...

10.1145/3643540 article EN Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies 2024-03-06

Unveiling GPT-4V's hidden challenges behind high accuracy on USMLE questions: Observational Study

Zhichao Yang Zonghai Yao Mahbuba Tasmin Parth Vashisht Won Seok Jang and 5 more

Recent advancements in artificial intelligence, such as GPT-3.5 Turbo (OpenAI) and GPT-4, have demonstrated significant potential by achieving good scores on text-only United States Medical Licensing Examination (USMLE) exams and effectively answering questions from physicians. However, the ability of these models to interpret medical images remains underexplored. This study aimed to comprehensively evaluate the performance, interpretability, and limitations of GPT-3.5 Turbo, GPT-4, and its successor, GPT-4 Vision (GPT-4V),...

10.2196/65146 article EN cc-by Journal of Medical Internet Research 2025-02-07

GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data

Andrey Rzhetsky Ivan Iossifov Tomohiro Koike Michael Krauthammer Pauline Kra and 7 more

10.1016/j.jbi.2003.10.001 article EN publisher-specific-oa Journal of Biomedical Informatics 2003-12-05

Comparison of Cellular Binding and Uptake of Antisense Phosphodiester, Phosphorothioate, and Mixed Phosphorothioate and Methylphosphonate Oligonucleotides

Qiuyan Zhao Sara Matson Charles Herrera Eric F. Fisher Hong Yu and 1 more

The effects of phosphorothioate (S-oligonucleotide) or terminal phosphorothioate-phosphodiester (S-O-oligonucleotides) and methylphosphonate-phosphodiester (MP-O-oligonucleotides) modifications on mouse spleen cell surface binding, uptake, and degradation were studied using fluorescein (FITC)-conjugated oligonucleotides. S-oligonucleotides had the highest cell surface binding, followed by S-O-, O-, and MP-O-oligonucleotides. Competition studies indicated that S-oligonucleotides have an increased affinity for membrane...

10.1089/ard.1993.3.53 article EN Antisense Research and Development 1993-01-01

TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records

Zhichao Yang Avijit Mitra Weisong Liu Dan R. Berlowitz Hong Yu

Abstract: Deep learning transformer-based models using longitudinal electronic health records (EHRs) have shown great success in prediction of clinical diseases or outcomes. Pretraining on a large dataset can help such models map the input space better and boost their performance on relevant tasks through finetuning with limited data. In this study, we present TransformEHR, a generative encoder-decoder model with transformer that is pretrained using a new pretraining objective: predicting all diseases and outcomes of a patient at...

10.1038/s41467-023-43715-z article EN cc-by Nature Communications 2023-11-29

Associations Between Natural Language Processing–Enriched Social Determinants of Health and Suicide Death Among US Veterans

Avijit Mitra Richeek Pradhan Rachel Melamed Kun Chen David C. Hoaglin and 6 more

Importance: Social determinants of health (SDOHs) are known to be associated with increased risk of suicidal behaviors, but few studies use SDOHs from unstructured electronic health record notes. Objective: To investigate associations between veterans’ death by suicide and recent SDOHs, identified using structured and unstructured data. Design, Setting, and Participants: This nested case-control study included veterans who received care under the US Veterans Health Administration from October 1, 2010, to September 30, 2015. A...

10.1001/jamanetworkopen.2023.3079 article EN cc-by-nc-nd JAMA Network Open 2023-03-15

Mental-LLM: Leveraging Large Language Models for Mental Health Prediction via Online Text Data

Xuhai Xu Bingsheng Yao Yuanzhe Dong Saadia Gabriel Hong Yu and 4 more

Advances in large language models (LLMs) have empowered a variety of applications. However, there is still a significant gap in research when it comes to understanding and enhancing the capabilities of LLMs in the field of mental health. In this work, we present the first comprehensive evaluation of multiple LLMs, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4, on various mental health prediction tasks via online text data. We conduct a broad range of experiments, covering zero-shot prompting, few-shot prompting, and instruction...

10.48550/arxiv.2307.14385 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Mapping Abbreviations to Full Forms in Biomedical Articles

Hong Yu

Objective: To develop methods that automatically map abbreviations to their full forms in biomedical articles. Methods: The authors developed two methods of mapping defined and undefined abbreviations (defined abbreviations are paired with their full forms in the articles, whereas undefined ones are not). For defined abbreviations, they developed a set of pattern-matching rules to map an abbreviation to its full form and implemented the rules into a software program, AbbRE (for "abbreviation recognition and extraction"). Using the opinions of domain experts as the reference standard, they evaluated the recall and precision of AbbRE for ten...

10.1197/jamia.m0913 article EN Journal of the American Medical Informatics Association 2002-05-01

Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion

Shashank Agarwal Hong Yu

Abstract: Biomedical texts can be typically represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then...

10.1093/bioinformatics/btp548 article EN Bioinformatics 2009-09-25

Development, implementation, and a cognitive evaluation of a definitional question answering system for physicians

Hong Yu Minsuk Lee David R. Kaufman John Ely Jerome A. Osheroff and 2 more

10.1016/j.jbi.2007.03.002 article EN publisher-specific-oa Journal of Biomedical Informatics 2007-03-31

Biomedical negation scope detection with conditional random fields

Shashank Agarwal Hong Yu

Objective: Negation is a linguistic phenomenon that marks the absence of an entity or event. Negated events are frequently reported in both biological literature and clinical notes. Text mining applications benefit from the detection of negation and its scope. However, due to the complexity of language, identifying the scope of negation in a sentence is not a trivial task. Design: Conditional random fields (CRF), a supervised machine-learning algorithm, were used to train models to detect negation cue phrases and their scope in both biological literature and clinical notes. The models were trained on the publicly available...

10.1136/jamia.2010.003228 article EN Journal of the American Medical Informatics Association 2010-10-20

The biomedical discourse relation bank

Rashmi Prasad Susan McRoy Nadya Frid Aravind K. Joshi Hong Yu

Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource. We have developed the Biomedical Discourse Relation Bank (BioDRB), in which we have annotated explicit and implicit discourse relations in 24 open-access full-text biomedical articles from the GENIA corpus. Guidelines for the annotation were adapted from the Penn...

10.1186/1471-2105-12-188 article EN cc-by BMC Bioinformatics 2011-05-23

Clinical Relation Extraction Toward Drug Safety Surveillance Using Electronic Health Record Narratives: Classical Learning Versus Deep Learning

Tsendsuren Munkhdalai Feifan Liu Hong Yu

Medication and adverse drug event (ADE) information extracted from electronic health record (EHR) notes can be a rich resource for drug safety surveillance. Existing observational studies have mainly relied on structured EHR data to obtain ADE information; however, ADEs are often buried in the EHR narratives and not recorded in structured data. To unlock ADE-related information from EHR narratives, there is a need to extract relevant entities and identify relations among them. In this study, we focus on relation identification. This study aimed...

10.2196/publichealth.9361 article EN cc-by JMIR Public Health and Surveillance 2018-04-25

Extraction of Information Related to Adverse Drug Events from Electronic Health Record Notes: Design of an End-to-End Model Based on Deep Learning

Fei Li Weisong Liu Hong Yu

Pharmacovigilance and drug-safety surveillance are crucial for monitoring adverse drug events (ADEs), but the main ADE-reporting systems, such as the Food and Drug Administration Adverse Event Reporting System, face challenges such as underreporting. Therefore, as complementary surveillance, data on ADEs are extracted from electronic health record (EHR) notes via natural language processing (NLP). As NLP develops, many up-to-date machine-learning techniques are introduced in this field, such as deep learning and multi-task learning (MTL)....

10.2196/12159 article EN cc-by JMIR Medical Informatics 2018-11-09
