Son Doan

ORCID: 0000-0002-7284-1306
Research Areas
  • Biomedical Text Mining and Ontologies
  • Data-Driven Disease Surveillance
  • Topic Modeling
  • Text and Document Classification Technologies
  • Advanced Text Analysis Techniques
  • Misinformation and Its Impacts
  • Semantic Web and Ontologies
  • Image Retrieval and Classification Techniques
  • Natural Language Processing Techniques
  • Wikis in Education and Collaboration
  • Sentiment Analysis and Opinion Mining
  • Electronic Health Records Systems
  • Rough Sets and Fuzzy Logic
  • Data Management and Algorithms
  • Mental Health via Writing
  • Data Mining Algorithms and Applications
  • Geographic Information Systems Studies
  • Algorithms and Data Compression
  • Zoonotic diseases and public health
  • Spam and Phishing Detection
  • Public Relations and Crisis Communication
  • Academic integrity and plagiarism
  • Ophthalmology and Visual Health Research
  • Pharmacovigilance and Adverse Drug Reactions
  • Pericarditis and Cardiac Tamponade

Société Française d'Allergologie
2020

National Institute of Informatics
2007-2019

Kaiser Permanente San Diego Medical Center
2019

Kaiser Permanente
2015-2018

University of California, San Diego
2012-2017

Southern California University for Professional Studies
2014

University of Southern California
2014

Vanderbilt University
2010

Vanderbilt University Medical Center
2009

Research Organization of Information and Systems
2008-2009

Medication information is one of the most important types of clinical data in electronic medical records. It is critical for healthcare safety and quality, as well as for research that uses electronic medical record data. However, medication data are often recorded in clinical notes as free-text. As such, they are not accessible to other computerized applications that rely on coded data. We describe a new natural language processing system (MedEx), which extracts medication information from clinical notes. MedEx was initially developed using discharge summaries. An evaluation using a data set of 50...

10.1197/jamia.m3378 article EN Journal of the American Medical Informatics Association 2010-01-01

Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal-coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused...

10.1093/bioinformatics/btn534 article EN Bioinformatics 2008-10-15

This article describes the patient-centered Scalable National Network for Effectiveness Research (pSCANNER), which is part of the recently formed PCORnet, a national network composed of learning healthcare systems and patient-powered research networks funded by the Patient Centered Outcomes Research Institute (PCORI). It is designed to be a stakeholder-governed federated network that uses a distributed architecture to integrate data from three existing networks covering over 21 million patients in all 50 states: (1) VA Informatics...

10.1136/amiajnl-2014-002751 article EN cc-by-nc Journal of the American Medical Informatics Association 2014-04-30

Self-reported patient data has been shown to be a valuable knowledge source for post-market pharmacovigilance. In this paper we propose using the popular micro-blogging service Twitter to gather evidence about adverse drug reactions (ADRs) after firstly having identified micro-blog messages (also known as "tweets") that report first-hand experience. In order to achieve this goal we explore machine learning with data crowdsourced from laymen annotators. With the help of lay annotators recruited from CrowdFlower we manually...

10.1016/j.jbi.2015.11.004 article EN cc-by-nc-nd Journal of Biomedical Informatics 2015-11-07

Twitter messages (tweets) contain various types of topics in our daily life, which include health-related topics. Analysis of health-related tweets would help us understand health conditions and concerns encountered in our daily lives. In this paper we evaluate an approach to extracting causalities from tweets using natural language processing (NLP) techniques. Lexico-syntactic patterns based on dependency parser outputs are used for causality extraction. We focused on three health-related topics: “stress”, “insomnia”, and “headache.” A large dataset...

10.1186/s12911-019-0785-0 article EN cc-by BMC Medical Informatics and Decision Making 2019-04-01

Objective: To develop an automated system to extract medications and related information from discharge summaries as part of the 2009 i2b2 natural language processing (NLP) challenge. This task required accurate recognition of medication name, dosage, mode, frequency, duration, and reason for drug administration. Design: We developed an integrated system using several existing NLP components at Vanderbilt University Medical Center, which included MedEx (to extract medication information), SecTag (a section identification...

10.1136/jamia.2010.003855 article EN Journal of the American Medical Informatics Association 2010-09-01

Delayed diagnosis of Kawasaki disease (KD) may lead to serious cardiac complications. We sought to create and test the performance of a natural language processing (NLP) tool, KD-NLP, in the identification of emergency department (ED) patients for whom the diagnosis of KD should be considered. We developed an NLP tool that recognizes the KD diagnostic criteria based on standard clinical terms and medical word usage using 22 pediatric ED notes augmented by Unified Medical Language System vocabulary. With high suspicion for KD defined as...

10.1111/acem.12925 article EN Academic Emergency Medicine 2016-01-30

10.1016/j.ijmedinf.2009.03.010 article EN International Journal of Medical Informatics 2009-05-16

Extraction of clinical information such as medications or problems from clinical text is an important task of natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective as well. However, combining different classifiers to further improve the performance of clinical entity recognition has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges...

10.1186/1472-6947-12-36 article EN cc-by BMC Medical Informatics and Decision Making 2012-05-07

Systems that exploit publicly available user generated content such as Twitter messages have been successful in tracking seasonal influenza. We developed a novel filtering method for Influenza-Like-Illnesses (ILI)-related messages using 587 million messages from Twitter micro-blogs. We first filtered messages based on syndrome keywords from the BioCaster Ontology, an extant knowledge model of laymen's terms. We then filtered messages according to semantic features such as negation, hashtags, emoticons, humor and geography. The data covered 36 weeks for the US 2009...

10.1109/hisb.2012.21 preprint EN 2012-09-01

Background: Stress is a contributing factor to many major health problems in the United States, such as heart disease, depression, and autoimmune diseases. Relaxation is often recommended in mental health treatment as a frontline strategy to reduce stress, thereby improving health conditions. Objective: The objective of our study was to understand how people express their feelings of stress and relaxation through Twitter messages. Methods: We first performed a qualitative content analysis of 1326 and 781 tweets containing the keywords...

10.2196/publichealth.5939 article EN cc-by JMIR Public Health and Surveillance 2017-06-13

In an era when most of our life activities are digitized and recorded, opportunities abound to gain insights about population health. Online product reviews present a unique data source that is currently underexplored. Health-related information, although scarce, can be systematically mined in online product reviews. Leveraging natural language processing and machine learning tools, we were able to mine 1.3 million grocery product reviews for health-related information. The objectives of the study were as follows: (1) conduct...

10.4137/bii.s37791 article EN Biomedical Informatics Insights 2016-01-01

10.1016/j.jbi.2008.12.009 article EN publisher-specific-oa Journal of Biomedical Informatics 2009-01-02

In a growing interdisciplinary field like biomedical informatics, information dissemination and citation trends are changing rapidly due to many factors. To understand these factors better, we analyzed the evolution of the number of articles per major biomedical informatics topic, download/online view frequencies, and citation patterns (using Web of Science) for articles published from 2009 to 2012 in JAMIA. The number of articles published in JAMIA increased significantly from 2009 to 2012, and there were some topic differences in the last 4 years. Medical Record Systems, Algorithms,...

10.1136/amiajnl-2013-002429 article EN cc-by-nc-nd Journal of the American Medical Informatics Association 2013-11-08

Objective: The 2022 n2c2 NLP Challenge posed identification of social determinants of health (SDOH) in clinical narratives. We present three systems that we developed for the Challenge and discuss the distinctive task formulation used in each of the three systems. Materials and Methods: The first system identifies target pieces of information independently using machine learning classifiers. The second system uses a large language model (LLM) to extract complete structured outputs per document. The third system extracts candidate phrases and relations with...

10.48550/arxiv.2301.11386 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Identifying serious infectious disease outbreaks in their early stages is an important task, both for national governments and international organizations like the World Health Organization. Text mining and information extraction systems can provide an important, low cost and timely early warning system in these circumstances by identifying the first signs of an outbreak automatically from online textual news. One interesting characteristic of disease outbreak reports, which to the best of our knowledge has not been studied before, is the use...

10.3115/1572364.1572384 article EN 2009-01-01

The database of genotypes and phenotypes (dbGaP) developed by the National Center for Biotechnology Information (NCBI) is a resource that contains information on various genome-wide association studies (GWAS) and is currently available via NCBI's dbGaP Entrez interface. The database is an important resource, providing GWAS data that can be used for new exploratory research or cross-study validation by authorized users. However, finding studies relevant to a particular phenotype of interest is challenging, as phenotype information is presented in a non-standardized...

10.1136/amiajnl-2013-001882 article EN cc-by-nc-nd Journal of the American Medical Informatics Association 2013-08-30