Using Social Media to Help Understand Patient-Reported Health Outcomes of Post–COVID-19 Condition: Natural Language Processing Approach

Normalization; Named Entity Recognition; Sentiment Analysis
DOI: 10.2196/45767 Publication Date: 2023-06-05T06:20:20Z
ABSTRACT
While scientific knowledge of post-COVID-19 condition (PCC) is growing, there remains significant uncertainty in the definition of the disease, its expected clinical course, and its impact on daily functioning. Social media platforms can generate valuable insights into patient-reported health outcomes, as the content is produced at high resolution by patients and caregivers, representing experiences that may be unavailable to most clinicians.

In this study, we aimed to determine the validity and effectiveness of advanced natural language processing approaches built to derive insight into PCC-related patient-reported health outcomes from the social media platforms Twitter and Reddit. We extracted PCC-related terms, including symptoms and conditions, and measured their occurrence frequency. We compared the outputs with human annotations and tracked symptom and condition term occurrences over time and across locations to explore the pipeline's potential as a surveillance tool.

We used bidirectional encoder representations from transformers (BERT) models to extract and normalize PCC symptom and condition terms from English posts on Twitter and Reddit. We compared 2 named entity recognition models and implemented a 2-step normalization task to map the extracted terms to unique concepts in standardized terminology. The normalization steps were done using a semantic search approach with BERT biencoders. We evaluated the effectiveness of the BERT models in extracting the terms using a human-annotated corpus and a proximity-based score. We also compared the validity and reliability of the extracted and normalized terms with a web-based survey of more than 3000 participants from several countries.

UmlsBERT-Clinical had the highest accuracy in predicting entities closest to those extracted by human annotators. Based on our findings, the top 3 most commonly occurring groups of PCC symptom and condition terms were systemic (such as fatigue), neuropsychiatric (such as anxiety and brain fog), and respiratory (such as shortness of breath). In addition, we found novel symptom and condition terms that had not been categorized in previous studies, such as infection and pain. Regarding co-occurring symptoms, the pair of fatigue and headaches was among the most frequently co-occurring term pairs across both platforms. Based on the temporal analysis, the neuropsychiatric terms were the most prevalent, followed by the systemic category, on both social media platforms. Our spatial analysis concluded that 42% (10,938/26,247) of the analyzed terms included location information, with the majority coming from the United States, United Kingdom, and Canada.

The outcome of our social media-derived pipeline is comparable with the results of peer-reviewed articles relevant to PCC symptoms. Overall, this study provides unique insights into patient-reported health outcomes of PCC and valuable information about the patient's journey that can help health care providers anticipate future needs.

International Registered Report Identifier (IRRID): RR2-10.1101/2022.12.14.22283419
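The normalization step described in the abstract, mapping an extracted term to its closest standardized concept by semantic search, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a character-trigram vector as a toy stand-in for a BERT biencoder embedding, a hypothetical four-concept vocabulary in place of a standardized terminology, and an arbitrary similarity threshold of 0.3. The names `embed`, `cosine`, `normalize`, and `CONCEPTS` are illustrative only.

```python
import math
from collections import Counter

# Hypothetical mini-vocabulary standing in for a standardized terminology.
CONCEPTS = ["fatigue", "headache", "anxiety", "shortness of breath"]


def embed(text, n=3):
    """Toy stand-in for a biencoder: a bag of character trigrams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(term, concepts=CONCEPTS, threshold=0.3):
    """Return (closest concept or None, similarity score) for a term.

    The threshold is an assumed cutoff below which the term is
    left unmapped rather than forced onto a poor match.
    """
    query = embed(term)
    scored = [(cosine(query, embed(concept)), concept) for concept in concepts]
    score, best = max(scored)
    return (best, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    for raw in ["headaches", "short of breath", "xyz"]:
        print(raw, "->", normalize(raw))
```

In a pipeline like the one the abstract outlines, `embed` would be replaced by a trained sentence encoder and `CONCEPTS` by the target terminology, while the nearest-neighbor lookup would keep the same shape.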