Twitter mining using semi-supervised classification for relevance filtering in syndromic surveillance

Relevance
DOI: 10.1371/journal.pone.0210689 Publication Date: 2019-07-18T17:28:59Z
ABSTRACT
We investigate the use of Twitter data to deliver signals for syndromic surveillance in order to assess its ability to augment existing syndromic surveillance efforts and give a better understanding of symptomatic people who do not seek healthcare advice directly. We focus on a specific syndrome: asthma/difficulty breathing. We outline data collection using the Twitter streaming API as well as the analysis and pre-processing of the collected data. Even with keyword-based collection, many of the tweets collected are not relevant because they represent chatter, or talk of awareness, instead of an individual suffering a particular condition. In light of this, we set out to identify relevant tweets in order to collect a strong and reliable signal. For this, we investigate text classification techniques, focusing on semi-supervised classification techniques since they enable us to use more of the Twitter data collected while only doing very minimal labelling. In this paper, we propose a semi-supervised approach to symptomatic tweet classification and relevance filtering. We also propose alternative techniques to popular deep learning approaches. Additionally, we highlight the use of emojis and other special features capturing the tweet's tone to improve classification performance. Our results show that negative emojis and those that denote laughter provide the best classification performance in conjunction with a simple word-level n-gram approach. We obtain good performance in classifying symptomatic tweets with both supervised and semi-supervised algorithms, and found that the proposed semi-supervised algorithms preserve more of the relevant tweets and may be advantageous in the context of a weak signal. Finally, we found some correlation (r = 0.414, p = 0.0004) between the Twitter signal generated with the semi-supervised system and data from consultations for related health conditions.
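As a rough illustration of the relevance-filtering idea in the abstract, the sketch below combines word-level n-grams with emoji indicator features and wraps a small classifier in a self-training loop. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the classifier or the semi-supervised scheme, so multinomial naive Bayes and self-training are stand-ins, and the emoji sets, function names, threshold and example tweets are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical emoji groups; the sets actually used in the paper are not given here.
NEGATIVE_EMOJIS = {"😷", "😫", "😩", "😢"}
LAUGHTER_EMOJIS = {"😂", "🤣"}


def featurize(text, n_max=2):
    """Bag of word-level 1..n_max-grams plus two emoji indicator features."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    if any(ch in NEGATIVE_EMOJIS for ch in text):
        feats["<neg_emoji>"] += 1
    if any(ch in LAUGHTER_EMOJIS for ch in text):
        feats["<laugh_emoji>"] += 1
    return feats


def train_nb(samples):
    """Fit multinomial naive Bayes counts; samples is a list of (feats, label)."""
    class_counts, feat_counts, vocab = Counter(), {}, set()
    for feats, label in samples:
        class_counts[label] += 1
        feat_counts.setdefault(label, Counter()).update(feats)
        vocab.update(feats)
    return class_counts, feat_counts, vocab


def predict_proba(model, feats):
    """Return {label: posterior probability}, using add-one smoothing."""
    class_counts, feat_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, count in class_counts.items():
        denom = sum(feat_counts[label].values()) + len(vocab)
        score = math.log(count / total)
        for f, c in feats.items():
            if f in vocab:  # features never seen in training are ignored
                score += c * math.log((feat_counts[label][f] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


def self_train(labelled, unlabelled, threshold=0.9, max_rounds=5):
    """Self-training: move confidently predicted tweets into the training set."""
    train = [(featurize(t), y) for t, y in labelled]
    pool = [featurize(t) for t in unlabelled]
    for _ in range(max_rounds):
        model = train_nb(train)
        confident, remaining = [], []
        for feats in pool:
            probs = predict_proba(model, feats)
            label = max(probs, key=probs.get)
            bucket = confident if probs[label] >= threshold else remaining
            bucket.append((feats, label))
        if not confident:
            break  # nothing new to learn from
        train.extend(confident)
        pool = [feats for feats, _ in remaining]
    return train_nb(train)


# Made-up example tweets, for illustration only.
labelled = [
    ("my asthma is bad tonight cant breathe 😫", "relevant"),
    ("wheezing again need my inhaler 😷", "relevant"),
    ("asthma awareness week starts today", "irrelevant"),
    ("laughing so hard i cant breathe 😂", "irrelevant"),
]
unlabelled = ["cant breathe need my inhaler 😫", "this video i cant breathe 😂😂"]
model = self_train(labelled, unlabelled)
print(predict_proba(model, featurize("wheezing all night 😷")))
```

The threshold controls the usual self-training trade-off: a higher value adds fewer but cleaner pseudo-labels per round, while a lower value uses more of the unlabelled tweets at the risk of reinforcing early mistakes.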
REFERENCES (61)
CITATIONS (24)