Term Standardisation With LDA Model To Detect Service Disruption Events Using English And Manglish Tweets
lda
Electronic computers. Computer science
manglish
twitter
QA75.5-76.95
Information technology
multilingual
P Philology. Linguistics
T58.5-58.64
420
rapid transit
DOI: 10.33093/jiwe.2024.3.1.1
Publication Date: 2024-02-15T04:30:19Z
AUTHORS (5)
ABSTRACT
Rapid transit is one of Malaysia's most important modes of public transportation, and any disruption to the service affects commuters' daily routines. Detecting such service disruptions has therefore become fundamental. In this study, disruptions in Malaysia's rapid transit service were assessed using English and Manglish (a combination of English and Malay) tweets through Latent Dirichlet Allocation (LDA). The gathered tweets were classified into event and non-event tweets, and LDA was applied to the event tweets. Manglish event tweets were pre-processed using the proposed term standardisation technique. LDA proved efficient in topic detection for both English and Manglish tweets, with better performance for Manglish tweets: the best event detection rate of the LDA_English model was at a likelihood of 80%, while the best detection rate of the LDA_Manglish model was at a likelihood of 60%.
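The abstract does not spell out how the term standardisation step works, so the following is only a minimal sketch of one plausible reading: Malay words in a Manglish tweet are mapped to a single English term before the tokens are passed to an LDA model. The lexicon `STANDARD_TERMS`, the function `standardise`, and the sample tweet are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical lexicon: illustrative Malay-to-English entries only,
# not the mapping used in the paper.
STANDARD_TERMS = {
    "tren": "train",
    "rosak": "breakdown",
    "lambat": "delay",
    "gangguan": "disruption",
    "stesen": "station",
}


def standardise(tweet: str) -> list[str]:
    """Lowercase a tweet, drop URLs and @mentions, tokenise on letters,
    and replace any token found in the lexicon with its standard term."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [STANDARD_TERMS.get(token, token) for token in tokens]


# Made-up example tweet mixing English and Malay.
tokens = standardise("LRT tren rosak lagi @MyRapidKL https://t.co/abc")
print(tokens)
```

Tokens that are not in the lexicon pass through unchanged, so English tweets are unaffected by this step. The resulting token lists could then serve as the bag-of-words input to whichever LDA implementation is used.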