Systematic Review: Text Processing Algorithms in Machine Learning and Deep Learning for Mental Health Detection on Social Media

FOS: Computer and information sciences; Computer Science - Machine Learning (cs.LG); Computer Science - Computation and Language (cs.CL)
DOI: 10.48550/arxiv.2410.16204 Publication Date: 2024-10-21
ABSTRACT
The global rise in depression necessitates innovative detection methods for early intervention. Social media provides a unique opportunity to identify depression through user-generated posts. This systematic review evaluates machine learning (ML) models for depression detection on social media, focusing on biases and methodological challenges throughout the ML lifecycle. A search of PubMed, IEEE Xplore, and Google Scholar identified 47 relevant studies published after 2010. The Prediction model Risk Of Bias ASsessment Tool (PROBAST) was utilized to assess methodological quality and risk of bias. Significant biases impacting model reliability and generalizability were found. There is a predominant reliance on Twitter (63.8%) and English-language content (over 90%), with most studies focusing on users from the United States and Europe. Non-probability sampling methods (approximately 80%) limit representativeness. Only 23% of studies explicitly addressed linguistic nuances like negations, which are crucial for accurate sentiment analysis. Inconsistent hyperparameter tuning was observed, with only 27.7% properly tuning models. About 17% did not adequately partition data into training, validation, and test sets, risking overfitting. While 74.5% used appropriate evaluation metrics for imbalanced data, others relied on accuracy without addressing class imbalance, potentially skewing results. Reporting transparency varied, often lacking critical methodological details. These findings highlight the need to diversify data sources, standardize preprocessing protocols, ensure consistent model development practices, address class imbalance, and enhance reporting transparency. By overcoming these challenges, future research can develop more robust and generalizable ML models for depression detection, contributing to improved mental health outcomes globally.
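The abstract's point about relying on accuracy under class imbalance can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 95/5 label split is synthetic, the helper names are hypothetical, and none of it is taken from the reviewed studies. It shows how a classifier that always predicts the majority class scores high accuracy while detecting no positive cases.

```python
# Illustrative sketch only: synthetic labels, not data from any reviewed study.
# Helper names (confusion_counts, accuracy, precision_recall_f1) are hypothetical.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary labelling."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive class (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Assumed imbalanced sample: 95 negative posts, 5 positive posts.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority (negative) class.
y_majority = [0] * 100

acc = accuracy(y_true, y_majority)
precision, recall, f1 = precision_recall_f1(y_true, y_majority)

print(f"accuracy={acc:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this assumed setting accuracy alone would suggest a strong model even though recall on the positive class is zero, which is why metrics such as precision, recall, and F1 are generally preferred for imbalanced data.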