Automatic Categorization of LGBT User Profiles on Twitter with Machine Learning

03 medical and health sciences machine learning 0302 clinical medicine 330 LGBT 5. Gender equality social media Twitter Library and Information Science 300
DOI: 10.3390/electronics10151822 Publication Date: 2021-07-29T14:47:46Z
ABSTRACT
Privacy needs and stigma pose significant barriers to lesbian, gay, bisexual, and transgender (LGBT) people sharing information related to their identities in traditional settings and research methods such as surveys and interviews. Fortunately, social media facilitates people’s belonging to and exchanging information within online LGBT communities. Compared to heterosexual respondents, LGBT users are also more likely to have accounts on social media websites and access social media daily. However, the current relevant LGBT studies on social media are not efficient or assume that any accounts that utilize LGBT-related words in their profile belong to individuals who identify as LGBT. Our human coding of over 16,000 accounts instead proposes the following three categories of LGBT Twitter users: individual, sexual worker/porn, and organization. This research develops a machine learning classifier based on the profile and bio features of these Twitter accounts. To have an efficient and effective process, we use a feature selection method to reduce the number of features and improve the classifier’s performance. Our approach achieves a promising result with around 88% accuracy. We also develop statistical analyses to compare the three categories based on the average weight of top features.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (82)
CITATIONS (6)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....