Automatic Categorization of LGBT User Profiles on Twitter with Machine Learning
03 medical and health sciences
machine learning
0302 clinical medicine
330
LGBT
5. Gender equality
social media
Twitter
Library and Information Science
300
DOI: 10.3390/electronics10151822
Publication Date: 2021-07-29T14:47:46Z
AUTHORS (6)
ABSTRACT
Privacy concerns and stigma make many lesbian, gay, bisexual, and transgender (LGBT) people reluctant to share identity-related information in traditional research settings such as surveys and interviews. Social media, by contrast, makes it easier for people to join online LGBT communities and exchange information within them. LGBT users are also more likely than heterosexual respondents to have social media accounts and to access social media daily. However, existing social media studies of LGBT users either rely on inefficient methods or assume that any account using LGBT-related words in its profile belongs to an individual who identifies as LGBT. Our human coding of over 16,000 accounts instead identifies three categories of LGBT Twitter users: individual, sexual worker/porn, and organization. We develop a machine learning classifier based on the profile and bio features of these Twitter accounts. To keep the process efficient and effective, we apply a feature selection method that reduces the number of features and improves the classifier’s performance. Our approach achieves around 88% accuracy. We also conduct statistical analyses that compare the three categories based on the average weight of the top features.
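The pipeline described in the abstract (bio features, feature selection, three-way classification) can be illustrated with a minimal sketch. The abstract does not name the feature selection method or the classifier, so chi-square scoring and multinomial naive Bayes are used here purely as stand-ins. The toy bios, the function names, and the choice of k = 7 are all hypothetical and are not taken from the paper's data or code.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def chi2_scores(docs, labels):
    """Chi-square score of each term (present/absent) against the class labels."""
    n = len(docs)
    class_count = Counter(labels)
    term_class = defaultdict(Counter)  # term -> class -> number of docs containing it
    for doc, y in zip(docs, labels):
        for t in set(tokenize(doc)):
            term_class[t][y] += 1
    scores = {}
    for t, per_class in term_class.items():
        df = sum(per_class.values())
        score = 0.0
        for c, nc in class_count.items():
            for col_total, observed in ((df, per_class[c]), (n - df, nc - per_class[c])):
                expected = nc * col_total / n
                if expected > 0:
                    score += (observed - expected) ** 2 / expected
        scores[t] = score
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])

def train_nb(docs, labels, vocab):
    """Count selected terms per class for a multinomial naive Bayes model."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for doc, y in zip(docs, labels):
        word_counts[y].update(t for t in tokenize(doc) if t in vocab)
    return class_docs, word_counts

def predict_nb(model, vocab, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_docs, word_counts = model
    n = sum(class_docs.values())
    best, best_lp = None, -math.inf
    for c in sorted(class_docs):
        total = sum(word_counts[c].values())
        lp = math.log(class_docs[c] / n)
        for t in tokenize(text):
            if t in vocab:
                lp += math.log((word_counts[c][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Hypothetical toy bios, invented for illustration only.
bios = [
    "proud gay dad and coffee lover",
    "trans woman she her writer",
    "bi guy he him gamer",
    "nonprofit supporting lgbt youth donate today",
    "official account of the pride center community events",
    "lgbt health clinic community services",
    "18+ only adult content dm for bookings",
    "adult performer 18+ subscribe for content",
    "18+ adult content creator bookings open",
]
labels = ["individual"] * 3 + ["organization"] * 3 + ["sexual worker/porn"] * 3

vocab = select_top_k(chi2_scores(bios, labels), k=7)  # k is an arbitrary choice
model = train_nb(bios, labels, vocab)
print(sorted(vocab))
print(predict_nb(model, vocab, "lgbt community center"))
```

On data this small, the selected terms simply reflect which words happen to repeat within a class, so the sketch shows only the mechanics: score each term, keep the top k, and train on the reduced vocabulary.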
REFERENCES (82)
CITATIONS (6)