- Complex Network Analysis Techniques
- Misinformation and Its Impacts
- Social Media and Politics
- Opinion Dynamics and Social Influence
- Persona Design and Applications
- Hate Speech and Cyberbullying Detection
- Innovative Human-Technology Interaction
- Service and Product Innovation
- Topic Modeling
- Spam and Phishing Detection
- Digital Games and Media
- Media Influence and Politics
- Sentiment Analysis and Opinion Mining
- Technology Use by Older Adults
- Digital Marketing and Social Media
- Advanced Malware Detection Techniques
- Human Mobility and Location-Based Analysis
- Media Studies and Communication
- Computational and Text Analysis Methods
- Natural Language Processing Techniques
- Academic Publishing and Open Access
- Software Engineering Research
- Recommender Systems and Techniques
- Fluid Dynamics and Turbulent Flows
- Social Media in Health Education
Indiana University Bloomington
2023-2025
Singapore Management University
2020-2023
Tokyo Institute of Technology
2023
Indiana University
2023
Telefonica Research and Development
2013-2021
Hamad bin Khalifa University
2016-2020
Qatar Airways (Qatar)
2015-2020
Association for Computing Machinery
2020
IT University of Copenhagen
2019
Qatar Cardiovascular Research Center
2016-2017
Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing.
User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world's largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. We study the popularity life-cycle of videos, the intrinsic...
Social networking services are a fast-growing business on the Internet. However, it is unknown whether online relationships and their growth patterns are the same as in real-life social networks. In this paper, we compare the structures of three online social networking services: Cyworld, MySpace, and orkut, each with more than 10 million users. We have access to complete data on Cyworld's ilchon (friend) relationships and analyze its degree distribution, clustering property, degree correlation, and evolution over time. We also use the Cyworld data to evaluate the validity...
User generated content (UGC), now with millions of video producers and consumers, is reshaping the way people watch video and TV. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and generating new business opportunities. Compared to traditional video-on-demand (VoD) systems, UGC services allow users to request videos from a potentially unlimited selection in an asynchronous fashion. To better understand the impact of UGC services, we have analyzed the world's largest UGC VoD...
In this work we explore cyberbullying and other toxic behavior in team competition online games. Using a dataset of over 10 million player reports on 1.46 million toxic players along with corresponding crowdsourced decisions, we test several hypotheses drawn from theories explaining toxic behavior. Besides providing a large-scale, empirically based understanding of toxic behavior, our work can be used as a basis for building systems to detect, prevent, and counter-act toxic behavior.
Over the past few years, a number of new "fringe" communities, like 4chan or certain subreddits, have gained traction on the Web at a rapid pace. However, more often than not, little is known about how they evolve or what kind of activities they attract, even though recent research has shown that they influence how false information reaches mainstream communities. This motivates the need to monitor these communities and analyze their impact on the Web's ecosystem. In August 2016, a new social network called Gab was created as an...
Recent studies have raised the alarm that many instances of online hate speech are implicit. Given its subtle nature, the explainability of the detection of such hateful speech has been a challenging problem. In this work, we examine whether ChatGPT can be used for providing natural language explanations (NLEs) for implicit hateful speech detection. We design our prompt to elicit concise ChatGPT-generated NLEs and conduct user studies to evaluate their qualities by comparison with human-written NLEs. We discuss the potential and limitations of ChatGPT in the context of implicit hateful speech research.
The emergence of generative AI has sparked substantial discussions, with the potential to have profound impacts on society in all aspects. As emerging technologies continue to advance, it is imperative to facilitate their proper integration into society, managing expectations and fear. This paper investigates users' perceptions of generative AI using 3M posts on Twitter from January 2019 to March 2023, especially focusing on their occupation and usage. We find that people across various occupations, not just IT-related...
Online social networking services are among the most popular Internet services according to Alexa.com and have become a key feature in many Internet services. Users interact through various features of online social networking services: making friend relationships, sharing their photos, and writing comments. These relationships are expected to support other web services, such as recommendation engines, security measures, online search, and personalization. However, we have very limited knowledge on how much interaction actually takes place over declared...
Mobile instant messaging (e.g., via SMS or WhatsApp) often goes along with an expectation of high attentiveness, i.e., that the receiver will notice and read the message within a few minutes. Hence, existing instant messaging services for mobile phones share indicators of availability, such as the last time the user has been online. However, in this paper we not only provide evidence that these cues create social pressure, but also that they are weak predictors of attentiveness. As a remedy, we propose to share a machine-computed prediction of whether...
Online social media platforms generally attempt to mitigate hateful expressions, as these comments can be detrimental to the health of the community. However, automatically identifying hateful comments can be challenging. We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media organization. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to detect and classify the hateful comments in the full dataset. Our contribution is twofold: 1) creating a granular taxonomy for hateful online comments that includes...
We analyze the dynamics of the behavior known as 'unfollow' in Twitter. We collected daily snapshots of the online relationships of 1.2 million Korean-speaking users for 51 days as well as all of their tweets. We found that Twitter users frequently unfollow. We then discover the major factors, including the reciprocity of the relationships, the duration of a relationship, the followees' informativeness, and the overlap of the relationships, which affect the decision to unfollow. We conduct interviews with 22 Korean respondents to supplement the quantitative results.
Nearest-neighbor collaborative filtering provides a successful means of generating recommendations for web users. However, this approach suffers from several shortcomings, including data sparsity and noise, the cold-start problem, and scalability. In this work, we present a novel method for recommending items to users based on expert opinions. Our method is a variation of traditional collaborative filtering: rather than applying a nearest neighbor algorithm to the user-rating data, predictions are computed using a set of expert neighbors from an independent...
The Middle East respiratory syndrome coronavirus (MERS-CoV) was exported to Korea in 2015, resulting in a threat to neighboring nations. We evaluated the possibility of using a digital surveillance system based on web searches and social media data to monitor this MERS outbreak. We collected the number of daily laboratory-confirmed MERS cases and quarantined cases from May 11, 2015 to June 26, 2015 from the Korean government portal. The daily trends observed via Google search and Twitter during the same time period were also ascertained using Google Trends and Topsy....
Twitter offers an explicit mechanism to facilitate information diffusion and has emerged as a new medium for communication. Many approaches to find influentials have been proposed, but they do not consider the temporal order of information adoption. In this work, we propose a novel method to find influentials by considering both the link structure and the temporal order of information adoption in Twitter. Our method finds distinct influentials who are not discovered by other methods.
One problem facing players of competitive games is negative, or toxic, behavior. League of Legends, the largest eSport game, uses a crowdsourcing platform called the Tribunal to judge whether a reported toxic player should be punished or not. The Tribunal is a two-stage system requiring reports from those players that directly observe toxic behavior, and human experts that review aggregated reports. While this system has successfully dealt with the vague nature of toxic behavior by majority rules based on many votes, it naturally requires tremendous cost,...
A growing number of people are changing the way they consume news, replacing traditional physical newspapers and magazines with their virtual online versions and/or weblogs. The interactivity and immediacy present in online news are changing the way news is being produced and exposed by media corporations. News websites have to create effective strategies to catch people's attention and attract their clicks. In this paper we investigate possible strategies used by online news corporations in the design of their news headlines. We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum...
In this research, we conceptually examine the use of personas in an age of large-scale online analytics data. Based on the criticism and benefits outlined in prior work and by practitioners working with online data, we formulate the major arguments for and against the use of personas given real-time online analytics data about customers, analyze these arguments, and demonstrate areas for the productive employment of data-driven personas by leveraging online analytics data in their creation. Our key tenet is that data-driven personas are located between aggregated and individual customer statistics. At their best, digital data-driven personas capture the coverage...
We develop a methodology to automate creating imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. From a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology has several novel accomplishments, including: (a) identifying distinct user segments based on the user content consumption patterns; (b) impactful...
ChatGPT, the first large language model with mass adoption, has demonstrated remarkable performance in numerous natural language tasks. Despite its evident usefulness, evaluating ChatGPT's performance in diverse problem domains remains challenging due to the closed nature of the model and its continuous updates via Reinforcement Learning from Human Feedback (RLHF). We highlight the issue of data contamination in ChatGPT evaluations, with a case study in stance detection. We discuss the challenge of preventing data contamination and ensuring fair model evaluation in the age of closed...
We develop a methodology for persona generation using real-time social media data for the distribution of products via online platforms. From a large social media account containing more than 30 million interactions from users in 181 countries engaging with more than 4,200 digital products produced by a global media corporation, we demonstrate that our methodology can first identify both distinct and impactful user segments and then create persona descriptions by automatically adding pertinent features, such as names, photos, and personal attributes. We validate our approach...
In this research, we evaluate four widely used face detection tools, which are Face++, IBM Bluemix Visual Recognition, AWS Rekognition, and Microsoft Azure Face API, using multiple datasets to determine their accuracy in inferring user attributes, including gender, race, and age. Results show that the tools are generally proficient at determining gender, with accuracy rates greater than 90%, except for IBM Bluemix. Concerning race, only one of the tools provides this capability, with an accuracy rate ... although the evaluation was performed on a high-quality...
Online platforms, such as Facebook, Twitter, and Reddit, provide users with a rich set of features for sharing and consuming political information, expressing political opinions, and exchanging potentially contrary political views. In such activities, two types of communication spaces naturally emerge: those dominated by exchanges between politically homogeneous users and those that allow and encourage crosscutting exchanges in politically heterogeneous groups. While research on political talk in online environments abounds, we know surprisingly little about the varying nature...