Haewoon Kwak

ORCID: 0000-0003-1418-0834
Research Areas
  • Complex Network Analysis Techniques
  • Misinformation and Its Impacts
  • Social Media and Politics
  • Opinion Dynamics and Social Influence
  • Persona Design and Applications
  • Hate Speech and Cyberbullying Detection
  • Innovative Human-Technology Interaction
  • Service and Product Innovation
  • Topic Modeling
  • Spam and Phishing Detection
  • Digital Games and Media
  • Media Influence and Politics
  • Sentiment Analysis and Opinion Mining
  • Technology Use by Older Adults
  • Digital Marketing and Social Media
  • Advanced Malware Detection Techniques
  • Human Mobility and Location-Based Analysis
  • Media Studies and Communication
  • Computational and Text Analysis Methods
  • Natural Language Processing Techniques
  • Academic Publishing and Open Access
  • Software Engineering Research
  • Recommender Systems and Techniques
  • Fluid Dynamics and Turbulent Flows
  • Social Media in Health Education

Indiana University Bloomington
2023-2025

Singapore Management University
2020-2023

Tokyo Institute of Technology
2023

Indiana University
2023

Telefonica Research and Development
2013-2021

Hamad bin Khalifa University
2016-2020

Qatar Airways (Qatar)
2015-2020

Association for Computing Machinery
2020

IT University of Copenhagen
2019

Qatar Cardiovascular Research Center
2016-2017

Twitter, a microblogging service less than three years old, commands more than 41 million users as of July 2009 and is growing fast. Twitter users tweet about any topic within the 140-character limit and follow others to receive their tweets. The goal of this paper is to study the topological characteristics of Twitter and its power as a new medium of information sharing.

10.1145/1772690.1772751 article EN 2010-04-26

User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world's largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. We study the popularity life-cycle of videos, the intrinsic...

10.1145/1298306.1298309 article EN 2007-10-24

Social networking services are a fast-growing business on the Internet. However, it is unknown if online relationships and their growth patterns are the same as in real-life social networks. In this paper, we compare the structures of three online social networking services: Cyworld, MySpace, and orkut, each with more than 10 million users. We have access to the complete data of Cyworld's ilchon (friend) relationships and analyze its degree distribution, clustering property, degree correlation, and evolution over time. We also use the Cyworld data to evaluate the validity...

10.1145/1242572.1242685 article EN 2007-05-08

User generated content (UGC), now with millions of video producers and consumers, is reshaping the way people watch video and TV. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and generating new business opportunities. Compared to traditional video-on-demand (VoD) systems, UGC services allow users to request videos from a potentially unlimited selection in an asynchronous fashion. To better understand the impact of UGC services, we have analyzed the world's largest UGC VoD...

10.1109/tnet.2008.2011358 article EN IEEE/ACM Transactions on Networking 2009-03-19

In this work we explore cyberbullying and other toxic behavior in team competition online games. Using a dataset of over 10 million player reports on 1.46 million toxic players along with corresponding crowdsourced decisions, we test several hypotheses drawn from theories explaining toxic behavior. Besides providing a large-scale, empirically based understanding of toxic behavior, our work can be used as a basis for building systems to detect, prevent, and counteract toxic behavior.

10.1145/2702123.2702529 preprint EN 2015-04-17

Over the past few years, a number of new "fringe" communities, like 4chan or certain subreddits, have gained traction on the Web at a rapid pace. However, more often than not, little is known about how they evolve or what kind of activities they attract, even though recent research has shown that they influence how false information reaches mainstream communities. This motivates the need to monitor these communities and analyze their impact on the Web's information ecosystem. In August 2016, a new social network called Gab was created as an...

10.1145/3184558.3191531 preprint EN 2018-01-01

Recent studies have warned that many online hate speeches are implicit. Given its subtle nature, the explainability of the detection of such hateful speech has been a challenging problem. In this work, we examine whether ChatGPT can be used for providing natural language explanations (NLEs) for implicit hateful speech detection. We design our prompt to elicit concise ChatGPT-generated NLEs and conduct user studies to evaluate their qualities by comparison with human-written NLEs. We discuss the potential and limitations of ChatGPT in the context of implicit hateful speech research.

10.1145/3543873.3587368 preprint EN 2023-04-28

The emergence of generative AI has sparked substantial discussions, with the potential to have profound impacts on society in all aspects. As emerging technologies continue to advance, it is imperative to facilitate their proper integration into society, managing expectations and fear. This paper investigates users’ perceptions of generative AI using 3M posts on Twitter from January 2019 to March 2023, especially focusing on their occupation and usage. We find that people across various occupations, not just IT-related...

10.1140/epjds/s13688-023-00445-y article EN cc-by EPJ Data Science 2024-01-08

Online social networking services are among the most popular Internet services according to Alexa.com and have become a key feature in many Internet services. Users interact through various features of online social networking services: making friend relationships, sharing their photos, and writing comments. These friend relationships are expected to become a key to many other web services, such as recommendation engines, security measures, online search, and personalization issues. However, we have very limited knowledge on how much interaction actually takes place over declared...

10.1145/1452520.1452528 article EN 2008-10-20

Mobile instant messaging (e.g., via SMS or WhatsApp) often goes along with an expectation of high attentiveness, i.e., that the receiver will notice and read the message within a few minutes. Hence, existing instant messaging services for mobile phones share indicators of availability, such as the last time the user has been online. However, in this paper we not only provide evidence that these cues create social pressure, but that they are also weak predictors of attentiveness. As a remedy, we propose to share a machine-computed prediction of whether...

10.1145/2556288.2556973 article EN 2014-04-26

Online social media platforms generally attempt to mitigate hateful expressions, as these comments can be detrimental to the health of the community. However, automatically identifying hateful comments can be challenging. We manually label 5,143 hateful expressions posted to YouTube and Facebook videos among a dataset of 137,098 comments from an online news media. We then create a granular taxonomy of different types and targets of online hate and train machine learning models to automatically detect and classify the hateful comments in the full dataset. Our contribution is twofold: 1) creating a granular taxonomy for hateful online comments that includes...

10.1609/icwsm.v12i1.15028 article EN Proceedings of the International AAAI Conference on Web and Social Media 2018-06-15

We analyze the dynamics of the behavior known as 'unfollow' in Twitter. We collected daily snapshots of the online relationships of 1.2 million Korean-speaking users for 51 days as well as all of their tweets. We found that Twitter users frequently unfollow. We then discover the major factors, including the reciprocity of the relationships, the duration of a relationship, the followees' informativeness, and the overlap of the relationships, which affect the decision to unfollow. We conduct interviews with 22 Korean respondents to supplement the quantitative results.

10.1145/1978942.1979104 article EN 2011-05-07

Nearest-neighbor collaborative filtering provides a successful means of generating recommendations for web users. However, this approach suffers from several shortcomings, including data sparsity and noise, the cold-start problem, and scalability. In this work, we present a novel method for recommending items to users based on expert opinions. Our method is a variation of traditional collaborative filtering: rather than applying a nearest neighbor algorithm to the user-rating data, predictions are computed using a set of expert neighbors from an independent...

10.1145/1571941.1572033 article EN 2009-07-19

The Middle East respiratory syndrome coronavirus (MERS-CoV) was exported to Korea in 2015, resulting in a threat to neighboring nations. We evaluated the possibility of using a digital surveillance system based on web searches and social media data to monitor this MERS outbreak. We collected the number of daily laboratory-confirmed MERS cases and quarantined cases from May 11, 2015 to June 26, 2015 from the Korean government MERS portal. The daily trends observed via Google search and Twitter during the same time period were also ascertained using Google Trends and Topsy....

10.1038/srep32920 article EN cc-by Scientific Reports 2016-09-06

Twitter offers an explicit mechanism to facilitate information diffusion and has emerged as a new medium for communication. Many approaches to find influentials have been proposed, but they do not consider the temporal order of information adoption. In this work, we propose a novel method to find influentials by considering both the link structure and the temporal order of information adoption in Twitter. Our method finds distinct influentials who are not discovered by other methods.

10.1145/1772690.1772842 article EN 2010-04-26

One problem facing players of competitive games is negative, or toxic, behavior. League of Legends, the largest eSport game, uses a crowdsourcing platform called the Tribunal to judge whether a reported toxic player should be punished or not. The Tribunal is a two-stage system requiring reports from those players that directly observe toxic behavior, and human experts that review aggregated reports. While this system has successfully dealt with the vague nature of toxic behavior by majority rules based on many votes, it naturally requires tremendous cost,...

10.1145/2566486.2567987 article EN 2014-04-07

A growing number of people are changing the way they consume news, replacing the traditional physical newspapers and magazines by their virtual online versions and/or weblogs. The interactivity and immediacy present in online news are changing the way news is being produced and exposed by media corporations. News websites have to create effective strategies to catch people's attention and attract their clicks. In this paper we investigate possible strategies used by online news corporations in the design of their news headlines. We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum...

10.48550/arxiv.1503.07921 preprint EN other-oa arXiv (Cornell University) 2015-01-01

In this research, we conceptually examine the use of personas in an age of large-scale online analytics data. Based on the criticism and benefits outlined in prior work and by practitioners working with online data, we formulate the major arguments for and against the use of personas given real-time online analytics data about customers, analyze these arguments, and demonstrate areas for the productive employment of data-driven personas by leveraging online analytics data in their creation. Our key tenet is that data-driven personas are located between aggregated and individual customer statistics. At their best, digital data-driven personas capture the coverage...

10.21153/psj2018vol4no2art737 article EN cc-by-nc Persona Studies 2018-11-05

We develop a methodology to automate creating imaginary people, referred to as personas, by processing complex behavioral and demographic data of social media audiences. From a popular social media account containing more than 30 million interactions by viewers from 198 countries engaging with more than 4,200 online videos produced by a global media corporation, we demonstrate that our methodology has several novel accomplishments, including: (a) identifying distinct user behavioral segments based on the user content consumption patterns; (b) identifying impactful...

10.1145/3265986 article EN ACM Transactions on the Web 2018-11-01

A growing number of people are changing the way they consume news, replacing the traditional physical newspapers and magazines by their virtual online versions and/or weblogs. The interactivity and immediacy present in online news are changing the way news is being produced and exposed by media corporations. News websites have to create effective strategies to catch people’s attention and attract their clicks. In this paper we investigate possible strategies used by online news corporations in the design of their news headlines. We analyze the content of 69,907 headlines produced by four major global media corporations during a minimum...

10.1609/icwsm.v9i1.14619 article EN Proceedings of the International AAAI Conference on Web and Social Media 2021-08-03

ChatGPT, the first large language model with mass adoption, has demonstrated remarkable performance in numerous natural language tasks. Despite its evident usefulness, evaluating ChatGPT's performance in diverse problem domains remains challenging due to the closed nature of the model and its continuous updates via Reinforcement Learning from Human Feedback (RLHF). We highlight the issue of data contamination in ChatGPT evaluations, with a case study in stance detection. We discuss the challenge of preventing data contamination and ensuring fair model evaluation in the age of closed...

10.18653/v1/2023.trustnlp-1.5 article EN cc-by 2023-01-01

We develop a methodology for persona generation using real-time social media data for the distribution of products via online platforms. From a large social media account containing more than 30 million interactions from users from 181 countries engaging with more than 4,200 digital products produced by a global media corporation, we demonstrate that our methodology can first identify both distinct and impactful user segments and then create persona descriptions by automatically adding pertinent features, such as names, photos, and personal attributes. We validate our approach...

10.1145/3027063.3053120 article EN 2017-05-01

In this research, we evaluate four widely used face detection tools, which are Face++, IBM Bluemix Visual Recognition, AWS Rekognition, and Microsoft Azure Face API, using multiple datasets to determine their accuracy in inferring user attributes, including gender, race, and age. Results show that the tools are generally proficient at determining gender, with accuracy rates greater than 90%, except for IBM Bluemix. Concerning race, only one of the four tools provides this capability, with an accuracy rate greater than 90%, although the evaluation was performed on a high-quality...

10.1609/icwsm.v12i1.15058 article EN Proceedings of the International AAAI Conference on Web and Social Media 2018-06-15

Online platforms, such as Facebook, Twitter, and Reddit, provide users with a rich set of features for sharing and consuming political information, expressing political opinions, and exchanging potentially contrary political views. In such activities, two types of communication spaces naturally emerge: those dominated by exchanges between politically homogeneous users and those that allow and encourage crosscutting exchanges in politically heterogeneous groups. While research on political talk in online environments abounds, we know surprisingly little about the varying nature...

10.1609/icwsm.v13i01.3210 article EN Proceedings of the International AAAI Conference on Web and Social Media 2019-07-06