- Opinion Dynamics and Social Influence
- Complex Network Analysis Techniques
- Social Media and Politics
- Misinformation and Its Impacts
- Digital Marketing and Social Media
- Sports Analytics and Performance
- Topic Modeling
- Wikis in Education and Collaboration
- Natural Language Processing Techniques
- Media Influence and Politics
- Artificial Intelligence in Games
- Open Source Software Innovations
- Hate Speech and Cyberbullying Detection
- Recommender Systems and Techniques
- Reinforcement Learning in Robotics
- Decision-Making and Behavioral Economics
- Online Learning and Analytics
- Big Data and Business Intelligence
- Digital Games and Media
- Impact of Technology on Adolescents
- Sentiment Analysis and Opinion Mining
- Digital Mental Health Interventions
- Technology Adoption and User Behaviour
- Experimental Behavioral Economics Studies
- Advanced Graph Neural Networks
University of Toronto
2018-2025
Thornton Tomasetti (United States)
2024
Microsoft (United States)
2016-2018
Microsoft Research New York City (United States)
2017-2018
Microsoft Research (United Kingdom)
2017
Shelby County Schools
2016
Stanford University
2010-2015
Georgia Southern University
2015
Viral products and ideas are intuitively understood to grow through a person-to-person diffusion process analogous to the spread of an infectious disease; however, until recently it has been prohibitively difficult to directly observe purportedly viral events, and thus to rigorously quantify or characterize their structural properties. Here we propose a formal measure of what we label “structural virality” that interpolates between two conceptual extremes: content that gains its popularity through a single, large broadcast...
The Web has enabled one of the most visible recent developments in education: the deployment of massive open online courses. With their global reach and often staggering enrollments, MOOCs have the potential to become a major new mechanism for learning. Despite this early promise, however, MOOCs are still relatively unexplored and poorly understood.
An increasingly common feature of online communities and social media sites is a mechanism for rewarding user achievements based on a system of badges. Badges are given to users for particular contributions to a site, such as performing a certain number of actions of a given type. They have been employed in many domains, including news sites like the Huffington Post, educational sites like Khan Academy, and knowledge-creation sites like Wikipedia and Stack Overflow. At the most basic level, badges serve as a summary of a user's key accomplishments; however, experience...
Question answering (Q&A) websites are now large repositories of valuable knowledge. While most Q&A sites were initially aimed at providing useful answers to the question asker, there has been a marked shift towards question answering as a community-driven knowledge creation process whose end product can be of enduring value to a broad audience. As part of this shift, specific expertise and deep knowledge of the subject at hand have become increasingly important, and many Q&A sites employ voting and reputation mechanisms as centerpieces of their design to help users...
On many online platforms, users can engage with millions of pieces of content, which they discover either organically or through algorithmically-generated recommendations. While the short-term benefits of recommender systems are well-known, their long-term impacts are less well understood. In this work, we study the user experience on Spotify, a popular music streaming service, through the lens of diversity: the coherence of the set of songs a user listens to. We use a high-fidelity embedding of songs based on listening behavior on Spotify to quantify...
There are many settings in which users of a social media application provide evaluations of one another. In a variety of domains, mechanisms for evaluation allow one user to say whether he or she trusts another user, likes the content they produced, or wants to confer special levels of authority or responsibility on them. Earlier work has studied how the relative status between two users (that is, their comparative levels of status in the group) affects the types of evaluations that one user gives to another.
We study the patterns by which a user consumes the same item repeatedly over time, in a wide variety of domains ranging from check-ins at the same business location to re-watches of the same video. We find that recency of consumption is the strongest predictor of repeat consumption. Based on this, we develop a model by which the item consumed $t$ timesteps ago is reconsumed with a probability proportional to a function of $t$. We study theoretical properties of this model, develop algorithms to learn the reconsumption likelihood as a function of $t$, and show a strong fit of the resulting inferred function via a power law with exponential...
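The recency model described above can be illustrated with a short sketch. It is not the authors' implementation: the weight function `power_law_weight`, its parameter `alpha`, and the helper `reconsumption_probs` are hypothetical names chosen for illustration, the plain power law is an arbitrary stand-in for whatever function is actually learned, and the sketch assumes the next consumption is always a repeat of some earlier item.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Sequence


def power_law_weight(t: int, alpha: float = 1.0) -> float:
    """Illustrative recency weight w(t) = t^(-alpha); alpha is a made-up value."""
    return float(t) ** (-alpha)


def reconsumption_probs(
    history: Sequence[Hashable],
    weight: Callable[[int], float] = power_law_weight,
) -> Dict[Hashable, float]:
    """Probability that each previously consumed item is consumed next.

    The item consumed t timesteps ago (t = 1 is the most recent) receives
    mass proportional to weight(t). An item's mass is summed over all of
    its past occurrences and then normalized.
    """
    if not history:
        raise ValueError("history must contain at least one consumption")
    mass: Dict[Hashable, float] = defaultdict(float)
    n = len(history)
    for position, item in enumerate(history):
        t = n - position  # timesteps since this consumption
        mass[item] += weight(t)
    total = sum(mass.values())
    return {item: m / total for item, m in mass.items()}


# Hypothetical consumption history, oldest first.
probs = reconsumption_probs(["a", "b", "a", "c"])
```

With this toy history, the most recently consumed item `"c"` receives the largest probability, and `"a"` outranks `"b"` because it appears twice, once fairly recently.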
How predictable is success in complex social systems? In spite of a recent profusion of prediction studies that exploit online social and information network data, this question remains unanswered, in part because it has not been adequately specified. In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand; and the inherent unpredictability of complex social systems on the other. We then use this model to motivate an illustrative empirical study...
The power of machine learning systems not only promises great technical progress, but risks societal harm. As a recent example, researchers have shown that popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems, from automated translation services to curriculum vitae scanners, can amplify stereotypes in important contexts. Although methods have been developed to measure these biases and alter word embeddings to mitigate their biased representations, there...
Many of the world's most popular websites catalyze their growth through invitations from existing members. New members can then in turn issue invitations, and so on, creating cascades of member signups that can spread on a global scale. Although these diffusive invitation processes are critical to the popularity and growth of many websites, they have rarely been studied, and their properties remain elusive. For instance, it is not known how viral these cascade structures are, how they grow over time, or how this growth affects the resulting distribution...
In many online platforms, people must choose how broadly to allocate their energy. Should one concentrate on a narrow area of focus, and become a specialist, or apply oneself more broadly, and become a generalist? In this work, we propose a principled measure of how generalist or specialist a user is, and study behavior in online platforms through this lens. To do this, we construct highly accurate community embeddings that represent communities in a high-dimensional space. We develop sets of community analogies and use them to optimize our embeddings so that they encode...
As artificial intelligence becomes increasingly intelligent (in some cases achieving superhuman performance), there is growing potential for humans to learn from and collaborate with algorithms. However, the ways in which AI systems approach problems are often different from the ways people do, and thus may be uninterpretable and hard to learn from. A crucial step in bridging this gap between human and artificial intelligence is modeling the granular actions that constitute human behavior, rather than simply matching aggregate human performance. We pursue this goal in a model...
What explains the relative persistence of same-race romantic relationships? One possible explanation is structural: this phenomenon could reflect the fact that social interactions are already stratified along racial lines. Another attributes these patterns to individual-level preferences. We present novel evidence from an online dating community involving more than 250,000 people in the United States about the frequency with which individuals both express a preference for same-race romantic partners and act to choose same-race partners.
It is often said that constraints affect creative production, both in terms of form and quality. Online social media platforms frequently impose constraints on the content that users can produce, limiting the range of possible contributions. Do these restrictions tend to push creators towards producing more or less successful content? How do creators adapt their contributions to fit the limits imposed by social media platforms? To answer these questions, we conduct an observational study of a recent event: on November 7, 2017, Twitter changed the maximum...
Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of naively comparing...
Gab, an online social media platform with very little content moderation, has recently come to prominence as an alt-right community and a haven for hate speech. We document the evolution of Gab from its inception until a Gab user carried out the most deadly attack on the Jewish community in US history. We investigate language use on Gab, study how topics evolved over time, and find that the shooter's posts were among the most consistently anti-Semitic on Gab, but that hundreds of other users were even more extreme.
Online recommendation systems are prone to create filter bubbles, whereby users are only recommended content narrowly aligned with their historical interests. In the case of media recommendation, this can reinforce political polarization by recommending topical content (e.g., on the economy) at one extreme end of the political spectrum even though this topic has broad coverage from multiple political viewpoints that would provide a more balanced and informed perspective for the user. Historically, Maximal Marginal Relevance (MMR) has been used...
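For readers unfamiliar with MMR, the following is a minimal sketch of the textbook greedy formulation, which trades off an item's relevance against its similarity to items already selected. The abstract is truncated before it says how MMR is used in this work, so this is not the paper's method. The function name `mmr_rerank`, the trade-off parameter `lam`, and all item names and scores are hypothetical.

```python
from typing import Dict, List, Tuple


def mmr_rerank(
    relevance: Dict[str, float],
    similarity: Dict[Tuple[str, str], float],
    k: int,
    lam: float = 0.5,
) -> List[str]:
    """Greedy MMR: repeatedly pick the item that maximizes
    lam * relevance - (1 - lam) * (max similarity to already-selected items).
    """
    def sim(a: str, b: str) -> float:
        # Similarity is treated as symmetric; missing pairs default to 0.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(item: str) -> float:
            redundancy = max((sim(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1.0 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Made-up relevance and similarity scores for three articles on one topic.
relevance = {"econ_left_1": 0.9, "econ_left_2": 0.85, "econ_right_1": 0.7}
similarity = {
    ("econ_left_1", "econ_left_2"): 0.95,
    ("econ_left_1", "econ_right_1"): 0.2,
    ("econ_left_2", "econ_right_1"): 0.2,
}
balanced = mmr_rerank(relevance, similarity, k=2, lam=0.5)
relevance_only = mmr_rerank(relevance, similarity, k=2, lam=1.0)
```

In this toy example, `lam=0.5` selects one article from each viewpoint because the second same-viewpoint article is penalized for redundancy, while `lam=1.0` reduces to ranking by relevance alone.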
What makes written text appealing? In this registered report, we study the linguistic characteristics of news headline success using a large-scale dataset of field experiments (A/B tests) conducted on the popular website Upworthy.com comparing multiple headline variants for the same news articles. This unique setup allows us to control for factors that could otherwise have important confounding effects on headline success. Based on prior literature and an exploratory portion of the data, we formulated hypotheses about the linguistic features associated with...
Large Language Models (LLMs) have democratized synthetic data generation, which in turn has the potential to simplify and broaden a wide gamut of NLP tasks. Here, we tackle a pervasive problem in synthetic data generation: its generative distribution often differs from the distribution of real-world data researchers care about (in other words, it is unfaithful). In a case study on sarcasm detection, we study three strategies to increase the faithfulness of synthetic data: grounding, filtering, and taxonomy-based generation. We evaluate these strategies using the performance...