- Advanced Causal Inference Techniques
- Statistical Methods and Inference
- Statistical Methods in Clinical Trials
- Social Media and Politics
- Advanced Bandit Algorithms Research
- Statistical Methods and Bayesian Inference
- Machine Learning and Data Classification
- Misinformation and Its Impacts
- Hate Speech and Cyberbullying Detection
- Explainable Artificial Intelligence (XAI)
- Media Influence and Politics
- Data Stream Mining Techniques
- Bayesian Modeling and Causal Inference
- Smart Grid Energy Management
- Advanced Multi-Objective Optimization Algorithms
- Distributed Sensor Networks and Detection Algorithms
- Machine Learning in Healthcare
- Hydrology and Watershed Management Studies
- Image and Video Quality Assessment
- Video Coding and Compression Technologies
- AI-based Problem Solving and Planning
- Aesthetic Perception and Analysis
- Privacy-Preserving Technologies in Data
- Media Studies and Communication
- Reinforcement Learning in Robotics
Menlo School
2019-2024
University of Vienna
2019-2024
Hertie School
2024
Meta (Israel)
2020
Meta (United States)
2019
New York University
2016
We investigated the effects of Facebook’s and Instagram’s feed algorithms during the 2020 US election. We assigned a sample of consenting users to reverse-chronologically-ordered feeds instead of the default algorithms. Moving users out of algorithmic feeds substantially decreased the time they spent on the platforms and their activity. The chronological feed also affected exposure to content: the amount of political and untrustworthy content they saw increased on both platforms, the amount of content classified as uncivil or containing slur words they saw decreased on Facebook, and the amount of content from moderate friends...
Many critics raise concerns about the prevalence of ‘echo chambers’ on social media and their potential role in increasing political polarization. However, the lack of available data and the challenges of conducting large-scale field experiments have made it difficult to assess the scope of the problem 1,2. Here we present data from 2020 for the entire population of active adult Facebook users in the USA showing that content from ‘like-minded’ sources constitutes the majority of what people see on the platform, although political information and news represent...
We studied the effects of exposure to reshared content on Facebook during the 2020 US election by assigning a random set of consenting, US-based users to feeds that did not contain any reshares over a 3-month period. We find that removing reshared content substantially decreases the amount of political news, including content from untrustworthy sources, to which users are exposed; decreases overall clicks and reactions; and reduces partisan news clicks. Further, we observe that removing reshared content produces clear decreases in news knowledge within the sample, although there is some uncertainty about how this...
We study the effect of Facebook and Instagram access on political beliefs, attitudes, and behavior by randomizing a subset of 19,857 Facebook users and 15,585 Instagram users to deactivate their accounts for 6 wk before the 2020 U.S. election. We report four key findings. First, both Facebook and Instagram deactivation reduced an index of political participation (driven mainly by reduced participation online). Second, Facebook deactivation had no significant effect on an index of knowledge, but secondary analyses suggest that it reduced knowledge of general news while possibly also decreasing belief in misinformation circulating online. Third, Facebook deactivation may...
Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). We evaluate recently proposed RL-based ABR methods in Facebook's web-based video streaming platform. Real-world ABR contains several challenges that require customized designs beyond off-the-shelf RL algorithms: we implement a scalable neural network architecture that supports videos with arbitrary bitrate encodings; we design a training method to cope with the variance resulting from the stochasticity in network conditions; and we leverage...
We respond to Aronow et al. (2025)'s paper arguing that randomized controlled trials (RCTs) are "enough," while nonparametric identification in observational studies is not. We agree with their position with respect to experimental versus observational research, but question what it would mean to extend this logic to the scientific enterprise more broadly. We first investigate what is meant by "enough," arguing that it is fundamentally a sociological claim about the relationship between statistical work and larger social and institutional processes, rather than...
We respond to Aronow et al. (2025)’s paper arguing that randomized controlled trials (RCTs) are “enough,” while nonparametric identification in observational studies is not. We agree with their position with respect to experimental versus observational research, but question what it would mean to extend this logic to the scientific enterprise more broadly. We first investigate what is meant by “enough,” arguing that it is a fundamentally sociological claim about the relationship between statistical work and larger social and institutional processes, rather...
Machine learning is commonly used to estimate the heterogeneous treatment effects (HTEs) in randomized experiments. Using large-scale randomized experiments on the Facebook and Criteo platforms, we observe substantial discrepancies between machine learning-based treatment effect estimates and difference-in-means estimates directly from the randomized experiment. This paper provides a two-step framework for practitioners and researchers to diagnose and rectify this discrepancy. We first introduce a diagnostic tool to assess whether bias exists in the model-based...
We develop and analyze empirical Bayes Stein-type estimators for use in the estimation of causal effects in large-scale online experiments. While online experiments are generally thought to be distinguished by their large sample size, we focus on the multiplicity of treatment groups. The typical analysis practice is to use simple differences-in-means (perhaps with covariate adjustment) as if all treatment arms were independent. In this work we develop consistent, small bias, shrinkage estimators for this setting. In addition to achieving lower mean squared...
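As a rough illustration of the shrinkage idea in the abstract above, the sketch below applies a textbook positive-part James-Stein estimator that pulls each arm's estimated effect toward the grand mean across arms. It is a simplification and not the estimator developed in the paper: it assumes every arm shares the same known sampling variance, and the function name `shrink_toward_grand_mean` and the example numbers are hypothetical.

```python
def shrink_toward_grand_mean(estimates, sampling_variance):
    """Positive-part James-Stein shrinkage of per-arm estimates toward their mean.

    Assumption (illustration only): all arms share one known sampling variance.
    """
    k = len(estimates)
    if k < 4:
        # The (k - 3) factor is only positive with four or more arms.
        raise ValueError("shrinking toward the grand mean needs at least 4 arms")
    grand_mean = sum(estimates) / k
    spread = sum((x - grand_mean) ** 2 for x in estimates)
    if spread == 0.0:
        # All arms already agree, so there is nothing to shrink.
        return list(estimates)
    # Shrinkage factor in [0, 1]; the max() is the "positive-part" correction.
    factor = max(0.0, 1.0 - (k - 3) * sampling_variance / spread)
    return [grand_mean + factor * (x - grand_mean) for x in estimates]


# Hypothetical difference-in-means estimates for five treatment arms.
arm_effects = [1.0, 2.0, 3.0, 4.0, 5.0]
shrunk = shrink_toward_grand_mean(arm_effects, sampling_variance=1.0)
```

With many arms and noisy estimates the factor moves toward zero and the arms are pooled more heavily; with precise estimates it stays near one and the raw differences-in-means are left almost unchanged.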
Recent advances in contextual bandit optimization and reinforcement learning have garnered interest in applying these methods to real-world sequential decision making problems. Real-world applications frequently have constraints with respect to a currently deployed policy. Many of the existing constraint-aware algorithms consider problems with a single objective (the reward) and a constraint on the reward with respect to a baseline policy. However, many important applications involve multiple competing objectives and auxiliary constraints. In this paper, we...
The past decade has seen an increase in public attention on the role of campaign donations and outside spending. This has led some donors to seek ways of skirting disclosure requirements, such as by contributing through nonprofits that allow for greater privacy. These nonprofits nonetheless clearly aim to influence policy discussions and have a direct impact, in some cases, on electoral outcomes. We develop a technique for identifying nonprofits engaged in political activity that relies not on their formal disclosure, which is often understated or...
Causal generalization is essential to contemporary political science practice. We argue that recent methodological advances in causal generalization pay insufficient attention to issues which arise from generalization over time. For assumptions of varying degrees of strictness, we derive novel statistical bounds to the growing uncertainty of a given estimate into the future. We derive these bounds using the Wasserstein divergence, which allows us to weaken positivity assumptions that are not typically met. In an empirical example, we demonstrate that the actual variation in treatment effects over time...
Using machine learning to estimate the heterogeneity of treatment effects (HTE) in randomized experiments is common when firms and digital platforms seek to understand how individuals differ in their responses to a policy. However, will the average treatment effect from the HTE model align with the simple subgroup estimates from random experiments? Using a large-scale experiment on Facebook, we observe a substantial discrepancy between the machine learning-based estimates and the difference-in-means estimator from the same experiment. We propose the use of a quantile-quantile plot...
Estimation of importance sampling weights for off-policy evaluation of contextual bandits often results in imbalance: a mismatch between the desired and the actual distribution of state-action pairs after weighting. In this work we present balanced off-policy evaluation (B-OPE), a generic method for estimating weights which minimize this imbalance. Estimation of these weights reduces to a binary classification problem regardless of action type. We show that minimizing the risk of the classifier implies minimization of imbalance to the desired counterfactual distribution of state-action pairs. The classifier loss is tied to the error of the off-policy estimate, allowing...
In this work, we reframe the problem of balanced treatment assignment as optimization of a two-sample test between treatment and control units. Using this lens we provide an assignment algorithm that is optimal with respect to the minimum spanning tree test of Friedman and Rafsky (1979). This assignment to treatment groups may be performed exactly in polynomial time. We provide a probabilistic interpretation of this process in terms of the most probable element of designs drawn from a determinantal point process, which admits a probabilistic interpretation of the design. We provide a novel formulation of estimation as transductive inference and show how...
Black-box heterogeneous treatment effect (HTE) models are increasingly being used to create personalized policies that assign individuals to their optimal treatments. However, they are difficult to understand, and can be burdensome to maintain in a production environment. In this paper, we present a scalable, interpretable personalized experimentation system, implemented and deployed in production at Meta. The system works in a multiple treatment, multiple outcome setting typical at Meta to: (1) learn explanations for black-box HTE models; (2) generate...
In observational causal inference, in order to emulate a randomized experiment, weights are used to render treatments independent of observed covariates. This property is known as balance; in its absence, estimated causal effects may be arbitrarily biased. In this work we introduce permutation weighting, a method for estimating balancing weights using a standard binary classifier (regardless of cardinality of treatment). A large class of probabilistic classifiers may be used in this method; the choice of loss for the classifier implies the particular definition of balance. We...
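The classifier-based weighting described in the abstract above can be sketched as follows, under stated assumptions. Observed (treatment, covariate) pairs are stacked with copies in which the treatment column has been permuted, a binary classifier is trained to tell the two apart, and each observed unit is weighted by the classifier's odds of belonging to the permuted set. Everything concrete here is an assumption for illustration: the simulated data, the hand-rolled logistic regression, the feature map with a treatment-covariate interaction, and all function names. It is not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(a, x):
    # Intercept, treatment, covariate, and their interaction; the interaction
    # lets a linear classifier pick up dependence between treatment and covariate.
    return [1.0, a, x, a * x]


def dot(w, row):
    return sum(wi * ri for wi, ri in zip(w, row))


def fit_logistic(rows, labels, steps=300, lr=0.1):
    # Plain batch gradient descent on the mean logistic loss.
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, y in zip(rows, labels):
            err = sigmoid(dot(w, row)) - y
            for j, rj in enumerate(row):
                grad[j] += err * rj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def permutation_weights(treat, covar, seed=0):
    rng = random.Random(seed)
    permuted = list(treat)
    rng.shuffle(permuted)  # breaks any link between treatment and covariate
    rows = [features(a, x) for a, x in zip(treat, covar)]
    rows += [features(a, x) for a, x in zip(permuted, covar)]
    labels = [0.0] * len(treat) + [1.0] * len(treat)  # 1 = permuted copy
    w = fit_logistic(rows, labels)
    weights = []
    for a, x in zip(treat, covar):
        p = sigmoid(dot(w, features(a, x)))
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # guard against division by zero
        weights.append(p / (1.0 - p))  # odds of coming from the permuted data
    return weights


# Simulated confounded data: units with larger x are more likely to be treated.
data_rng = random.Random(1)
covar = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
treat = [1.0 if data_rng.random() < sigmoid(x) else 0.0 for x in covar]
weights = permutation_weights(treat, covar)
```

In practice one would likely average over several permutations and use a more flexible classifier; a single shuffle and a linear model are used here only to keep the sketch short.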