- Hate Speech and Cyberbullying Detection
- Topic Modeling
- Media Influence and Politics
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Computational and Text Analysis Methods
- Misinformation and Its Impacts
Czech Technical University in Prague
2023
Although media bias detection is a complex multi-task problem, there is, to date, no unified benchmark grouping these evaluation tasks. We introduce the Media Bias Identification Benchmark (MBIB), a comprehensive benchmark that groups different types of media bias (e.g., linguistic, cognitive, political) under a common framework to test how prospective detection techniques generalize. After reviewing 115 datasets, we select nine tasks and carefully propose 22 associated datasets for evaluating media bias detection techniques. We evaluate MBIB using...
Media bias detection poses a complex, multifaceted problem traditionally tackled using single-task models and small in-domain datasets, consequently lacking generalizability. To address this, we introduce MAGPIE, the first large-scale multi-task pre-training approach explicitly tailored for media bias detection. To enable pre-training at scale, we present the Large Bias Mixture (LBM), a compilation of 59 bias-related tasks. MAGPIE outperforms previous approaches in media bias detection on the Bias Annotation By Experts (BABE) dataset, with a relative...
High annotation costs from hiring or crowdsourcing complicate the creation of large, high-quality datasets needed for training reliable text classifiers. Recent research suggests using Large Language Models (LLMs) to automate the annotation process, reducing these costs while maintaining data quality. LLMs have shown promising results in annotating downstream tasks like hate speech detection and political framing. Building on the success in these areas, this study investigates whether LLMs are viable for annotating the complex task of media bias detection and whether a...