NFDI4DS | UHH-SEMS - Publication Details

Benjamin Roth

ORCID: 0000-0003-0362-0267

About

Contact & Profiles

A5046895021

Research Areas

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Machine Learning and Data Classification
Text and Document Classification Technologies
Explainable Artificial Intelligence (XAI)
Sentiment Analysis and Opinion Mining
Semantic Web and Ontologies
Speech and dialogue systems
Adversarial Robustness in Machine Learning
Multimodal Machine Learning Applications
Anomaly Detection Techniques and Applications
Biomedical Text Mining and Ontologies
Text Readability and Simplification
Advanced Graph Neural Networks
Data Quality and Management
Bayesian Modeling and Causal Inference
Software Engineering Research
Hate Speech and Cyberbullying Detection
Machine Learning in Healthcare
Domain Adaptation and Few-Shot Learning
Image Retrieval and Classification Techniques
Misinformation and Its Impacts
Scientific Computing and Data Management
Generative Adversarial Networks and Image Synthesis

Heidelberg University
2017-2025

University of Vienna
2021-2024

Friedrich-Alexander-Universität Erlangen-Nürnberg
2024

Faculty (United Kingdom)
2023

Ludwig-Maximilians-Universität München
2016-2021

German Cancer Research Center
2017-2020

DKFZ-ZMBH Alliance
2017

University of Massachusetts Amherst
2014-2016

Amherst College
2016

Consol Energy (United States)
2016

Compositional Vector Space Models for Knowledge Base Completion

OPENALEX - Publications

Arvind Neelakantan, Benjamin Roth, Andrew McCallum

Arvind Neelakantan, Benjamin Roth, Andrew McCallum. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015.

10.3115/v1/p15-1016 article EN 2015-01-01

Multilingual Relation Extraction using Compositional Universal Schema

OPENALEX - Publications

Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum

Patrick Verga, David Belanger, Emma Strubell, Benjamin Roth, Andrew McCallum. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.

10.18653/v1/n16-1103 article EN cc-by Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks

OPENALEX - Publications

Martin Schmitt, Simon Steinheber, Konrad Schreiber, Benjamin Roth

In this work, we propose a new model for aspect-based sentiment analysis. In contrast to previous approaches, we jointly model the detection of aspects and the classification of their polarity in an end-to-end trainable neural network. We conduct experiments with different neural architectures and word representations on the recent GermEval 2017 dataset. We were able to show considerable performance gains by using the joint modeling approach in all settings compared to pipeline approaches. The combination of a convolutional neural network and fasttext...

10.18653/v1/d18-1139 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Evaluating neural network explanation methods using hybrid documents and morphosyntactic agreement

OPENALEX - Publications

Nina Poerner, Hinrich Schütze, Benjamin Roth

The behavior of deep neural networks (DNNs) is hard to understand. This makes it necessary to explore post hoc explanation methods. We conduct the first comprehensive evaluation of explanation methods for NLP. To this end, we design two novel evaluation paradigms that cover two important classes of NLP problems: small context and large context problems. Both paradigms require no manual annotation and are therefore broadly applicable. We also introduce LIMSSE, an explanation method inspired by LIME that is designed for NLP. We show empirically that LIMSSE, LRP and DeepLIFT are the most effective explanation methods and recommend...

10.18653/v1/p18-1032 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

A survey of noise reduction methods for distant supervision

OPENALEX - Publications

Benjamin Roth, Tassilo Barth, Michael Wiegand, Dietrich Klakow

We survey recent approaches to noise reduction in distant supervision learning for relation extraction. We group them according to the principles they are based on: at-least-one constraints, topic-based models, or pattern correlations. Besides describing them, we illustrate the fundamental differences and attempt to give an outlook to potentially fruitful further research. In addition, we identify related work in sentiment analysis which could profit from noise reduction.

10.1145/2509558.2509571 article EN 2013-10-27

Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles

OPENALEX - Publications

Ye Xia, Pedro Araujo, Klim Zaporojets, Benjamin Roth

Calibration, the alignment between model confidence and prediction accuracy, is critical for the reliable deployment of large language models (LLMs). Existing works neglect to measure the generalization of their methods to other prompt styles and different sizes of LLMs. To address this, we define a controlled experimental setting covering 12 LLMs and four prompt styles. We additionally investigate if incorporating the response agreement of multiple LLMs and an appropriate loss function can improve calibration performance. Concretely,...

10.48550/arxiv.2501.03991 preprint EN arXiv (Cornell University) 2025-01-07

Knowledge Connector: Decision support system for multiomics-based precision oncology

OPENALEX - Publications

Daniel Hübschmann, Simon Kreutzfeldt, Benjamin Roth, Katrin Glocker, Janine Schoop, and 19 more

Precision cancer medicine aims to improve patient outcomes by providing individually tailored recommendations for clinical management based on the evaluation of biological disease profiles in multidisciplinary molecular tumor boards (MTBs). The quality of MTB decisions depends on the comprehensive, reliable, and reproducible interpretation of increasingly complex data. We developed and implemented, as part of a multicenter precision oncology program, Knowledge Connector (KC), a decision support system...

10.1101/2025.02.23.25322403 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2025-02-25

Comparing Convolutional Neural Networks to Traditional Models for Slot Filling

OPENALEX - Publications

Heike Adel, Benjamin Roth, Hinrich Schütze

We address relation classification in the context of slot filling, the task of finding and evaluating fillers like "Steve Jobs" for the slot X in "X founded Apple". We propose a convolutional neural network which splits the input sentence into three parts according to the relation arguments and compare it to state-of-the-art and traditional approaches of relation classification. Finally, we combine different methods and show that the combination is better than individual approaches. We also analyze the effect of genre differences on performance.

10.18653/v1/n16-1097 article EN cc-by Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

Combining Generative and Discriminative Model Scores for Distant Supervision

OPENALEX - Publications

Benjamin Roth, Dietrich Klakow

Distant supervision is a scheme to generate noisy training data for relation extraction by aligning entities of a knowledge base with text. In this work we combine the output of a discriminative at-least-one learner with that of a generative hierarchical topic model to reduce the noise in distant supervision data. The combination significantly increases the ranking quality of extracted facts and achieves state-of-the-art performance in an end-to-end setting. A simple linear interpolation of the model scores performs better than a parameter-free based...

10.18653/v1/d13-1003 article EN Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing 2013-01-01

Effective Slot Filling Based on Shallow Distant Supervision Methods

OPENALEX - Publications

Benjamin Roth, Tassilo Barth, Michael Wiegand, Mittul Singh, Dietrich Klakow

Spoken Language Systems at Saarland University (LSV) participated this year with 5 runs at the TAC KBP English slot filling track. Effective algorithms for all parts of the pipeline, from document retrieval to relation prediction and response post-processing, are bundled in a modular end-to-end relation extraction system called RelationFactory. The main run solely focuses on shallow techniques and achieved significant improvements over LSV's last year's system, while using the same training data and patterns....

10.48550/arxiv.1401.1158 preprint EN other-oa arXiv (Cornell University) 2014-01-01

Interpretable Question Answering on Knowledge Bases and Text

OPENALEX - Publications

Alona Sydorova, Nina Poerner, Benjamin Roth

Interpretability of machine learning (ML) models becomes more relevant with their increasing adoption. In this work, we address the interpretability of ML based question answering (QA) models on a combination of knowledge bases (KB) and text documents. We adapt post hoc explanation methods such as LIME and input perturbation (IP) and compare them with the self-explanatory attention mechanism of the model. For this purpose, we propose an automatic evaluation paradigm for explanation methods in the context of QA. We also conduct a study with human annotators to evaluate...

10.18653/v1/p19-1488 article EN cc-by 2019-01-01

Position-aware Self-attention with Relative Positional Encodings for Slot Filling

OPENALEX - Publications

Ivan Bilan, Benjamin Roth

This paper describes how to apply self-attention with relative positional encodings to the task of relation extraction. We propose to use the self-attention encoder layer together with an additional position-aware attention layer that takes into account positions of the query and the object in the sentence. The self-attention encoder also uses a custom implementation of relative positional encodings, which allow each word in the sentence to take into account its left and right context. The evaluation of the model is done on the TACRED dataset. The proposed model relies only on attention (no recurrent or convolutional layers are used), while improving...

10.48550/arxiv.1807.03052 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Interpretable Textual Neuron Representations for NLP

OPENALEX - Publications

Nina Poerner, Benjamin Roth, Hinrich Schütze

Input optimization methods, such as Google Deep Dream, create interpretable representations of neurons for computer vision DNNs. We propose and evaluate ways of transferring this technology to NLP. Our results suggest that gradient ascent with a gumbel softmax layer produces n-gram representations that outperform naive corpus search in terms of target neuron activation. The representations highlight differences in syntax awareness between the language and visual models of the Imaginet architecture.

10.18653/v1/w18-5437 article EN cc-by 2018-01-01

RelationFactory: A Fast, Modular and Effective System for Knowledge Base Population

OPENALEX - Publications

Benjamin Roth, Tassilo Barth, Grzegorz Chrupała, Martin Gropp, Dietrich Klakow

Benjamin Roth, Tassilo Barth, Grzegorz Chrupała, Martin Gropp, Dietrich Klakow. Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 2014.

10.3115/v1/e14-2023 article EN cc-by 2014-01-01

Joint Bootstrapping Machines for High Confidence Relation Extraction

OPENALEX - Publications

Pankaj Gupta, Benjamin Roth, Hinrich Schütze

Pankaj Gupta, Benjamin Roth, Hinrich Schütze. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1003 article EN cc-by 2018-01-01

Feature-based models for improving the quality of noisy training data for relation extraction

OPENALEX - Publications

Benjamin Roth, Dietrich Klakow

Supervised relation extraction from text relies on annotated data. Distant supervision is a scheme to obtain noisy training data by using a knowledge base of relational tuples as the ground truth and finding entity pair matches in a text corpus. We propose and evaluate two feature-based models for increasing the quality of distant supervision patterns.

10.1145/2505515.2507850 article EN 2013-01-01

SepLL: Separating Latent Class Labels from Weak Supervision Noise

OPENALEX - Publications

Andreas Stephan, Vasiliki Kougia, Benjamin Roth

In the weakly supervised learning paradigm, labeling functions automatically assign heuristic, often noisy, labels to data samples. In this work, we provide a method for learning from weak labels by separating two types of complementary information associated with the labeling functions: information related to the target label and information specific to one labeling function only. Both types of information are reflected to different degrees by all labeled instances. In contrast to previous works that aimed at correcting or removing wrongly labeled instances, we learn a branched deep model that uses all data as-is, but...

10.18653/v1/2022.findings-emnlp.288 article EN cc-by 2022-01-01

Text-Guided Image Clustering

OPENALEX - Publications

Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Béla Gipp, and 2 more

Image clustering divides a collection of images into meaningful groups, typically interpreted post-hoc via human-given annotations. Those are usually in the form of text, begging the question of using text as an abstraction for image clustering. Current image clustering methods, however, neglect the use of generated textual descriptions. We, therefore, propose Text-Guided Image Clustering, i.e., generating text using image captioning and visual question-answering (VQA) models and subsequently clustering the generated text. Further, we introduce a novel approach to inject task-...

10.48550/arxiv.2402.02996 preprint EN arXiv (Cornell University) 2024-02-05
