NFDI4DS | UHH-SEMS - Publication Details

Benjamin Van Durme

ORCID: 0000-0003-4328-4288

About

Contact & Profiles

OpenAlex ID: A5075825791

Research Areas

Natural Language Processing Techniques
Topic Modeling
Multimodal Machine Learning Applications
Text Readability and Simplification
Semantic Web and Ontologies
Advanced Text Analysis Techniques
Speech and dialogue systems
Speech Recognition and Synthesis
Artificial Intelligence in Law
Software Engineering Research
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Biomedical Text Mining and Ontologies
Authorship Attribution and Profiling
Algorithms and Data Compression
Web Data Mining and Analysis
Spam and Phishing Detection
Text and Document Classification Technologies
Explainable Artificial Intelligence (XAI)
Data Quality and Management
Sentiment Analysis and Opinion Mining
Legal Education and Practice Innovations
Multi-Agent Systems and Negotiation
Music and Audio Processing
Intelligent Tutoring Systems and Adaptive Learning

Johns Hopkins University
2015-2024

Microsoft Research (United Kingdom)
2024

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

Bryn Mawr College
2023

American Jewish Committee
2023

Stony Brook University
2023

University of North Carolina Health Care
2023

University of North Carolina at Chapel Hill
2023

Hypothesis Only Baselines in Natural Language Inference

OPENALEX - Publications

Adam Poliak Jason Naradowsky Aparajita Haldar Rachel Rudinger Benjamin Van Durme

We propose a hypothesis-only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on 10 distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets. Our...

10.18653/v1/s18-2023 article EN cc-by 2018-01-01

Information Extraction over Structured Data: Question Answering with Freebase

Xuchen Yao Benjamin Van Durme

Answering natural language questions using the Freebase knowledge base has recently been explored as a platform for advancing the state of the art in open domain semantic parsing. Those efforts map questions to sophisticated meaning representations that are then attempted to be matched against viable answer candidates in the knowledge base. Here we show that relatively modest information extraction techniques, when paired with a webscale corpus, can outperform these sophisticated approaches by roughly 34% relative gain.

10.3115/v1/p14-1090 article EN cc-by 2014-01-01

Gender Bias in Coreference Resolution

Rachel Rudinger Jason Naradowsky Brian Leonard Benjamin Van Durme

Rachel Rudinger, Jason Naradowsky, Brian Leonard, Benjamin Van Durme. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018.

10.18653/v1/n18-2002 article EN cc-by 2018-01-01

What do you learn from context? Probing for sentence structure in contextualized word representations

Ian Tenney Patrick Xia Berlin Chen Alex Wang Adam Poliak and 6 more

Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work, we introduce a novel edge probing task design and construct a broad suite of sub-sentence tasks derived from the traditional structured NLP pipeline. We probe word-level contextual representations from four recent models and investigate how they encode sentence structure across a range of syntactic, semantic, local,...

10.48550/arxiv.1905.06316 preprint EN other-oa arXiv (Cornell University) 2019-01-01

PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification

Ellie Pavlick Pushpendre Rastogi Juri Ganitkevitch Benjamin Van Durme Chris Callison-Burch

Ellie Pavlick, Pushpendre Rastogi, Juri Ganitkevitch, Benjamin Van Durme, Chris Callison-Burch. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2070 article EN cc-by 2015-01-01

ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension

Sheng Zhang Xiaodong Liu Jun Liu Jianfeng Gao Kevin Duh and 1 more

We present a large-scale dataset, ReCoRD, for machine reading comprehension requiring commonsense reasoning. Experiments on this dataset demonstrate that the performance of state-of-the-art MRC systems falls far behind human performance. ReCoRD represents a challenge for future research to bridge the gap between human and machine commonsense reading comprehension. ReCoRD is available at http://nlp.jhu.edu/record.

10.48550/arxiv.1810.12885 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Open Domain Targeted Sentiment

Margaret Mitchell Jacqui Aguilar Theresa Wilson Benjamin Van Durme

We propose a novel approach to sentiment analysis for a low resource setting. The intuition behind this work is that sentiment expressed towards an entity, targeted sentiment, may be viewed as a span of sentiment expressed across the entity. This representation allows us to model sentiment detection as a sequence tagging problem, jointly discovering people and organizations along with whether there is sentiment directed towards them. We compare performance in both Spanish and English on microblog data, using only a sentiment lexicon as an external resource. By leveraging...

10.18653/v1/d13-1171 article EN Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing 2013-01-01

Constrained Language Models Yield Few-Shot Semantic Parsers

Richard Shin Christopher Lin Sam Thomson Charles Chen Subhro Roy and 5 more

Richard Shin, Christopher Lin, Sam Thomson, Charles Chen, Subhro Roy, Emmanouil Antonios Platanios, Adam Pauls, Dan Klein, Jason Eisner, Benjamin Van Durme. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021.

10.18653/v1/2021.emnlp-main.608 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

Can GPT-3 Perform Statutory Reasoning?

Andrew Blair-Stanek Nils Holzenberger Benjamin Van Durme

Statutory reasoning is the task of reasoning with facts and statutes, which are rules written in natural language by a legislature. It is a basic legal skill. In this paper we explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA. We consider a variety of approaches, including dynamic few-shot prompting, chain-of-thought prompting, and zero-shot prompting. While we achieve results with GPT-3 that are better than the previous best published results, we also identify several types...

10.1145/3594536.3595163 article EN 2023-06-19

Reporting bias and knowledge acquisition

Jonathan Gordon Benjamin Van Durme

Much work in knowledge extraction from text tacitly assumes that the frequency with which people write about actions, outcomes, or properties is a reflection of real-world frequencies or the degree to which a property is characteristic of a class of individuals. In this paper, we question this idea, examining the phenomenon of reporting bias and the challenge it poses for knowledge extraction. We conclude with a discussion of approaches to learning commonsense knowledge from text despite this distortion.

10.1145/2509558.2509563 article EN 2013-10-27

Universal Decompositional Semantics on Universal Dependencies

Aaron Steven White Drew Reisinger Keisuke Sakaguchi Tim Vieira Sheng Zhang and 3 more

Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016.

10.18653/v1/d16-1177 article EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

Efficient spoken term discovery using randomized algorithms

Aren Jansen Benjamin Van Durme

Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n^2) nature of the search space. Recent strategies have attempted to improve search efficiency using...

10.1109/asru.2011.6163965 article EN 2011-12-01

Ordinal Common-sense Inference

Sheng Zhang Rachel Rudinger Kevin Duh Benjamin Van Durme

Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural...

10.1162/tacl_a_00068 article EN cc-by Transactions of the Association for Computational Linguistics 2017-12-01

Multi-Sentence Argument Linking

Seth Ebner Patrick Xia Ryan Culkin Kyle Rawlins Benjamin Van Durme

We present a novel document-level model for finding argument spans that fill an event’s roles, connecting related ideas in sentence-level semantic role labeling and coreference resolution. Because existing datasets for cross-sentence linking are small, development of our neural model is supported through the creation of a new resource, Roles Across Multiple Sentences (RAMS), which contains 9,124 annotated events across 139 types. We demonstrate strong performance of our model on RAMS and other event-related datasets.

10.18653/v1/2020.acl-main.718 article EN cc-by 2020-01-01

Inferring User Political Preferences from Streaming Communications

Svitlana Volkova Glen Coppersmith Benjamin Van Durme

Existing models for social media personal analytics assume access to thousands of messages per user, even though most users author content only sporadically over time. Given this sparsity, we: (i) leverage content from the local neighborhood of a user; (ii) evaluate batch models as a function of size and the amount of messages in various types of neighborhoods; and (iii) estimate the amount of time and tweets required for a dynamic model to predict user preferences. We show that when limited or no self-authored data is available, language from friend, retweet and mention...

10.3115/v1/p14-1018 article EN 2014-01-01

Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation

Adam Poliak Aparajita Haldar Rachel Rudinger J. Edward Hu Ellie Pavlick and 2 more

Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

10.18653/v1/d18-1007 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

J. Edward Hu Huda Khayrallah Ryan Culkin Patrick Xia Tongfei Chen and 2 more

J. Edward Hu, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, Benjamin Van Durme. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1090 article EN 2019-01-01

Social Bias in Elicited Natural Language Inferences

Rachel Rudinger Chandler May Benjamin Van Durme

We analyze the Stanford Natural Language Inference (SNLI) corpus in an investigation of bias and stereotyping in NLP data. The SNLI human-elicitation protocol makes it prone to amplifying bias and stereotypical associations, which we demonstrate statistically (using pointwise mutual information) and with qualitative examples.

10.18653/v1/w17-1609 article EN cc-by 2017-01-01

AMR Parsing as Sequence-to-Graph Transduction

Sheng Zhang Xutai Ma Kevin Duh Benjamin Van Durme

We propose an attention-based model that treats AMR parsing as sequence-to-graph transduction. Unlike most AMR parsers that rely on pre-trained aligners, external semantic resources, or data augmentation, our proposed parser is aligner-free, and it can be effectively trained with limited amounts of labeled AMR data. Our experimental results outperform all previously reported SMATCH scores, on both AMR 2.0 (76.3% on LDC2017T10) and AMR 1.0 (70.2% on LDC2014T12).

10.18653/v1/p19-1009 preprint EN cc-by 2019-01-01

Probing What Different NLP Tasks Teach Machines about Function Word Comprehension

Najoung Kim Roma Patel Adam Poliak Patrick Xia Alex Wang and 7 more

Najoung Kim, Roma Patel, Adam Poliak, Patrick Xia, Alex Wang, Tom McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick. Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019). 2019.

10.18653/v1/s19-1026 article EN cc-by 2019-01-01

Script Induction as Language Modeling

Rachel Rudinger Pushpendre Rastogi Francis Ferraro Benjamin Van Durme

The narrative cloze is an evaluation metric commonly used for work on automatic script induction. While prior work in this area has focused on count-based methods from distributional semantics, such as pointwise mutual information, we argue that the narrative cloze can be productively reframed as a language modeling task. By training a discriminative language model for this task, we attain improvements of up to 27 percent over prior methods on standard narrative cloze metrics.

10.18653/v1/d15-1195 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01
