- Natural Language Processing Techniques
- Topic Modeling
- Data Quality and Management
- Advanced Graph Neural Networks
- Advanced Combinatorial Mathematics
- Random Matrices and Applications
- Speech Recognition and Synthesis
- Multimodal Machine Learning Applications
- Advanced Algebra and Geometry
- Language Development and Disorders
- Matrix Theory and Algorithms
- Graph Theory and CDMA Systems
- Mathematics and Applications
- Music and Audio Processing
- Language and Cultural Evolution
- Advanced Topics in Algebra
- Neurobiology of Language and Bilingualism
- Machine Learning and Algorithms
- Semantic Web and Ontologies
Amazon (United Kingdom)
2021-2023
Indian Statistical Institute
2020-2022
Amazon (United States)
2021
While models have reached superhuman performance on popular question answering (QA) datasets such as SQuAD, they have yet to outperform humans on the task of question answering itself. In this paper, we investigate whether models are learning reading comprehension from QA datasets by evaluating BERT-based models across five datasets. We evaluate their generalizability to out-of-domain examples, their responses to missing or incorrect data, and their ability to handle question variations. We find that no single dataset is robust to all of our experiments, and we identify shortcomings in both...
We introduce Mintaka, a complex, natural, and multilingual dataset designed for experimenting with end-to-end question-answering models. Mintaka is composed of 20,000 question-answer pairs collected in English, annotated with Wikidata entities, and translated into Arabic, French, German, Hindi, Italian, Japanese, Portuguese, and Spanish, for a total of 180,000 samples. Mintaka includes 8 types of complex questions, including superlative, intersection, and multi-hop questions, which were naturally elicited from crowd workers. We run baselines...
Large language models (LLMs) have shown impressive abilities to reason over input text; however, they are prone to hallucinations. On the other hand, end-to-end knowledge graph question answering (KGQA) models output responses grounded in facts, but they still struggle with complex reasoning, such as comparison or ordinal questions. In this paper, we propose a new method in which we combine a retriever based on an end-to-end KGQA model with an LLM that reasons over the retrieved facts to return an answer. We observe that augmenting prompts with KG facts improves...
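The abstract above is truncated before any implementation detail, so as a minimal, hypothetical sketch (the function name, prompt wording, and triple format are illustrative assumptions, not the paper's API), augmenting an LLM prompt with retrieved KG facts could look like:

```python
def augment_prompt(question, facts):
    """Prepend retrieved KG triples to a question before calling an LLM.

    `facts` is a list of (subject, relation, object) triples returned by
    some retriever; grounding the prompt in these facts is what the
    abstract reports as helping against hallucination.
    """
    fact_lines = "\n".join(f"{s} -- {r} -> {o}" for s, r, o in facts)
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{fact_lines}\n"
        f"Question: {question}\nAnswer:"
    )

prompt = augment_prompt(
    "Who directed Heat?",
    [("Heat", "director", "Michael Mann")],
)
```

The resulting string would then be sent to any instruction-tuned LLM; the retriever itself (here assumed to exist) is the KGQA model described in the abstract.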
Recently, end-to-end (E2E) trained models for question answering over knowledge graphs (KGQA) have delivered promising results using only a weakly supervised dataset. However, these models are trained and evaluated in a setting where hand-annotated question entities are supplied to the model, leaving the important and non-trivial task of entity resolution (ER) outside the scope of E2E learning. In this work, we extend the boundaries of E2E learning for KGQA to include the training of an ER component. Our model only needs the question text and the answer to train, and delivers a stand-alone QA...
End-to-end question answering using a differentiable knowledge graph is a promising technique that requires only weak supervision, produces interpretable results, and is fully differentiable. Previous implementations of this technique (Cohen et al., 2020) have focused on single-entity questions using a relation-following operation. In this paper, we propose a model that explicitly handles multiple-entity questions by implementing a new intersection operation, which identifies the shared elements between two sets of entities. We find...
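In differentiable-KG models of the Cohen et al. (2020) style, an entity set is represented as a vector of scores over the entity vocabulary. The abstract does not give the exact formulation of the new intersection operation, but as a minimal sketch under that representation, one common differentiable choice is an elementwise product of the two score vectors (all names and values below are illustrative):

```python
import numpy as np

# Toy entity vocabulary: scores over 5 entities for two sub-questions,
# e.g. "movies starring A" and "movies starring B".
set_a = np.array([0.9, 0.1, 0.8, 0.0, 0.3])
set_b = np.array([0.7, 0.9, 0.6, 0.2, 0.0])

def soft_intersection(a, b):
    """Soft set intersection: elementwise product of entity scores.

    An entity scored highly by both inputs stays highly scored; an
    entity absent from either set is suppressed. The operation is
    differentiable in both arguments.
    """
    return a * b

scores = soft_intersection(set_a, set_b)
answer = int(np.argmax(scores))  # entity 0 scores highly in both sets
```

The product form keeps the whole pipeline differentiable, so the intersection can sit between relation-following steps and still be trained end-to-end with only weak supervision.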
The context window of large language models (LLMs) has been extended significantly in recent years. However, while the length of context that an LLM can process has grown, the capability of the model to accurately reason over that context degrades noticeably. This occurs because modern LLMs often become overwhelmed by the vast amount of information in the context; when answering questions, the model must identify and reason over relevant evidence sparsely distributed throughout the text. To alleviate the challenge of long-context reasoning, we develop a retrieve-then-reason...
Collecting training data for semantic parsing is a time-consuming and expensive task. As a result, there is growing interest in industry to reduce the number of annotations required to train a semantic parser, both to cut down on costs and to limit the customer data handled by annotators. In this paper, we propose uncertainty and traffic-aware active learning, a novel active learning method that uses model confidence and utterance frequencies from traffic to select utterances for annotation. We show that our method significantly outperforms baselines on an internal...
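The truncated abstract names the two signals (model confidence and traffic frequency) but not how they are combined. As an illustrative sketch only (the product-of-uncertainty-and-log-frequency form and all example utterances are assumptions, not the paper's scoring function), an acquisition score over those two signals might look like:

```python
import math

def acquisition_score(confidence, frequency):
    """Rank utterances for annotation.

    Prefers utterances the parser is unsure about (low confidence)
    that also occur often in traffic (high frequency), so annotation
    effort goes where it affects the most customer requests.
    """
    uncertainty = 1.0 - confidence
    return uncertainty * math.log1p(frequency)

# Candidate utterances: (text, parser confidence, count in traffic)
candidates = [
    ("play my workout playlist", 0.95, 50_000),
    ("skip to the bridge of this song", 0.40, 3_000),
    ("what's the weather", 0.99, 80_000),
]
ranked = sorted(candidates, key=lambda c: -acquisition_score(c[1], c[2]))
selected = ranked[0][0]  # the low-confidence, still-frequent utterance
```

A pure-uncertainty baseline would pick the same utterance here, but weighting by traffic is what prevents the selector from spending annotation budget on rare, low-impact tail utterances.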
Patterned random matrices, such as the reverse circulant, the symmetric Toeplitz, and the Hankel matrices, and their almost sure limiting spectral distributions (LSD), have attracted much attention. Under the assumption that the entries are taken from an i.i.d. sequence with finite variance, the LSDs are tied together by a common thread -- the $2k$th moment of the limit equals a weighted sum over different types of pair-partitions of the set $\{1, 2, \ldots, 2k\}$ and is universal. Some results are also known for the sparse case. In this paper we generalise these...
The scaled standard Wigner matrix (symmetric, with mean zero, variance one i.i.d. entries) and its limiting eigenvalue distribution, namely the semi-circular law, have attracted much attention. The $2k$th moment of the limit equals the number of non-crossing pair-partitions of the set $\{1, 2, \ldots, 2k\}$. There are several extensions of this result in the literature. In this paper, we consider a unifying extension which also yields additional results. Suppose the matrix is symmetric, where the entries are independently distributed. We show...
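The moment statement in the abstract can be written out explicitly: the number of non-crossing pair-partitions of $\{1, 2, \ldots, 2k\}$ is the $k$th Catalan number, so for the standard semi-circular law

```latex
% 2k-th moment of the standard semi-circular law on [-2, 2]:
\int_{-2}^{2} x^{2k}\,\frac{1}{2\pi}\sqrt{4 - x^{2}}\,dx
  \;=\; C_k \;=\; \frac{1}{k+1}\binom{2k}{k},
```

while all odd moments vanish by symmetry. For example, $C_1 = 1$ and $C_2 = 2$ give variance $1$ and fourth moment $2$, against $3$ for a Gaussian.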
Speech disfluencies are prevalent in spontaneous speech. The rising popularity of voice assistants presents a growing need to handle naturally occurring disfluencies. Semantic parsing is a key component for understanding user utterances in voice assistants, yet most semantic parsing research to date focuses on written text. In this paper, we investigate semantic parsing of disfluent speech with the ATIS dataset. We find that a state-of-the-art semantic parser does not seamlessly handle disfluencies. We experiment with adding real and synthetic disfluencies at training time only...