Md Rashad Al Hasan Rony, Liubov Kovriguina, Debanjan Chaudhuri, Ricardo Usbeck, Jens Lehmann. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.
We present GERBIL, an evaluation framework for semantic entity annotation. The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces that allow for the agile, fine-grained and uniform evaluation of annotation tools on multiple datasets. By these means, we aim to ensure that both tool developers and end users can derive meaningful insights pertaining to the extension, integration and use of annotation applications. In particular, GERBIL provides comparable results so as to allow them to easily discover strengths and weaknesses...
The ability to compare systems from the same domain is of central importance for their introduction into complex applications. In the domains of named entity recognition and linking, the large number of orthogonal evaluations w.r.t. measures and datasets has led to an unclear landscape regarding the abilities and weaknesses of the different approaches. We present GERBIL – an improved platform for repeatable, storable and citable semantic annotation experiments – and its extension since its release. GERBIL has narrowed this gap by generating concise,...
Daniel Gillick, Sayali Kulkarni, Larry Lansing, Alessandro Presta, Jason Baldridge, Eugene Ie, Diego Garcia-Olano. Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 2019.
It is crucial for the success of a search-driven web application to answer users' queries in the best possible way. A common approach is to use click models for guessing the relevance of search results. However, these models are imprecise and waive valuable information that one can gain from non-click user interactions. We introduce TellMyRelevance! – a novel automatic end-to-end pipeline for tracking cursor interactions at the client, analyzing them and learning according relevance models. Yet, the models depend on the layout of the results page involved, which makes...
Over the past years, several challenges and calls for research projects have pointed out the dire need for pushing natural language interfaces forward. In this context, the importance of Semantic Web data as a premier knowledge source is rapidly increasing. But we are still far from having accurate natural language interfaces that allow handling complex information needs in a user-centric and highly performant manner. The development of such interfaces requires the collaboration of a range of different fields, including natural language processing, information extraction, knowledge base...
Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts to transform language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud nowadays gathering over 200 resources. With this increasing growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and the linguistic features to take into account. This special issue aims to review and summarize the progress...
It is our great pleasure to welcome you to the WWW 2018 Challenges Track. This is the first time that the conference includes such a track, whose aim was to showcase the maturity of the state of the art on tasks common to the Web community and adjacent academic communities, in a controlled setting of rigorous evaluation. Through an open call for challenge organisation, we also wanted to see which open questions might be seen as most relevant to this community today, and how groups of researchers come together around shared resources (e.g. datasets) to address those questions hands-on,...
A question answering system is the most promising way of retrieving data from an available knowledge base so that end users get an appropriate result for their questions. Many systems convert questions into triples that are mapped against the knowledge base, from which the answer is derived. However, these systems do not express the semantic representation of the question, due to which answers cannot be located. To handle this, a template-based approach is proposed that classifies question types and finds a SPARQL query template for each type, including comparatives and superlatives. The built...
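As an illustration of what such a type-to-template mapping could look like, here is a minimal sketch for a superlative question; the template structure and the DBpedia IRIs used in the example are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of mapping a superlative question type to a SPARQL template.
# The template structure and the DBpedia IRIs below are illustrative assumptions.
SUPERLATIVE_TEMPLATE = """
SELECT ?entity WHERE {{
  ?entity a <{entity_class}> ;
          <{ranking_property}> ?value .
}}
ORDER BY DESC(?value)
LIMIT 1
"""

def build_superlative_query(entity_class: str, ranking_property: str) -> str:
    """Fill the template with the class and property identified for the question."""
    return SUPERLATIVE_TEMPLATE.format(
        entity_class=entity_class,
        ranking_property=ranking_property,
    )

if __name__ == "__main__":
    # "Which country has the highest population?" (placeholder IRIs)
    print(build_superlative_query(
        "http://dbpedia.org/ontology/Country",
        "http://dbpedia.org/ontology/populationTotal",
    ))
```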
Drawing on fragments of the commentary on our collective translation „Grenzen übersetzen, translatorisches Handeln, Brückensprachen schaffen" [Translating borders, acting translationally, creating bridge languages] of Gloria Anzaldúa's Borderlands/La Frontera: The New Mestiza (1987), we aim to show the impact of translation as a form of articulation that declares itself responsible for the construction of meaning and subjectivity in the translated text; at the same time, we expose our strategies and explain some of the processes...
This dataset is a collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to provide easy-to-use access to them.
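A minimal sketch of how such a collection could be accessed with the huggingface datasets library; the repository identifier and configuration name below are placeholders, since the record does not state them.

```python
# Sketch: loading one dataset from the collection with the huggingface
# `datasets` library. The repository id and config name are illustrative
# placeholders, not taken from the record.
from datasets import load_dataset

kgqa = load_dataset("some-org/kgqa-collection", "lcquad2", split="train")

# Each record typically pairs a natural-language question with its
# SPARQL query and/or answer set.
print(kgqa[0])
```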
This is the RDF dump of DBLP released on August 1, 2022. The DBLP RDF dump is published to allow fair and replicable evaluation of KGQA systems with the DBLP-QuAD dataset.
Ex-Post-Facto Analysis of DBpedia chatbot: http://chat.dbpedia.org/
WD50K dataset: A hyper-relational dataset derived from Wikidata statements. The dataset is constructed by the following procedure based on the [Wikidata RDF dump](https://dumps.wikimedia.org/wikidatawiki/20190801/) of August 2019: - A set of seed nodes corresponding to entities from FB15K-237 having a direct mapping in Wikidata (P646 "Freebase ID") is extracted from the dump. - For each seed node, all statements whose main object and qualifier values correspond to wikibase:Item are...
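A rough sketch of the seed-node selection step, expressed as a SPARQL query against a Wikidata endpoint rather than the dump itself; the use of the public endpoint, SPARQLWrapper, and the example Freebase MIDs are assumptions for illustration only.

```python
# Sketch of the seed-node step: find Wikidata items whose Freebase ID (P646)
# matches an FB15K-237 entity. Querying the public endpoint is only an
# illustration; the described procedure operates directly on the 2019 RDF dump.
from SPARQLWrapper import SPARQLWrapper, JSON

fb15k237_mids = ["/m/02mjmr", "/m/09c7w0"]  # example Freebase MIDs (illustrative)
values = " ".join(f'"{mid}"' for mid in fb15k237_mids)

sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
sparql.setQuery(f"""
SELECT ?item ?freebaseId WHERE {{
  ?item wdt:P646 ?freebaseId .
  VALUES ?freebaseId {{ {values} }}
}}
""")
sparql.setReturnFormat(JSON)

bindings = sparql.query().convert()["results"]["bindings"]
seed_nodes = [b["item"]["value"] for b in bindings]
print(seed_nodes)
```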
This data dump of Wikidata is published to allow fair and replicable evaluation of KGQA systems with the QALD-10 benchmark. QALD-10 is newly released and was used in the QALD-10 Challenge. Anyone interested in evaluating their KGQA systems with QALD-10 can download this dump and set up a local Wikidata endpoint on their server.
The repo contains our final submission's refactored code for the Rich Context Competition. On a high level, it contains modules for three different components of the pipeline:
- Preprocessing
- Dataset identification using a combination of simple dataset search and CRFs based on Rasa-NLU
- Identification of methods and fields using vectorization and cosine similarity (a minimal sketch follows below)
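The following sketch shows the general idea behind the similarity-based matching step; the vectorizer choice, label list and snippet are assumptions, and the repository's actual implementation may differ in detail.

```python
# Sketch: match a text snippet to the closest method/field label via TF-IDF
# vectorization and cosine similarity (scikit-learn). Labels and snippet are
# illustrative, not taken from the repository.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

method_labels = ["logistic regression", "random forest", "survey sampling"]
snippet = "We fit a regularized logistic regression to the survey responses."

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(method_labels + [snippet])

# Similarity of the snippet (last row) against every label row.
scores = cosine_similarity(matrix[-1:], matrix[:-1]).ravel()
print(method_labels[scores.argmax()], scores)
```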
In this work we create a question answering dataset over the DBLP scholarly knowledge graph (KG). DBLP is an on-line reference for bibliographic information on major computer science publications that indexes over 4.4 million publications, published by more than 2.2 million authors. Our dataset consists of 10,000 question answer pairs with the corresponding SPARQL queries which can be executed over the DBLP KG to fetch the correct answer. To the best of our knowledge, this is the first QA...
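A brief sketch of how one of the dataset's SPARQL queries might be executed via the standard SPARQL protocol; the endpoint URL, schema prefix and author IRI below are illustrative assumptions, and any endpoint loaded with the DBLP RDF dump should behave analogously.

```python
# Sketch: execute one DBLP-QuAD-style SPARQL query over the DBLP KG via HTTP.
# The endpoint URL and the author IRI are illustrative assumptions.
import requests

ENDPOINT = "https://sparql.dblp.org/sparql"  # assumed endpoint
QUERY = """
PREFIX dblp: <https://dblp.org/rdf/schema#>
SELECT (COUNT(?paper) AS ?n) WHERE {
  ?paper dblp:authoredBy <https://dblp.org/pid/00/0000> .  # placeholder author IRI
}
"""

resp = requests.get(
    ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["results"]["bindings"][0]["n"]["value"])
```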
This is the first public release of the RICardo dataset under the licence ODbL v1.0. This dataset is precisely described under the data package format. This release includes 368,871 bilateral or total trade flows from 1787 to 1938 for 373 reporting entities. It also contains the python scripts used to compile and filter the flows that fuel our exploratory data analysis online tool.
This material constitutes the software tools used in the analysis submitted for publication in JGR: Planets, "Convective vortices and dust devils detected and characterized by Mars 2020" by Hueso et al. The manuscript was submitted to JGR: Planets on 3 August 2022.
Dataset for the study "Antimicrobial susceptibility testing reveals reduced susceptibility to azithromycin and other antibiotics in Legionella pneumophila serogroup 1 isolates from Portugal"
color() now maintains the names of the input vector, allowing plot() to use the color names rather than the hex values when label = TRUE. You can also provide label with a custom set of color labels. Unnamed colors are labelled with their hex values (@gadenbuie, #27). Printing color objects is now powered by {cli}, which has superseded {crayon} (jack-davison, #28).
In the face of continuously changing contextual conditions and ubiquitous disruptive crisis events, the concept of resilience refers to some of the most urgent, challenging, and interesting issues of today's society. Economic value networks, technical infrastructures, health systems, and social textures alike need to unfold capacities to withstand, adapt, recover, or even refine and transform themselves to stay ahead of changes.
Dialogue systems for interaction with humans have been enjoying increased popularity in research and industry. To this day, the best way to estimate their success is by means of human evaluation rather than automated approaches, despite the abundance of work done in the field. In this paper, we investigate the effectiveness of perceiving dialogue evaluation as an anomaly detection task. The paper looks into four dialogue modeling approaches and how their objective functions...
Knowledge graphs are increasingly used in a plethora of downstream tasks or in the augmentation of statistical models to improve factuality. However, social biases are engraved in these representations and propagate downstream. We conducted a critical analysis of the literature concerning biases at different steps of a knowledge graph lifecycle. We investigated factors introducing bias, as well as the biases that are rendered by knowledge graphs and their embedded versions afterward. Limitations...
The past years have seen a growing amount of research on question answering (QA) over Semantic Web data, shaping an interaction paradigm that allows end users to profit from the expressive power of Semantic Web standards while, at the same time, hiding their complexity behind an intuitive and easy-to-use interface. On the other hand, the growing amount of data has led to a heterogeneous data landscape where QA systems struggle to keep up with the volume, variety and veracity of the...
With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows. We devise five different levels (of increasing complexity) of platform interoperability that we suggest implementing in a wider federation of AI/LT platforms. We illustrate the approach using the five emerging AI/LT platforms AI4EU, ELG, Lynx, QURATOR and SPEAKER.
The automatic evaluation of open-domain dialogues remains a largely unsolved challenge. Despite the abundance of work done in the field, human judges still have to evaluate dialogue quality. As a consequence, performing such evaluations at scale is usually expensive. This work investigates using a deep-learning model trained on the General Language Understanding Evaluation (GLUE) benchmark to serve as a quality indicator for open-domain dialogues. The aim is to use the various GLUE tasks as...
In the realm of Machine Learning and Deep Learning, there is a need for high-quality annotated data to train and evaluate supervised models. An extensive number of annotation tools have been developed to facilitate the data labelling process. However, finding the right tool is a demanding task involving thorough searching and testing. Hence, to effectively navigate the multitude of tools, it becomes essential to ensure their findability, accessibility, interoperability, and reusability...
