Michael Schlichtkrull

ORCID: 0000-0002-8666-0856
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Graph Neural Networks
  • Multimodal Machine Learning Applications
  • Explainable Artificial Intelligence (XAI)
  • Misinformation and Its Impacts
  • Sentiment Analysis and Opinion Mining
  • Text Readability and Simplification
  • Advanced Text Analysis Techniques
  • Business Process Modeling and Analysis
  • Biomedical Text Mining and Ontologies
  • Semantic Web and Ontologies
  • Data Quality and Management
  • Public Relations and Crisis Communication
  • Software Engineering Research
  • Web Application Security Vulnerabilities
  • Multi-Agent Systems and Negotiation
  • Bayesian Modeling and Causal Inference
  • Adversarial Robustness in Machine Learning
  • Cardiac Valve Diseases and Treatments
  • Scientific Computing and Data Management
  • Digital Media Forensic Detection
  • Authorship Attribution and Profiling
  • Logic, programming, and type systems
  • Digital Communication and Language

University of Cambridge
2021-2023

University of Amsterdam
2017-2022

University of Edinburgh
2022

PRG S&Tech (South Korea)
2021

University of Copenhagen
2015-2017

Technical University of Denmark
2013

Fact-checking has become increasingly important due to the speed with which both information and misinformation can spread in the modern media ecosystem. Therefore, researchers have been exploring how fact-checking can be automated, using techniques based on natural language processing, machine learning, knowledge representation, and databases to automatically predict the veracity of claims. In this paper, we survey automated fact-checking stemming from natural language processing, and discuss its connections to related tasks and disciplines. In this process,...

10.1162/tacl_a_00454 article EN cc-by Transactions of the Association for Computational Linguistics 2022-01-01

Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest (e.g., Yago, DBPedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes). R-GCNs are...

10.48550/arxiv.1703.06103 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems take natural language questions as input and return natural language answers. The aim of the competition was to build systems that can predict correct answers while also satisfying strict on-disk memory budgets. These memory budgets were designed to encourage contestants to explore the trade-off between storing retrieval corpora or the parameters of learned models. In this report, we describe the motivation and organization of the competition, review the best...

10.48550/arxiv.2101.00133 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Schlichtkrull, Sonal Gupta, Yashar Mehdad, Scott Yih. Findings of the Association for Computational Linguistics: NAACL 2022.

10.18653/v1/2022.findings-naacl.115 article EN cc-by Findings of the Association for Computational Linguistics: NAACL 2022 2022-01-01

Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models. However, there has been little work on interpreting them, and specifically on understanding which parts of the graphs (e.g. syntactic trees or co-reference structures) contribute to a prediction. In this work, we introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges. Given a trained GNN model, we learn a simple classifier that, for every edge in every layer, predicts if that...

10.48550/arxiv.2010.00577 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Rami Aly, Zhijiang Guo, Michael Sejr Schlichtkrull, James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Oana Cocarascu, Arpit Mittal. Proceedings of the Fourth Workshop on Fact Extraction and VERification (FEVER). 2021.

10.18653/v1/2021.fever-1.1 article EN cc-by 2021-01-01

Fact verification has attracted a lot of attention in the machine learning and natural language processing communities, as it is one of the key methods for detecting misinformation. Existing large-scale benchmarks for this task have focused mostly on textual sources, i.e. unstructured information, and thus ignored the wealth of information available in structured formats, such as tables. In this paper we introduce a novel dataset and benchmark, Fact Extraction and VERification Over Unstructured and Structured information (FEVEROUS), which consists of 87,026...

10.48550/arxiv.2106.05707 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

Attribution methods assess the contribution of inputs to the model prediction. One way to do so is erasure: a subset of inputs is considered irrelevant if it can be removed without affecting the prediction. Though conceptually simple, erasure’s objective is intractable and approximate search remains expensive with modern deep NLP models. Erasure is also susceptible to the hindsight bias: the fact that an input can be dropped does not mean that the model ‘knows’ it can be dropped. The resulting pruning is over-aggressive and does not reflect how the model arrives at the prediction. To deal with these challenges, we...

10.18653/v1/2020.emnlp-main.262 article EN 2020-01-01

Fact-checking has become increasingly important due to the speed with which both information and misinformation can spread in the modern media ecosystem. Therefore, researchers have been exploring how fact-checking can be automated, using techniques based on natural language processing, machine learning, knowledge representation, and databases to automatically predict the veracity of claims. In this paper, we survey automated fact-checking stemming from natural language processing, and discuss its connections to related tasks and disciplines. In this process, we present an...

10.48550/arxiv.2108.11896 preprint EN other-oa arXiv (Cornell University) 2021-01-01

10.18653/v1/2024.findings-acl.580 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

Structured information is an important knowledge source for automatic verification of factual claims. Nevertheless, the majority of existing research into this task has focused on textual data, and the few recent inquiries into structured data have been for the closed-domain setting where appropriate evidence for each claim is assumed to have already been retrieved. In this paper, we investigate verification over structured data in the open-domain setting, introducing a joint reranking-and-verification model which fuses evidence documents in the verification component. Our model achieves performance...

10.18653/v1/2021.acl-long.529 preprint EN cc-by 2021-01-01

In cross-lingual dependency annotation projection, information is often lost during transfer because of early decoding. We present an end-to-end graph-based neural network dependency parser that can be trained to reproduce matrices of edge scores, which can be directly projected across word alignments. We show that our approach to cross-lingual dependency parsing is not only simpler, but also achieves an absolute improvement of 2.25% averaged across 10 languages compared to the previous state of the art.

10.18653/v1/e17-1021 article EN cc-by 2017-01-01

Existing datasets for automated fact-checking have substantial limitations, such as relying on artificial claims, lacking annotations for evidence and intermediate reasoning, or including evidence published after the claim. In this paper we introduce AVeriTeC, a new dataset of 4,568 real-world claims covering fact-checks by 50 different organizations. Each claim is annotated with question-answer pairs supported by evidence available online, as well as textual justifications explaining how the evidence combines to produce a verdict....

10.48550/arxiv.2305.13117 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Automatic enrichment of semantic taxonomies with novel data is a relatively unexplored task with potential benefits in a broad array of natural language processing problems. Task 14 of SemEval 2016 poses the challenge of designing systems for this task. In this paper, we describe and evaluate several machine learning systems constructed for our participation in the competition. We demonstrate an f1-score of 0.680 for our submitted systems, a small improvement over the 0.679 produced by the hard baseline.

10.18653/v1/s16-1209 article EN Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) 2016-01-01

We study open-domain question answering with structured, unstructured and semi-structured knowledge sources, including text, tables, lists and knowledge bases. Departing from prior work, we propose a unifying approach that homogenizes all sources by reducing them to text and applies the retriever-reader model which has so far been limited to text sources only. Our approach greatly improves the results on knowledge-base QA tasks by 11 points, compared to latest graph-based methods. More importantly, we demonstrate that our unified knowledge (UniK-QA) model is a simple...

10.48550/arxiv.2012.14610 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Emoticons have in the literature been shown to modify rather than provide redundancy to the accompanying textual message. Despite this, emoticons are often used merely as labels for sentiment classification tasks. This paper aims to explore this phenomenon and discover more salient emoticon-emotion associations through an embedding-based machine learning process. Using principal component analysis and k-means clustering, it is shown how similar emoticons form groups in the vector space. Furthermore, a supervised strategy for discovering...

10.1109/coginfocom.2015.7390651 article EN 2015-10-01

Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks unique to multimodal misinformation. Furthermore, we discuss related terms used in different communities and map...

10.18653/v1/2023.findings-emnlp.361 article EN cc-by 2023-01-01

Automated fact-checking is often presented as an epistemic tool that fact-checkers, social media consumers, and other stakeholders can use to fight misinformation. Nevertheless, few papers thoroughly discuss how. We document this by analysing 100 highly-cited papers, and annotating epistemic elements related to intended use, i.e., means, ends, and stakeholders. We find that narratives leaving out some of these aspects are common, that many papers propose inconsistent means and ends, and that the feasibility of suggested strategies rarely has empirical...

10.18653/v1/2023.findings-emnlp.577 article EN cc-by 2023-01-01