Fabian M. Suchanek

ORCID: 0000-0001-7189-2796
Research Areas
  • Semantic Web and Ontologies
  • Natural Language Processing Techniques
  • Topic Modeling
  • Web Data Mining and Analysis
  • Data Quality and Management
  • Advanced Database Systems and Queries
  • Service-Oriented Architecture and Web Services
  • Advanced Graph Neural Networks
  • Biomedical Text Mining and Ontologies
  • Data Management and Algorithms
  • Rough Sets and Fuzzy Logic
  • Data Mining Algorithms and Applications
  • Scientific Computing and Data Management
  • Advanced Text Analysis Techniques
  • Research Data Management Practices
  • Logic, Reasoning, and Knowledge
  • Multimodal Machine Learning Applications
  • Explainable Artificial Intelligence (XAI)
  • Machine Learning in Healthcare
  • Spam and Phishing Detection
  • Text and Document Classification Technologies
  • Peer-to-Peer Network Technologies
  • Digital Rights Management and Security
  • Adversarial Robustness in Machine Learning
  • Functional Brain Connectivity Studies

Télécom Paris
2015-2024

Laboratoire Traitement et Communication de l’Information
2014-2024

Institut national de recherche en informatique et en automatique
2010-2020

ParisTech
2019

Max Planck Institute for Informatics
2006-2015

Université Paris-Saclay
2014

Max Planck Society
2006-2013

Inria Saclay - Île de France
2010-2012

Microsoft (United States)
2010

Max Planck Institute for the History of Science
2008

We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains more than 1 million entities and 5 million facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as HASWONPRIZE). The facts have been automatically extracted from Wikipedia and unified with WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality by adding...

10.1145/1242572.1242667 preprint EN 2007-05-08

Recent advances in information extraction have led to huge knowledge bases (KBs), which capture knowledge in a machine-readable format. Inductive Logic Programming (ILP) can be used to mine logical rules from the KB. These rules can help deduce and add missing knowledge to the KB. While ILP is a mature field, mining logical rules from KBs is different in two aspects: First, current rule mining systems are easily overwhelmed by the amount of data (state-of-the-art systems cannot even run on today's KBs). Second, ILP usually requires counterexamples. KBs, however, implement the open world...

10.1145/2488388.2488425 article EN 2013-05-13
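As a rough illustration of what mining a logical rule from a KB involves, the sketch below scores one candidate rule on a toy KB. The facts, the relation names `bornIn` and `livesIn`, and the function `evaluate_rule` are all hypothetical, and the score is the textbook standard confidence, not necessarily the measure used in the paper. Body matches without a head fact are reported as candidate missing facts rather than treated as counterexamples.

```python
# Toy KB of (subject, relation, object) triples; all facts are made up.
kb = {
    ("alice", "bornIn", "paris"),
    ("alice", "livesIn", "paris"),
    ("bob", "bornIn", "berlin"),
    ("bob", "livesIn", "berlin"),
    ("carol", "bornIn", "rome"),
}


def pairs(kb, relation):
    """Return all (subject, object) pairs stored for one relation."""
    return {(s, o) for s, r, o in kb if r == relation}


def evaluate_rule(kb, body_rel, head_rel):
    """Score the rule body_rel(x, y) => head_rel(x, y) on the KB.

    support     = number of pairs that satisfy both body and head
    confidence  = support / number of pairs that satisfy the body
    predictions = body pairs whose head fact is absent from the KB
    """
    body = pairs(kb, body_rel)
    head = pairs(kb, head_rel)
    support = len(body & head)
    confidence = support / len(body) if body else 0.0
    predictions = body - head
    return support, confidence, predictions


support, confidence, predictions = evaluate_rule(kb, "bornIn", "livesIn")
print(support, round(confidence, 2), predictions)
```

Here the rule holds for two of the three `bornIn` pairs, so the remaining pair becomes a predicted `livesIn` fact that a KB curator could check.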

One of the main challenges that the Semantic Web faces is the integration of a growing number of independently designed ontologies. In this work, we present PARIS, an approach for the automatic alignment of ontologies. PARIS aligns not only instances, but also relations and classes. Alignments at the instance level cross-fertilize with alignments at the schema level. Thereby, our system provides a truly holistic solution to the problem of ontology alignment. The heart of the approach is probabilistic, i.e., we measure degrees of matchings based on probability...

10.14778/2078331.2078332 article EN Proceedings of the VLDB Endowment 2011-11-01

We present YAGO2, an extension of the YAGO knowledge base with focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and WordNet, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced data representation introduces time and location as first-class citizens. The wealth of spatio-temporal information in YAGO can be explored either graphically or through a special time- and space-aware query language.

10.1145/1963192.1963296 article EN 2011-03-28

Reaching a global view of brain organization requires assembling evidence on widely different mental processes and mechanisms. The variety of human neuroscience concepts and terminology poses a fundamental challenge to relating brain imaging results across the scientific literature. Existing meta-analysis methods perform statistical tests on sets of publications associated with a particular concept. Thus, large-scale meta-analyses only tackle single terms that occur frequently. We propose a new paradigm, focusing...

10.7554/elife.53385 article EN cc-by eLife 2020-03-04

The Web has the potential to become the world's largest knowledge base. In order to unleash this potential, the wealth of information available on the Web needs to be extracted and organized. There is a need for new querying techniques that are simple yet more expressive than those provided by standard keyword-based search engines. Searching for knowledge rather than Web pages needs to consider inherent semantic structures like entities (person, organization, etc.) and relationships (isA, locatedIn, etc.). In this paper, we propose NAGA, a new semantic search engine. NAGA...

10.1109/icde.2008.4497504 article EN 2008-04-01

This paper presents SOFIE, a system for automated ontology extension. SOFIE can parse natural language documents, extract ontological facts from them and link the facts into an ontology. SOFIE uses logical reasoning on the existing knowledge and on the new knowledge in order to disambiguate words to their most probable meaning, to reason on the meaning of text patterns and to take into account world knowledge axioms. This allows SOFIE to check the plausibility of hypotheses and to avoid inconsistencies with the ontology. The framework of SOFIE unites the paradigms of pattern matching, word sense disambiguation and ontological reasoning in one...

10.1145/1526709.1526794 article EN 2009-04-20

The World Wide Web provides a nearly endless source of knowledge, which is mostly given in natural language. A first step towards exploiting this data automatically could be to extract pairs of a given semantic relation from text documents, for example all pairs of a person and her birthdate. One strategy for this task is to find text patterns that express the semantic relation, to generalize these patterns, and to apply them to a corpus to find new pairs. In this paper, we show that this approach profits significantly when deep linguistic structures are used instead of surface...

10.1145/1150402.1150492 article EN 2006-08-20

This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web contents. WebChild contains triples that connect nouns with adjectives via fine-grained relations like hasShape, hasTaste, evokesEmotion, etc. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We derive seeds from WordNet and by pattern...

10.1145/2556195.2556245 preprint EN 2014-02-18

Large graphs and networks are abundant in modern information systems: entity-relationship graphs over relational data or Web-extracted entities, biological networks, social online communities, knowledge bases, and many more. Often such data comes with expressive node and edge labels that allow an interpretation as a semantic graph, and edge weights that reflect the strengths of semantic relations between entities. Finding close relationships between a given set of two, three, or more entities is an important building block for many search, ranking, and analysis...

10.1109/icde.2009.64 article EN Proceedings - International Conference on Data Engineering 2009-03-01

Open information extraction approaches have led to the creation of large knowledge bases from the Web. The problem with such methods is that their entities and relations are not canonicalized, leading to redundant and ambiguous facts. For example, they may store {Barack Obama, was born in, Honolulu} and {Obama, place of birth, Honolulu}. In this paper, we present an approach based on machine learning methods that can canonicalize such Open IE triples, by clustering synonymous names and phrases.

10.1145/2661829.2662073 preprint EN 2014-11-03
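The idea of clustering synonymous names can be sketched on the Obama example from the abstract. This is only a toy stand-in: it groups subject names by token overlap (Jaccard similarity) instead of the machine learning methods the paper refers to, it leaves relation phrases untouched, and the names `jaccard`, `cluster`, and the 0.5 threshold are assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Token-level Jaccard similarity between two names."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)


def cluster(names, threshold=0.5):
    """Merge names whose similarity reaches the threshold (union-find),
    then map every name to the longest name in its cluster."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if jaccard(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return {n: max(members, key=len)
            for members in groups.values() for n in members}


triples = [
    ("Barack Obama", "was born in", "Honolulu"),
    ("Obama", "place of birth", "Honolulu"),
]
subjects = sorted({s for s, _, _ in triples})
canon = cluster(subjects)
canonical_triples = [(canon[s], r, o) for s, r, o in triples]
print(canonical_triples)
```

Both triples end up with the subject "Barack Obama". Token overlap is a crude signal: it would also merge unrelated people who share a surname, which is one reason a learned similarity is preferable.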

Knowledge bases such as Wikidata, DBpedia, or YAGO contain millions of entities and facts. In some knowledge bases, the correctness of these facts has been evaluated. However, much less is known about their completeness, i.e., the proportion of real facts that the knowledge bases cover. In this work, we investigate different signals to identify the areas where a knowledge base is complete. We show that we can combine these signals in a rule mining approach, which allows us to predict where facts may be missing. We also show that completeness predictions can help other applications such as fact prediction.

10.1145/3018661.3018739 preprint EN 2017-02-02

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental...

10.1561/1900000064 article EN Foundations and Trends in Databases 2021-01-01

Entity alignment (EA) finds equivalent entities that are located in different knowledge graphs (KGs), which is an essential step to enhance the quality of KGs, and hence of significance to downstream applications (e.g., question answering and recommendation). Recent years have witnessed a rapid increase of EA approaches, yet their relative performance remains unclear, partly due to incomplete empirical evaluations, as well as the fact that comparisons were carried out under different settings (i.e., datasets, information used...

10.1109/tkde.2020.3018741 article EN cc-by IEEE Transactions on Knowledge and Data Engineering 2020-01-01

This paper aims to quantify two common assumptions about social tagging: (1) that tags are "meaningful" and (2) that the tagging process is influenced by tag suggestions. For (1), we analyze the semantic properties of tags and the relationship between the tags and the content of the tagged page. Our analysis is based on a corpus of search keywords, contents, titles, and tags applied to several thousand popular Web pages. Among other results, we find that the more popular tags of a page tend to be the more meaningful ones. For (2), we develop a model of how the influence of tag suggestions can be measured. From a user study...

10.1145/1458082.1458114 article EN 2008-10-26

We present ESTER, a modular and highly efficient system for combined full-text and ontology search. ESTER builds on a query engine that supports two basic operations: prefix search and join. Both of these can be implemented very efficiently with a compact index, yet in combination provide powerful querying capabilities. We show how ESTER can answer basic SPARQL graph-pattern queries on the ontology by reducing them to a small number of these two basic operations. ESTER further supports a natural blend of such semantic queries with ordinary full-text queries. Moreover, the prefix search operation allows for a fully...

10.1145/1277741.1277856 article EN 2007-07-23
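To give a feel for why prefix search and join combine well, the sketch below implements generic versions of both over a tiny in-memory index: a prefix lookup on a sorted vocabulary and an intersection of sorted document-id lists. It is a simplification under stated assumptions, not ESTER's index or its actual operations; the index contents and the names `prefix_search` and `join` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical inverted index: word -> sorted list of document ids.
index = {
    "algebra": [1, 4],
    "algorithm": [1, 2],
    "algorithms": [3],
    "graph": [2, 3],
}
words = sorted(index)


def prefix_search(prefix):
    """Return sorted ids of documents containing a word with this prefix."""
    start = bisect_left(words, prefix)  # first word >= prefix
    docs = set()
    for word in words[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no later word can match
        docs.update(index[word])
    return sorted(docs)


def join(left, right):
    """Intersect two sorted id lists with a linear merge."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return out


algo_docs = prefix_search("algo")
graph_docs = prefix_search("graph")
print(join(algo_docs, graph_docs))
```

Because both operations work on sorted lists, each runs in time linear in the lists it touches, which hints at how a compact index can serve both.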

This paper gives an overview on the YAGO-NAGA approach to information extraction for building a conveniently searchable, large-scale, highly accurate knowledge base of common facts. YAGO harvests infoboxes and category names of Wikipedia for facts about individual entities, and it reconciles these with the taxonomic backbone of WordNet in order to ensure that all entities have proper classes and the class system is consistent. Currently, the YAGO knowledge base contains about 19 million instances of binary relations for about 1.95 million entities. Based on intensive...

10.1145/1519103.1519110 article EN ACM SIGMOD Record 2009-03-20