NFDI4DS | UHH-SEMS - Publication Details

Gerhard Weikum

ORCID: 0000-0003-4959-6098

About

Contact & Profiles

A5088135366

Research Areas

Topic Modeling
Natural Language Processing Techniques
Semantic Web and Ontologies
Advanced Database Systems and Queries
Web Data Mining and Analysis
Data Management and Algorithms
Peer-to-Peer Network Technologies
Distributed systems and fault tolerance
Advanced Data Storage Technologies
Data Quality and Management
Caching and Content Delivery
Advanced Graph Neural Networks
Service-Oriented Architecture and Web Services
Distributed and Parallel Computing Systems
Advanced Text Analysis Techniques
Business Process Modeling and Analysis
Algorithms and Data Compression
Recommender Systems and Techniques
Data Mining Algorithms and Applications
Text and Document Classification Technologies
Complex Network Analysis Techniques
Biomedical Text Mining and Ontologies
Spam and Phishing Detection
Sentiment Analysis and Opinion Mining
Parallel Computing and Optimization Techniques

Max Planck Institute for Informatics
2015-2024

Max Planck Society
2013-2024

Robert Bosch (India)
2023

University of Amsterdam
2023

Max Planck Institute for the History of Science
2008-2021

Microsoft Research (United Kingdom)
1998-2019

Institute of Informatics of the Slovak Academy of Sciences
2018

Cornell University
1995-2017

Klinikum Saarbrücken
2017

Hewlett-Packard (United States)
2011

YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia

OPENALEX - Publications

Johannes Hoffart Fabian M. Suchanek Klaus Berberich Gerhard Weikum

10.1016/j.artint.2012.06.001 article EN publisher-specific-oa Artificial Intelligence 2012-06-18

The LRU-K page replacement algorithm for database disk buffering

Elizabeth O’Neil Patrick O’Neil Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages, using this information to statistically estimate the interarrival times of references on a page-by-page basis. Although the LRU-K approach performs optimal statistical inference under relatively standard assumptions, it is fairly simple and incurs little bookkeeping overhead. As we demonstrate with simulation experiments, the LRU-K algorithm surpasses conventional buffering algorithms in...

10.1145/170035.170081 article EN 1993-06-01

YAGO: A Large Ontology from Wikipedia and WordNet

Fabian M. Suchanek Gjergji Kasneci Gerhard Weikum

10.1016/j.websem.2008.06.001 article EN Journal of Web Semantics 2008-09-01

The RDF-3X engine for scalable management of RDF data

Thomas Neumann Gerhard Weikum

10.1007/s00778-009-0165-y article EN The VLDB Journal 2009-08-31

Foundations of statistical natural language processing

Gerhard Weikum

No abstract available.

10.1145/601858.601867 article FR ACM SIGMOD Record 2002-09-01

RDF-3X

Thomas Neumann Gerhard Weikum

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of Semantic-Web corpora, life sciences, and also Web 2.0 platforms. The "pay-as-you-go" nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with a streamlined...

10.14778/1453856.1453927 article EN Proceedings of the VLDB Endowment 2008-08-01

YAGO2

Johannes Hoffart Fabian M. Suchanek Klaus Berberich Edwin Lewis-Kelham Gerard de Melo and 1 more

We present YAGO2, an extension of the YAGO knowledge base with focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and WordNet, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced data representation introduces time and location as first-class citizens. The wealth of spatio-temporal information in YAGO can be explored either graphically or through a special time- and space-aware query language.

10.1145/1963192.1963296 article EN 2011-03-28

DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning

Kashyap Popat Subhabrata Mukherjee Andrew Yates Gerhard Weikum

Misinformation such as fake news is one of the big challenges of our society. Research on automated fact-checking has proposed methods based on supervised learning, but these approaches do not consider external evidence apart from labeled training instances. Recent approaches counter this deficit by considering external sources related to a claim. However, these approaches require substantial feature modeling and rich lexicons. This paper overcomes these limitations of prior work with an end-to-end model for evidence-aware credibility...

10.18653/v1/d18-1003 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Equity of Attention

Asia J. Biega Krishna P. Gummadi Gerhard Weikum

Rankings of people and items are at the heart of selection-making, match-making, and recommender systems, ranging from employment sites to sharing economy platforms. As ranking positions influence the amount of attention the ranked subjects receive, biases in rankings can lead to unfair distribution of opportunities and resources, such as jobs or income. This paper proposes new measures and mechanisms to quantify and mitigate unfairness from a bias inherent to all rankings, namely, the position bias, which leads to disproportionately less attention being...

10.1145/3209978.3210063 preprint EN 2018-06-27

Automated Template Generation for Question Answering over Knowledge Graphs

Abdalghani Abujabal Mohamed Yahya Mirek Riedewald Gerhard Weikum

Templates are an important asset for question answering over knowledge graphs, simplifying the semantic parsing of input utterances and generating structured queries for interpretable answers. State-of-the-art methods rely on hand-crafted templates with limited coverage. This paper presents QUINT, a system that automatically learns utterance-query templates solely from user questions paired with their answers. Additionally, QUINT is able to harness language compositionality for answering complex questions without having any templates for the entire question....

10.1145/3038912.3052583 article EN 2017-04-03

KORE

Johannes Hoffart Stephan Seufert Dat Ba Nguyen Martin Theobald Gerhard Weikum

Measuring the semantic relatedness between two entities is the basis for numerous tasks in IR, NLP, and Web-based knowledge extraction. This paper focuses on disambiguating names in a Web or text document by jointly mapping all names onto semantically related entities registered in a knowledge base. To this end, we have developed a novel notion of semantic relatedness between two entities represented as sets of weighted (multi-word) keyphrases, with consideration of partially overlapping phrases. This measure improves the quality of prior link-based models, and also eliminates the need for (usually...

10.1145/2396761.2396832 article EN 2012-10-29

Where the Truth Lies

Kashyap Popat Subhabrata Mukherjee Jannik Strötgen Gerhard Weikum

The web is a huge source of valuable information. However, in recent times, there is an increasing trend towards false claims in social media, other web-sources, and even in news. Thus, factchecking websites have become increasingly popular to identify such misinformation based on manual analysis. Recent research proposed methods to assess the credibility of claims automatically. However, there are major limitations: most works assume claims to be in a structured form, and a few deal with textual claims but require that sources of evidence or...

10.1145/3041021.3055133 article EN 2017-01-01

Credibility Assessment of Textual Claims on the Web

Kashyap Popat Subhabrata Mukherjee Jannik Strötgen Gerhard Weikum

There is an increasing amount of false claims in news, social media, and other web sources. While prior work on truth discovery has focused on the case of checking factual statements, this paper addresses the novel task of assessing the credibility of arbitrary claims made in natural-language text - in an open-domain setting without any assumptions about the structure of the claim, or the community where it is made. Our solution is based on automatically finding sources in news and social media, and feeding these into a distantly supervised classifier for assessing the credibility of a claim (i.e., true...

10.1145/2983323.2983661 article EN 2016-10-24

The LRU-K page replacement algorithm for database disk buffering

Elizabeth O’Neil Patrick O’Neil Gerhard Weikum

10.1145/170036.170081 article EN ACM SIGMOD Record 1993-06-01

Principles and realization strategies of multilevel transaction management

Gerhard Weikum

One of the demands database system transaction management is to achieve a high degree concurrency by taking into consideration semantics high-level operations. On other hand, implementation such operations must pay attention conflicts on storage representation levels below. To meet these requirements in layered architecture, we propose multilevel utilizing layer-specific semantics. Based theoretical notion serializability, family control strategies developed. Suitable recovery protocols are...

10.1145/103140.103145 article EN ACM Transactions on Database Systems 1991-03-01

NAGA: Searching and Ranking Knowledge

Gjergji Kasneci Fabian M. Suchanek Georgiana Ifrim Maya Ramanath Gerhard Weikum

The Web has the potential to become the world's largest knowledge base. In order to unleash this potential, the wealth of information available on the Web needs to be extracted and organized. There is a need for new querying techniques that are simple yet more expressive than those provided by standard keyword-based search engines. Searching for knowledge rather than Web pages needs to consider inherent semantic structures like entities (person, organization, etc.) and relationships (isA, locatedIn, etc.). In this paper, we propose NAGA, a new semantic search engine. NAGA...

10.1109/icde.2008.4497504 article EN 2008-04-01

Scalable join processing on very large RDF graphs

Thomas Neumann Gerhard Weikum

With the proliferation of the RDF data format, engines for RDF query processing are faced with very large graphs that contain hundreds of millions of RDF triples. This paper addresses the resulting scalability problems. Recent prior work along these lines has focused on indexing and other physical-design issues. The current paper focuses on join processing, as the fine-grained and schema-relaxed use of RDF often entails star- and chain-shaped join queries with many input streams from index scans.

10.1145/1559845.1559911 article EN 2009-06-29