- Semantic Web and Ontologies
- Advanced Database Systems and Queries
- Data Management and Algorithms
- Web Data Mining and Analysis
- Topic Modeling
- Data Quality and Management
- Natural Language Processing Techniques
- Algorithms and Data Compression
- Information Retrieval and Search Behavior
- Service-Oriented Architecture and Web Services
- Quality of Life Measurement
- Recommender Systems and Techniques
- Multi-Agent Systems and Negotiation
- Scientific Computing and Data Management
- Expert Finding and Q&A Systems
- Advanced Text Analysis Techniques
- Distributed Systems and Fault Tolerance
- Biomedical Text Mining and Ontologies
- Advanced Image and Video Retrieval Techniques
- Caching and Content Delivery
- Peer-to-Peer Network Technologies
- Data Mining Algorithms and Applications
- Text and Document Classification Technologies
- Software Engineering Research
- Mobile Crowdsensing and Crowdsourcing
Universität Trier
2013-2024
University of Passau
2013-2016
Max Planck Institute for Informatics
2006-2015
Max Planck Society
2006-2015
Saarland University
2000-2013
St. Mary's Hospital
1973
University of Basel
1970
The increasing interest in Semantic Web technologies has led not only to a rapid growth of semantic data on the Web but also to an increasing number of backend applications relying on efficient query processing. Confronted with such a trend, existing centralized state-of-the-art systems for storing RDF and processing SPARQL queries are no longer sufficient. In this paper, we introduce Partout, a distributed engine for fast RDF processing in a cluster of machines. We propose an effective approach for fragmenting RDF data sets based on a query log and allocating the fragments to hosts...
With the increasing popularity of the Semantic Web, more and more data becomes available in RDF, with SPARQL as a query language. Data sets, however, can become too big to be managed and queried on a single server in a scalable way. Existing distributed RDF stores approach this problem using data partitioning, aiming at limiting the communication between servers and exploiting parallelism. This paper proposes a distributed SPARQL engine that combines a graph partitioning technique with workload-aware replication of triples across partitions, enabling...
Online communities have become popular for publishing and searching content, as well as for finding and connecting to other users. User-generated content includes, for example, personal blogs, bookmarks, and digital photos. These items can be annotated and rated by different users, and these social tags and derived user-specific scores can be leveraged for searching relevant content and discovering subjectively interesting items. Moreover, the relationships among users can also be taken into consideration for ranking search results, the intuition being that you trust...
Top-k query processing is an important building block for ranked retrieval, with applications ranging from text and data integration to distributed aggregation of network logs and sensor data. Top-k queries operate on index lists for a query's elementary conditions and aggregate scores for result candidates. One of the best implementation methods in this setting is the family of threshold algorithms, which aim to terminate the index scans as early as possible based on lower and upper bounds for the final scores of result candidates. This procedure performs sequential disk accesses...
The HOPI index, a connection index for XML documents based on the concept of a 2-hop cover, provides space- and time-efficient reachability tests along the ancestor, descendant, and link axes to support path expressions with wildcards in XML search engines. This paper presents enhanced algorithms for building HOPI, shows how to augment the index with distance information, and discusses incremental index maintenance. Our experiments show substantial improvements over the existing divide-and-conquer algorithm for index creation, low space overhead...
Online communities have recently become a popular tool for publishing and searching content, as well as for finding and connecting to other users that share common interests. The content is typically user-generated and includes, for example, personal blogs, bookmarks, and digital photos. A particularly intriguing type of content is user-generated annotations (tags) for content items, as these concise string descriptions allow for reasonings about the interests of the user who created the content, but also about the user who generated the annotations. This paper presents a framework to cast the different...
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval engine for text and semistructured data. It terminates query execution as soon as it can safely determine the k top-ranked result elements according to a monotonic score aggregation function with respect to a multidimensional query. It efficiently supports vague search on both content- and structure-oriented query conditions by dynamic...
The success of knowledge-sharing communities like Wikipedia and the advances in automatic information extraction from textual Web sources have made it possible to build large "knowledge repositories" such as DBpedia, Freebase, and YAGO. These collections can be viewed as graphs of entities and relationships (ER graphs) and can be represented as a set of subject-property-object (SPO) triples in the Semantic-Web data model RDF. Queries can be expressed in the W3C-endorsed SPARQL language or by similarly designed graph-pattern search. However,...
Given an entity represented by a single node q in a semantic knowledge graph D, the Graphical Entity Summarisation problem (GES) consists in selecting out of D a very small surrounding graph S that constitutes a generic summary of the information concerning the entity q, with a given limit on the size of S. This article concerns the role of diversity in this quite novel problem. It gives an overview of the diversity concept in information retrieval, and proposes how to adapt it to GES. A diversity measure for GES, called ALC, is defined and two algorithms are presented, a baseline, diversity-oblivious...
Named entity recognition is an important task when constructing knowledge bases from unstructured data sources. Whereas entity detection methods mostly rely on extensive training data, Large Language Models (LLMs) have paved the way towards approaches that rely on zero-shot learning (ZSL) or few-shot learning (FSL) by taking advantage of the capabilities LLMs acquired during pretraining. Specifically, in very specialized scenarios where large-scale training data is not available, ZSL / FSL opens new opportunities. This paper follows...