NFDI4DS | UHH-SEMS - Publication Details

Georgia Koutrika

ORCID: 0000-0002-7377-0116

About

Contact & Profiles

OpenAlex ID: A5012623045

Research Areas

Advanced Database Systems and Queries
Data Management and Algorithms
Semantic Web and Ontologies
Recommender Systems and Techniques
Web Data Mining and Analysis
Data Quality and Management
Topic Modeling
Scientific Computing and Data Management
Mobile Crowdsensing and Crowdsourcing
Data Mining Algorithms and Applications
Big Data and Business Intelligence
Natural Language Processing Techniques
Advanced Bandit Algorithms Research
Expert finding and Q&A systems
Data Stream Mining Techniques
Advanced Graph Neural Networks
Ethics and Social Impacts of AI
Complex Network Analysis Techniques
Peer-to-Peer Network Technologies
Open Education and E-Learning
Privacy-Preserving Technologies in Data
Multimedia Communication and Technology
Information Retrieval and Search Behavior
Advanced Text Analysis Techniques
Constraint Satisfaction and Optimization

Athena Research and Innovation Center In Information Communication & Knowledge Technologies
2017-2024

Hewlett-Packard (United States)
2010-2021

Arizona State University
2016

Utah State University
2016

University of Massachusetts Amherst
2016

IBM (United States)
2012-2013

IBM Research - Almaden
2010-2012

Stanford University
2007-2011

National and Kapodistrian University of Athens
2004-2011

University of Westminster
2009

Can social bookmarking improve web search?

OPENALEX - Publications

Paul Heymann, Georgia Koutrika, Héctor García-Molina

Social bookmarking is a recent phenomenon which has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by academic researchers. Our dataset represents about forty million bookmarks from the social bookmarking site del.icio.us. We contribute a characterization of posts to del.icio.us: how many bookmarks exist (about 115 million), how fast it is growing, and...

10.1145/1341531.1341558 article EN 2008-01-01

A survey on deep learning approaches for text-to-SQL

George Katsogiannis-Meimarakis, Georgia Koutrika

To bridge the gap between users and data, numerous text-to-SQL systems have been developed that allow users to pose natural language questions over relational databases. Recently, novel text-to-SQL systems are adopting deep learning methods with very promising results. At the same time, several challenges remain open, making this area an active and flourishing field of research and development. To make real progress in building text-to-SQL systems, we need to de-mystify what has been done, understand how and when each approach can be used, and,...

10.1007/s00778-022-00776-8 article EN cc-by The VLDB Journal 2023-01-23

Fighting Spam on Social Web Sites: A Survey of Approaches and Future Challenges

Paul Heymann, Georgia Koutrika, Héctor García-Molina

In recent years, social Web sites have become important components of the Web. With their success, however, has come a growing influx of spam. If left unchecked, spam threatens to undermine resource sharing, interactivity, and openness. This article surveys three categories of potential countermeasures: those based on detection, demotion, and prevention. Although many of these have been proposed before for email spam, the authors find that their applicability differs.

10.1109/mic.2007.125 article EN IEEE Internet Computing 2007-11-01

Entity resolution with iterative blocking

Steven Euijong Whang, David Menestrina, Georgia Koutrika, Martin Theobald, Héctor García-Molina

Entity Resolution (ER) is the problem of identifying which records in a database refer to the same real-world entity. An exhaustive ER process involves computing the similarities between pairs of records, which can be very expensive for large datasets. Various blocking techniques can be used to enhance the performance of ER by dividing the records into blocks in multiple ways and only comparing records within the same block. However, most blocking techniques process blocks separately and do not exploit the results of other blocks. In this paper, we propose an iterative blocking framework where the ER results of blocks are reflected...

10.1145/1559845.1559870 article EN 2009-06-29

FlexRecs

Georgia Koutrika, Benjamin Bercovitz, Héctor García-Molina

Recommendation systems have become very popular but most recommendation methods are 'hard-wired' into the system, making experimentation with and implementation of new recommendation paradigms cumbersome. In this paper, we propose FlexRecs, a framework that decouples the definition of a recommendation process from its execution and supports flexible recommendations over structured data. In FlexRecs, a recommendation approach can be defined declaratively as a high-level parameterized workflow comprising traditional relational operators and new operators that generate or combine...

10.1145/1559845.1559923 article EN 2009-06-29

A survey on representation, composition and application of preferences in database systems

Kostas Stefanidis, Georgia Koutrika, Evaggelia Pitoura

Preferences have been traditionally studied in philosophy, psychology, and economics and applied to decision making problems. Recently, they have attracted the attention of researchers in other fields, such as databases where they capture soft criteria for queries. Databases bring a whole fresh perspective to the study of preferences, both computational and representational. From a representational perspective, the central question is how we can effectively represent preferences and incorporate them in database querying. From a computational perspective, we can look at...

10.1145/2000824.2000829 article EN ACM Transactions on Database Systems 2011-08-01

Meta-Blocking: Taking Entity Resolution to the Next Level

George Papadakis, Georgia Koutrika, Themis Palpanas, Wolfgang Nejdl

Entity Resolution is an inherently quadratic task that typically scales to large data collections through blocking. In the context of highly heterogeneous information spaces, blocking methods rely on redundancy in order to ensure high effectiveness at the cost of lower efficiency (i.e., more comparisons). This effect is partially ameliorated by coarse-grained block processing techniques that discard entire blocks either a-priori or during the resolution process. In this paper, we introduce meta-blocking as a generic...

10.1109/tkde.2013.54 article EN IEEE Transactions on Knowledge and Data Engineering 2013-03-27

Fairness in rankings and recommendations: an overview

Evaggelia Pitoura, Kostas Stefanidis, Georgia Koutrika

We increasingly depend on a variety of data-driven algorithmic systems to assist us in many aspects of life. Search engines and recommender systems among others are used as sources of information and to help us in making all sort of decisions from selecting restaurants and books, to choosing friends and careers. This has given rise to important concerns regarding the fairness of such systems. In this work, we aim at presenting a toolkit of definitions, models and methods used for ensuring fairness in rankings and recommendations. Our objectives are threefold:...

10.1007/s00778-021-00697-y article EN cc-by The VLDB Journal 2021-10-02

Personalization of queries in database systems

Georgia Koutrika, Yannis Ioannidis

As information becomes available in increasing amounts to a wide spectrum of users, the need for a shift towards a more user-centered information access paradigm arises. We develop a personalization framework for database systems based on user profiles and identify the basic architectural modules required to support it. We define a preference model that assigns to each atomic query condition a personal degree of interest and provide a mechanism to compute the degree of interest in any complex query condition based on the degrees of interest in the constituent atomic ones. Preferences are stored in user profiles. At query time,...

10.1109/icde.2004.1320030 article EN 2004-09-28

Combating spam in tagging systems

Georgia Koutrika, Frans Adjie Effendi, Zoltán Gyöngyi, Paul Heymann, Héctor García-Molina

Tagging systems allow users to interactively annotate a pool of shared resources using descriptive tags. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. We introduce a framework for modeling tagging systems and user tagging behavior. We also describe a method for ranking documents matching a tag based on taggers' reliability. Using our framework, we study the behavior of existing approaches under malicious attacks and the impact...

10.1145/1244408.1244420 article EN 2007-05-08

Personalized Queries under a Generalized Preference Model

Georgia Koutrika, Yannis Ioannidis

Query personalization is the process of dynamically enhancing a query with related user preferences stored in a user profile with the aim of providing personalized answers. The underlying idea is that different users may find different things relevant to a search due to different preferences. Essential ingredients of query personalization are: (a) a model for representing and storing preferences in user profiles, and (b) algorithms for the generation of personalized answers using stored preferences. Modeling the plethora of preference types is a challenge. In this paper, we present a preference model that combines expressivity and concision. In addition, we provide efficient...

10.1109/icde.2005.106 article EN 2005-04-19

Précis: from unstructured keywords as queries to structured databases as answers

Alkis Simitsis, Georgia Koutrika, Yannis Ioannidis

10.1007/s00778-007-0075-9 article EN The VLDB Journal 2007-11-08

Data clouds

Georgia Koutrika, Zahra Mohammadi Zadeh, Héctor García-Molina

Keyword searches are attractive because they facilitate users searching structured databases. On the other hand, tag clouds are popular for navigation and visualization purposes over unstructured data because they can highlight the most significant concepts and hidden relationships in the underlying content dynamically. In this paper, we propose coupling the flexibility of keyword searches with the summarization capabilities of tag clouds to help users access a database. We propose using clouds (data clouds) to summarize the results and guide users to refine their searches. The cloud...

10.1145/1516360.1516406 article EN 2009-03-24

Explaining structured queries in natural language

Georgia Koutrika, Alkis Simitsis, Yannis Ioannidis

Many applications offer a form-based environment for naïve users for accessing databases without being familiar with the database schema or a structured query language. User interactions are translated to structured queries and executed. However, as a user is unlikely to know the underlying semantic connections among the fields presented in a form, it is often useful to provide her with a textual explanation of the query. In this paper, we take a graph-based approach to the query translation problem. We represent various forms of structured queries as directed graphs and we annotate...

10.1109/icde.2010.5447824 article EN 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010) 2010-01-01

On the selection of tags for tag clouds

Petros Venetis, Georgia Koutrika, Héctor García-Molina

We examine the creation of a tag cloud for exploring and understanding a set of objects (e.g., web pages, documents). In the first part of our work, we present a formal system model for reasoning about tag clouds. We then present metrics that capture the structural properties of a tag cloud, and we briefly present a set of tag selection algorithms that are used in current sites (e.g., del.icio.us, Flickr, Technorati) or that have been described in recent work. In order to evaluate the results of these algorithms, we devise a novel synthetic user model. This user model is specifically tailored for tag cloud evaluation...

10.1145/1935826.1935855 article EN 2011-02-01

Schema-agnostic vs schema-based configurations for blocking methods on homogeneous data

George Papadakis, George Alexiou, George Papastefanatos, Georgia Koutrika

Entity Resolution constitutes a core task for data integration that, due to its quadratic complexity, typically scales to large datasets through blocking methods. These can be configured in two ways. The schema-based configuration relies on schema information in order to select signatures of high distinctiveness and low noise, while the schema-agnostic one treats every token from all attribute values as a signature. The latter approach has significant potential, as it requires no fine-tuning by human experts...

10.14778/2856318.2856326 article EN Proceedings of the VLDB Endowment 2015-12-01

Supervised meta-blocking

George Papadakis, George Papastefanatos, Georgia Koutrika

Entity Resolution matches mentions of the same entity. Being an expensive task for large data, its performance can be improved by blocking, i.e., grouping similar entities and comparing only entities in the same group. Blocking improves the run-time of Entity Resolution, but it still involves unnecessary comparisons that limit its performance. Meta-blocking is the process of restructuring a block collection in order to prune such comparisons. Existing unsupervised meta-blocking methods use simple pruning rules, which offer a rather...

10.14778/2733085.2733098 article EN Proceedings of the VLDB Endowment 2014-10-01

Combating spam in tagging systems

Georgia Koutrika, Frans Adjie Effendi, Zoltán Gyöngyi, Paul Heymann, Héctor García-Molina

Tagging systems allow users to interactively annotate a pool of shared resources using descriptive strings called tags. Tags are used to guide users to interesting resources and help them build communities that share their expertise and resources. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. Our goal is to understand this problem better. In particular, we are interested in answers to questions such as: How many malicious...

10.1145/1409220.1409225 article EN ACM Transactions on the Web 2008-10-01

Information seeking

Héctor García-Molina, Georgia Koutrika, Aditya Parameswaran

How to address user information needs amidst a preponderance of data.

10.1145/2018396.2018423 article EN Communications of the ACM 2011-10-25

A Deep Dive into Deep Learning Approaches for Text-to-SQL Systems

George Katsogiannis-Meimarakis, Georgia Koutrika

Data is a prevalent part of every business and scientific domain, but its explosive volume and increasing complexity make data querying challenging even for experts. For this reason, numerous text-to-SQL systems have been developed that enable querying relational databases using natural language. The recent advances on deep neural networks, along with the creation of two large datasets specifically made for training text-to-SQL systems, paved the path for a novel and very promising research area. The purpose of this tutorial is to dive into this area, covering...

10.1145/3448016.3457543 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Précis: The Essence of a Query Answer

Georgia Koutrika, Alkis Simitsis, Yannis Ioannidis

Wide spread use of database systems in modern society has brought the need to provide inexperienced users with the ability to easily search a database with no specific knowledge of a query language. Several recent research efforts have focused on supporting keyword-based searches over relational databases. This paper presents an alternative proposal and introduces the idea of précis queries. These are free-form queries whose answer (a précis) is a synthesis of results, containing not only information directly related to the query selections...

10.1109/icde.2006.114 article EN 2006-01-01
