Georgia Koutrika

ORCID: 0000-0002-7377-0116
Research Areas
  • Advanced Database Systems and Queries
  • Data Management and Algorithms
  • Semantic Web and Ontologies
  • Recommender Systems and Techniques
  • Web Data Mining and Analysis
  • Data Quality and Management
  • Topic Modeling
  • Scientific Computing and Data Management
  • Mobile Crowdsensing and Crowdsourcing
  • Data Mining Algorithms and Applications
  • Big Data and Business Intelligence
  • Natural Language Processing Techniques
  • Advanced Bandit Algorithms Research
  • Expert finding and Q&A systems
  • Data Stream Mining Techniques
  • Advanced Graph Neural Networks
  • Ethics and Social Impacts of AI
  • Complex Network Analysis Techniques
  • Peer-to-Peer Network Technologies
  • Open Education and E-Learning
  • Privacy-Preserving Technologies in Data
  • Multimedia Communication and Technology
  • Information Retrieval and Search Behavior
  • Advanced Text Analysis Techniques
  • Constraint Satisfaction and Optimization

Athena Research and Innovation Center In Information Communication & Knowledge Technologies
2017-2024

Hewlett-Packard (United States)
2010-2021

Arizona State University
2016

Utah State University
2016

University of Massachusetts Amherst
2016

IBM (United States)
2012-2013

IBM Research - Almaden
2010-2012

Stanford University
2007-2011

National and Kapodistrian University of Athens
2004-2011

University of Westminster
2009

Social bookmarking is a recent phenomenon which has the potential to give us a great deal of data about pages on the web. One major question is whether that data can be used to augment systems like web search. To answer this question, over the past year we have gathered what we believe to be the largest dataset from a social bookmarking site yet analyzed by academic researchers. Our dataset represents forty million bookmarks from del.icio.us. We contribute a characterization of posts to del.icio.us: how many bookmarks exist (about 115 million), how fast it is growing, and...

10.1145/1341531.1341558 article EN 2008-01-01

To bridge the gap between users and data, numerous text-to-SQL systems have been developed that allow users to pose natural language questions over relational databases. Recently, novel text-to-SQL systems are adopting deep learning methods with very promising results. At the same time, several challenges remain open, making this area an active and flourishing field of research and development. To make real progress in building text-to-SQL systems, we need to de-mystify what has been done, understand how and when each approach can be used, and,...

10.1007/s00778-022-00776-8 article EN cc-by The VLDB Journal 2023-01-23

In recent years, social Web sites have become important components of the Web. With their success, however, has come a growing influx of spam. If left unchecked, spam threatens to undermine resource sharing, interactivity, and openness. This article surveys three categories of potential countermeasures - those based on detection, demotion, and prevention. Although many of these countermeasures have been proposed before for email spam, the authors find that their applicability differs.

10.1109/mic.2007.125 article EN IEEE Internet Computing 2007-11-01

Entity Resolution (ER) is the problem of identifying which records in a database refer to the same real-world entity. An exhaustive ER process involves computing the similarities between pairs of records, which can be very expensive for large datasets. Various blocking techniques can be used to enhance the performance of ER by dividing the records into blocks in multiple ways and only comparing records within the same block. However, most blocking techniques process blocks separately and do not exploit the results of other blocks. In this paper, we propose an iterative blocking framework where the ER results of blocks are reflected...

10.1145/1559845.1559870 article EN 2009-06-29

Recommendation systems have become very popular, but most recommendation methods are `hard-wired' into the system, making experimentation with and implementation of new recommendation paradigms cumbersome. In this paper, we propose FlexRecs, a framework that decouples the definition of a recommendation process from its execution and supports flexible recommendations over structured data. In this approach, a recommendation can be defined declaratively as a high-level parameterized workflow comprising traditional relational operators and operators that generate or combine...

10.1145/1559845.1559923 article EN 2009-06-29

Preferences have been traditionally studied in philosophy, psychology, and economics and applied to decision making problems. Recently, they have attracted the attention of researchers in other fields, such as databases, where they capture soft criteria for queries. Databases bring a whole fresh perspective to the study of preferences, both computational and representational. From a representational perspective, the central question is how we can effectively represent preferences and incorporate them in database querying. From a computational perspective, we can look at...

10.1145/2000824.2000829 article EN ACM Transactions on Database Systems 2011-08-01

Entity Resolution is an inherently quadratic task that typically scales to large data collections through blocking. In the context of highly heterogeneous information spaces, blocking methods rely on redundancy in order to ensure high effectiveness at the cost of lower efficiency (i.e., more comparisons). This effect is partially ameliorated by coarse-grained block processing techniques that discard entire blocks either a-priori or during the resolution process. In this paper, we introduce meta-blocking as a generic...

10.1109/tkde.2013.54 article EN IEEE Transactions on Knowledge and Data Engineering 2013-03-27

We increasingly depend on a variety of data-driven algorithmic systems to assist us in many aspects of life. Search engines and recommender systems, among others, are used as sources of information and to help us in making all sorts of decisions, from selecting restaurants and books to choosing friends and careers. This has given rise to important concerns regarding the fairness of such systems. In this work, we aim at presenting a toolkit of definitions, models and methods for ensuring fairness in rankings and recommendations. Our objectives are threefold:...

10.1007/s00778-021-00697-y article EN cc-by The VLDB Journal 2021-10-02

As information becomes available in increasing amounts to a wide spectrum of users, the need for a shift towards a more user-centered access paradigm arises. We develop a personalization framework for database systems based on user profiles and identify the basic architectural modules required to support it. We define a preference model that assigns to each atomic query condition a personal degree of interest and provide a mechanism to compute the degree of interest in any complex condition from the degrees of interest in its constituent ones. Preferences are stored in user profiles. At query time,...

10.1109/icde.2004.1320030 article EN 2004-09-28

Tagging systems allow users to interactively annotate a pool of shared resources using descriptive tags. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. We introduce a framework for modeling tagging systems and user tagging behavior. We also describe a method for ranking documents matching a tag based on taggers' reliability. Using our framework, we study the behavior of existing approaches under malicious attacks and the impact...

10.1145/1244408.1244420 article EN 2007-05-08

Query personalization is the process of dynamically enhancing a query with related user preferences stored in a user profile with the aim of providing personalized answers. The underlying idea is that different users may find different things relevant to a search due to different preferences. Essential ingredients of query personalization are: (a) a model for representing and storing preferences in user profiles, and (b) algorithms for the generation of personalized answers using stored preferences. Modeling the plethora of preference types is a challenge. In this paper, we present a preference model that combines expressivity and concision. In addition, we provide efficient...

10.1109/icde.2005.106 article EN 2005-04-19

Keyword searches are attractive because they facilitate users searching structured databases. On the other hand, tag clouds are popular for navigation and visualization purposes over unstructured data because they can highlight the most significant concepts and hidden relationships in the underlying content dynamically. In this paper, we propose coupling the flexibility of keyword searches with the summarization capabilities of tag clouds to help users access a database. We propose using tag clouds (data clouds) to summarize the results of keyword searches and guide users to refine their searches. The cloud...

10.1145/1516360.1516406 article EN 2009-03-24

Many applications offer a form-based environment for naïve users for accessing databases without being familiar with the database schema or a structured query language. User interactions are translated to structured queries and executed. However, as a user is unlikely to know the underlying semantic connections among the fields presented in a form, it is often useful to provide her with a textual explanation of the query. In this paper, we take a graph-based approach to the query translation problem. We represent various forms of structured queries as directed graphs and annotate...

10.1109/icde.2010.5447824 article EN 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010) 2010-01-01

We examine the creation of a tag cloud for exploring and understanding a set of objects (e.g., web pages, documents). In the first part of our work, we present a formal system model for reasoning about tag clouds. We then present metrics that capture the structural properties of a tag cloud, and we briefly present a set of tag selection algorithms that are used in current sites (e.g., del.icio.us, Flickr, Technorati) or that have been described in recent work. In order to evaluate the results of these algorithms, we devise a novel synthetic user model. This user model is specifically tailored for tag cloud evaluation...

10.1145/1935826.1935855 article EN 2011-02-01

Entity Resolution constitutes a core task for data integration that, due to its quadratic complexity, typically scales to large datasets through blocking methods. These can be configured in two ways. The schema-based configuration relies on schema information in order to select signatures of high distinctiveness and low noise, while the schema-agnostic one treats every token from all attribute values as a signature. The latter approach has significant potential, as it requires no fine-tuning by human experts...

10.14778/2856318.2856326 article EN Proceedings of the VLDB Endowment 2015-12-01
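
The schema-agnostic configuration described in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `token_blocking` and `candidate_pairs`, the sample records, the lowercase word tokenizer, and the choice to drop single-record blocks are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations


def token_blocking(records):
    """Build one block per token, ignoring attribute names entirely.

    `records` maps a record id to a dict of attribute -> value.
    Tokenization (lowercased \\w+ runs) is an illustrative assumption.
    """
    blocks = defaultdict(set)
    for rid, attrs in records.items():
        for value in attrs.values():
            for token in re.findall(r"\w+", str(value).lower()):
                blocks[token].add(rid)
    # A block with a single record yields no comparisons, so drop it.
    return {tok: ids for tok, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct record pairs that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical records with different schemas.
records = {
    1: {"name": "Jon Smith", "city": "Athens"},
    2: {"full_name": "Smith, Jon", "location": "Athens GR"},
    3: {"name": "Maria Papas", "city": "Patras"},
}

blocks = token_blocking(records)
pairs = candidate_pairs(blocks)
print(sorted(blocks))
print(sorted(pairs))
```

Because the attribute names are never consulted, records 1 and 2 end up in shared blocks even though their schemas differ, while record 3 is never compared with either of them.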

Entity Resolution matches mentions of the same entity. Being an expensive task for large data, its performance can be improved by blocking, i.e., grouping similar entities and comparing only entities in the same group. Blocking improves the run-time of Entity Resolution, but it still involves unnecessary comparisons that limit its performance. Meta-blocking is the process of restructuring a block collection in order to prune such comparisons. Existing unsupervised meta-blocking methods use simple pruning rules, which offer a rather...

10.14778/2733085.2733098 article EN Proceedings of the VLDB Endowment 2014-10-01

Tagging systems allow users to interactively annotate a pool of shared resources using descriptive strings called tags. Tags are used to guide users to interesting resources and help them build communities that share their expertise and resources. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. Our goal is to understand this problem better. In particular, we are interested in answers to questions such as: How many malicious...

10.1145/1409220.1409225 article EN ACM Transactions on the Web 2008-10-01

How to address user information needs amidst a preponderance of data.

10.1145/2018396.2018423 article EN Communications of the ACM 2011-10-25

Data is a prevalent part of every business and scientific domain, but its explosive volume and increasing complexity make data querying challenging even for experts. For this reason, numerous text-to-SQL systems have been developed that enable querying relational databases using natural language. The recent advances on deep neural networks, along with the creation of two large datasets specifically made for training text-to-SQL systems, paved the path for a novel and very promising research area. The purpose of this tutorial is to dive into this area, covering...

10.1145/3448016.3457543 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Widespread use of database systems in modern society has brought the need to provide inexperienced users with the ability to easily search a database with no specific knowledge of a query language. Several recent research efforts have focused on supporting keyword-based searches over relational databases. This paper presents an alternative proposal and introduces the idea of précis queries. These are free-form queries whose answer (a précis) is a synthesis of results, containing not only information directly related to the query selections...

10.1109/icde.2006.114 article EN 2006-01-01