NFDI4DS | UHH-SEMS - Publication Details

Aristides Gionis

ORCID: 0000-0002-5211-112X

Publications

Citations

Views

---

Saved

---

About

Contact & Profiles

A5022164041

Research Areas

Complex Network Analysis Techniques
Data Management and Algorithms
Opinion Dynamics and Social Influence
Advanced Graph Neural Networks
Social Media and Politics
Data Mining Algorithms and Applications
Algorithms and Data Compression
Advanced Database Systems and Queries
Mobile Crowdsensing and Crowdsourcing
Web Data Mining and Analysis
Peer-to-Peer Network Technologies
Misinformation and Its Impacts
Graph Theory and Algorithms
Complexity and Algorithms in Graphs
Caching and Content Delivery
Rough Sets and Fuzzy Logic
Advanced Clustering Algorithms Research
Privacy-Preserving Technologies in Data
Optimization and Search Problems
Recommender Systems and Techniques
Advanced Image and Video Retrieval Techniques
Spam and Phishing Detection
Human Mobility and Location-Based Analysis
Advanced Graph Theory Research
Machine Learning and Algorithms

KTH Royal Institute of Technology
2019-2025

Aalto University
2013-2021

Helsinki Institute for Information Technology
2005-2020

National University of Singapore
2020

University of Technology
2019

University of Helsinki
2004-2017

St. Francis Hospital
2016

Taylor Wimpey (United Kingdom)
2016

Yahoo (United Kingdom)
2007-2013

Yahoo (Spain)
2007-2013

Finding high-quality content in social media

OPENALEX - Publications

Eugene Agichtein Carlos Castillo Debora Donato Aristides Gionis Gilad Mishne

The quality of user-generated content varies drastically from excellent to abuse and spam. As the availability such increases, task identifying high-quality sites based on user contributions --social media -- becomes increasingly important. Social in general exhibit a rich variety information sources: addition itself, there is wide array non-content available, as links between items explicit ratings members community. In this paper we investigate methods for exploiting community feedback...

10.1145/1341531.1341557 article EN 2008-01-01

Maintaining Stream Statistics over Sliding Windows

OPENALEX - Publications

Mayur Datar Aristides Gionis Piotr Indyk Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to last N elements seen so far. refer this model as sliding window model. following basic problem: Given a stream bits, maintain count number 1's in from stream. show that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits memory, we can estimate within factor $1 + \epsilon$. also give matching lower bound $\Omega(\frac{1}{\epsilon}\log^2 memory for any deterministic or randomized algorithms. extend...

10.1137/s0097539701398363 article EN SIAM Journal on Computing 2002-01-01

Clustering aggregation

OPENALEX - Publications

Aristides Gionis Heikki Mannila Panayiotis Tsaparas

We consider the following problem: given a set of clusterings, find single clustering that agrees as much possible with input clusterings. This problem, aggregation , appears naturally in various contexts. For example, categorical data is an instance problem; each attribute can be viewed rows where are grouped together if they take same value on attribute. Clustering also used metaclustering method to improve robustness by combining output multiple algorithms. Furthermore, problem...

10.1145/1217299.1217303 article EN ACM Transactions on Knowledge Discovery from Data 2007-03-01

The community-search problem and how to plan a successful cocktail party

OPENALEX - Publications

Mauro Sozio Aristides Gionis

A lot of research in graph mining has been devoted the discovery communities. Most work focused scenario where communities need to be discovered with only reference input graph. However, for many interesting applications one is interested finding community formed by a given set nodes. In this paper we study query-dependent variant community-detection problem, which call community-search problem: G, and query nodes graph, seek find subgraph G that contains it densely connected.

10.1145/1835804.1835923 article EN 2010-07-25

Social Network Analysis and Mining for Business Applications

OPENALEX - Publications

Francesco Bonchi Carlos Castillo Aristides Gionis Alejandro Jaimes

Social network analysis has gained significant attention in recent years, largely due to the success of online social networking and media-sharing sites, consequent availability a wealth data. In spite growing interest, however, there is little understanding potential business applications mining networks. While large body research on different problems methods for mining, gap between techniques developed by community their deployment real-world applications. Therefore impact these still...

10.1145/1961189.1961194 article EN ACM Transactions on Intelligent Systems and Technology 2011-04-01

Know your neighbors

OPENALEX - Publications

Carlos Castillo Debora Donato Aristides Gionis Vanessa Murdock Fabrizio Silvestri

Web spam can significantly deteriorate the quality of search engine results. Thus there is a large incentive for commercial engines to detect pages efficiently and accurately. In this paper we present detection system that combines link-based content-based features, uses topology graph by exploiting link dependencies among pages. We find linked hosts tend belong same class: either both are or non-spam. demonstrate three methods incorporating into predictions obtained our base classifier: (i)...

10.1145/1277741.1277814 article EN 2007-07-23

Efficient semi-streaming algorithms for local triangle counting in massive graphs

OPENALEX - Publications

Luca Becchetti Paolo Boldi Carlos Castillo Aristides Gionis

In this paper we study the problem of local triangle counting in large graphs. Namely, given a graph G = (V;E) want to estimate as accurately possible number triangles incident every node υ ∈ V graph. The computing global has been considered before, but our knowledge is first that addresses with focus on efficiency issues arising massive distribution and related clustering coefficient can be used many interesting applications. For example, show measures compute help detect presence spamming...

10.1145/1401890.1401898 article EN 2008-08-24

Online team formation in social networks

OPENALEX - Publications

Aris Anagnostopoulos Luca Becchetti Carlos Castillo Aristides Gionis Stefano Leonardi

We study the problem of online team formation. consider a setting in which people possess different skills and compatibility among potential members is modeled by social network. A sequence tasks arrives an fashion, each task requires specific set skills. The goal to form new upon arrival task, so that (i) possesses all required (ii) has small communication overhead, (iii) workload performing balanced fairest possible way.

10.1145/2187836.2187950 article EN 2012-04-16

Proceedings of the ACM Web Conference 2022

OPENALEX - Publications

Frédérique Laforest Raphaël Troncy Elena Simperl Deepak Agarwal Aristides Gionis and 2 more

Web data and computational models can play important roles in analyzing cultural trends.The current study presents an analysis of 23,492 sneaker images metadata collected from a global reselling shop, StockX.com.Based on encompassing 22 years 1999 to 2020, we propose design index that helps track changes the characteristics sneakers using contrastive learning method.Our suggest designs have been employing brighter colors lower hue saturation values over time.We also observe how popular...

10.1145/3485447 preprint EN 2022-04-25

The query-flow graph

OPENALEX - Publications

Paolo Boldi Francesco Bonchi Carlos Castillo Debora Donato Aristides Gionis and 1 more

Query logs record the queries and actions of users search engines, as such they contain valuable information about interests, preferences, behavior users, well their implicit feedback to engine results. Mining wealth available in query has many important applications including query-log analysis, user profiling personalization, advertising, recommendation, more.In this paper we introduce query-flow graph, a graph representation interesting knowledge latent querying behavior. Intuitively,...

10.1145/1458082.1458163 article EN 2008-10-26

Quantifying Controversy on Social Media

OPENALEX - Publications

Kiran Garimella Gianmarco De Francisci Morales Aristides Gionis Michael Mathioudakis

Which topics spark the most heated debates on social media? Identifying those is not only interesting from a societal point of view but also allows filtering and aggregation media content for disseminating news stories. In this article, we perform systematic methodological study controversy detection by using network structure media. Unlike previous work, rather than studying in single hand-picked topic domain-specific knowledge, take general approach to any domain . Our quantifying based...

10.1145/3140565 article EN ACM Transactions on Social Computing 2018-01-18

Denser than the densest subgraph

OPENALEX - Publications

Charalampos E. Tsourakakis Francesco Bonchi Aristides Gionis Francesco Gullo Maria A. Tsiarli

Finding dense subgraphs is an important graph-mining task with many applications. Given that the direct optimization of edge density not meaningful, as even a single achieves maximum density, research has focused on optimizing alternative functions. A very popular among such functions average degree, whose maximization leads to well-known densest-subgraph notion. Surprisingly enough, however, densest are typically large graphs, small and diameter.

10.1145/2487575.2487645 article EN 2013-08-11

Fast shortest path distance estimation in large networks

OPENALEX - Publications

Michalis Potamias Francesco Bonchi Carlos Castillo Aristides Gionis

In this paper we study approximate landmark-based methods for point-to-point distance estimation in very large networks. These involve selecting a subset of nodes as landmarks and computing offline the distances from each node graph to those landmarks. At runtime, when between pair is needed, it can be estimated quickly by combining precomputed distances. We prove that optimal set an NP-hard problem, thus heuristic solutions need employed. therefore explore theoretical insights devise...

10.1145/1645953.1646063 article EN 2009-11-02

Correlating financial time series with micro-blogging activity

OPENALEX - Publications

Eduardo Ruiz Vagelis Hristidis Carlos Castillo Aristides Gionis Alejandro Jaimes

We study the problem of correlating micro-blogging activity with stock-market events, defined as changes in price and traded volume stocks. Specifically, we collect messages related to a number companies, search for correlations between events those companies features extracted from messages. The extract can be categorized two groups. Features first group measure overall platform, such posts, re-posts, so on. second properties an induced interaction graph, instance, connected components,...

10.1145/2124295.2124358 article EN 2012-02-08

Political Discourse on Social Media

OPENALEX - Publications

Kiran Garimella Gianmarco De Francisci Morales Aristides Gionis Michael Mathioudakis

Echo chambers, i.e., situations where one is exposed only to opinions that agree with their own, are an increasing concern for the political discourse in many democratic countries. This paper studies phenomenon of echo chambers on social media. We identify two components phenomenon: opinion shared, and »chamber» (i.e., network) allows »echo» be re-shared -- examine closely at how these interact. define a production consumption measure social-media users, which captures leaning content shared...

10.1145/3178876.3186139 article EN 2018-01-01

k-means–: A unified approach to clustering and outlier detection

OPENALEX - Publications

Sanjay Chawla Aristides Gionis

We present a unified approach for simultaneously clustering and discovering outliers in data. Our is formalized as generalization of the k-MEANS problem. prove that problem NP-hard then practical polynomial time algorithm, which guaranteed to converge local optimum. Furthermore we extend our all distance measures can be expressed form Bregman divergence. Experiments on synthetic real datasets demonstrate effectiveness utility carrying out both outlier detection concurrent manner. In...

10.1137/1.9781611972832.21 article EN 2013-05-02

Finding interesting associations without support pruning

OPENALEX - Publications

Edith Cohen Mayur Datar S. Fujiwara Aristides Gionis Piotr Indyk and 3 more

Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, well-known a priori algorithm is only effective when rules interest are relationships that occur very frequently. However, there number applications, such as data mining, identification similar Web documents, clustering, and collaborative filtering, where have comparatively few instances in data. these cases, we must look for highly correlated items, or possibly even...

10.1109/69.908981 article EN IEEE Transactions on Knowledge and Data Engineering 2001-01-01

The Discrete Basis Problem

OPENALEX - Publications

Pauli Miettinen Taneli Mielikäinen Aristides Gionis Gautam Das Heikki Mannila

Matrix decomposition methods represent a data matrix as product of two factor matrices: one containing basis vectors that meaningful concepts in the data, and another describing how observed can be expressed combinations vectors. Decomposition have been studied extensively, but many return real-valued matrices. Interpreting matrices is hard if original Boolean. In this paper, we describe formulation for Boolean Discrete Basis Problem. The problem seeks binary matrix, thus allowing user to...

10.1109/tkde.2008.53 article EN IEEE Transactions on Knowledge and Data Engineering 2008-08-27

Maintaining stream statistics over sliding windows: (extended abstract)

OPENALEX - Publications

Mayur Datar Aristides Gionis Piotr Indyk Rajeev Motwani

We consider the problem of maintaining aggregates and statistics over data streams, with respect to last N elements seen so far. refer this model as sliding window model. following basic problem: Given a stream bits, maintain count number 1's in from stream. show that using O(1/e log2N) bits memory, we can estimate within factor 1 + e. also give matching lower bound Ω(1/e memory for any deterministic or randomized algorithms. extend our scheme sum positive integers. provide upper bounds more...

10.5555/545381.545466 article EN Symposium on Discrete Algorithms 2002-01-06

k-nearest neighbors in uncertain graphs

OPENALEX - Publications

Michalis Potamias Francesco Bonchi Aristides Gionis George Kollios

Complex networks, such as biological, social, and communication often entail uncertainty, thus, can be modeled probabilistic graphs . Similar to the problem of similarity search in standard graphs, a fundamental for is efficiently answer k-nearest neighbor queries ( k -NN), which computing closest nodes some specific node. In this paper we introduce framework processing -NN graphs. We propose novel distance functions that extend well-known graph concepts, shortest paths. order compute them...

10.14778/1920841.1920967 article EN Proceedings of the VLDB Endowment 2010-09-01

Assessing data mining results via swap randomization

OPENALEX - Publications

Aristides Gionis Heikki Mannila Taneli Mielikäinen Panayiotis Tsaparas

The problem of assessing the significance data mining results on high-dimensional 0--1 datasets has been studied extensively in literature. For problems such as frequent sets and finding correlations, testing can be done by standard statistical tests chi-square, or other methods. However, depend only specific attributes not dataset a whole. Moreover, are difficult to apply patterns complex algorithms. In this article, we consider simple randomization technique that deals with shortcoming....

10.1145/1297332.1297338 article EN ACM Transactions on Knowledge Discovery from Data 2007-12-01

The impact of caching on search engines

OPENALEX - Publications

Ricardo Baeza‐Yates Aristides Gionis Flavio Junqueira Vanessa Murdock Vassilis Plachouras and 1 more

In this paper we study the trade-offs in designing efficient caching systems for Web search engines. We explore impact of different approaches, such as static vs. dynamic caching, and query results vs.caching posting lists. Using a log spanning whole year limitations demonstrate that lists can achieve higher hit rates than answers. propose new algorithm lists, which outperforms previous methods. also problem finding optimal way to split cache between answers Finally, measure how changes...

10.1145/1277741.1277775 article EN 2007-07-23

XTRACT

OPENALEX - Publications

Minos Garofalakis Aristides Gionis Rajeev Rastogi S. Seshadri Kyuseok Shim

XML is rapidly emerging as the new standard for data representation and exchange on Web. An document can be accompanied by a Document Type Descriptor (DTD) which plays role of schema an collection. DTDs contain valuable information structure documents thus have crucial in efficient storage data, well effective formulation optimization queries. In this paper, we propose XTRACT, novel system inferring DTD database documents. Since syntax incorporates full expressive power regular expressions ,...

10.1145/335191.335409 article EN ACM SIGMOD Record 2000-05-16

Coming Soon ...