Franco Maria Nardini

ORCID: 0000-0003-3183-334X
Research Areas
  • Data Management and Algorithms
  • Topic Modeling
  • Advanced Image and Video Retrieval Techniques
  • Recommender Systems and Techniques
  • Information Retrieval and Search Behavior
  • Human Mobility and Location-Based Analysis
  • Web Data Mining and Analysis
  • Image Retrieval and Classification Techniques
  • Machine Learning and Data Classification
  • Machine Learning and Algorithms
  • Neural Networks and Applications
  • Text and Document Classification Technologies
  • Data Mining Algorithms and Applications
  • Data Quality and Management
  • Domain Adaptation and Few-Shot Learning
  • Natural Language Processing Techniques
  • Advanced Graph Neural Networks
  • Data Stream Mining Techniques
  • Algorithms and Data Compression
  • Advanced Database Systems and Queries
  • Speech and Dialogue Systems
  • Caching and Content Delivery
  • Complex Network Analysis Techniques
  • Imbalanced Data Classification Techniques
  • Explainable Artificial Intelligence (XAI)

Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo"
2016-2025

University of Pisa
2024

Institute of Scientific and Technical Information of China
2011-2023

National Research Council
2014-2023

University of Bologna
2001-2023

Lancaster University
2023

University of Sannio
2023

Università degli Studi eCampus
2023

Institute of Informatics and Telematics
2023

Software (Spain)
2023

Deep pretrained transformer networks are effective at various ranking tasks, such as question answering and ad-hoc document ranking. However, their computational expenses deem them cost-prohibitive in practice. Our proposed approach, called PreTTR (Precomputing Transformer Term Representations), considerably reduces the query-time latency of deep transformer networks (up to a 42x speedup on web ranking), making these networks more practical to use in a real-time scenario. Specifically, we precompute part of the term representations...

10.1145/3397271.3401093 preprint EN 2020-07-25

The identification of relevance with little textual context is a primary challenge in passage retrieval. We address this problem with a representation-based ranking approach that: (1) explicitly models the importance of each term using a contextualized language model; (2) performs expansion by propagating that importance to similar terms; and (3) grounds the representations in the lexicon, making them interpretable. Passage representations can be pre-computed at index time to reduce query-time latency. We call our approach EPIC (Expansion via Prediction...

10.1145/3397271.3401262 preprint EN 2020-07-25

In this paper we propose TripBuilder, a new framework for personalized touristic tour planning. We mine from Flickr the information about the actual itineraries followed by a multitude of different tourists, and match these on the Points of Interest available from Wikipedia. The task of planning tours is then modeled as an instance of the Generalized Maximum Coverage problem. Wisdom-of-the-crowds information allows us to derive plans that maximize a measure of interest for the tourist given her preferences and visiting time-budget. Experimental...

10.1145/2505515.2505643 article EN 2013-10-27

Learning-to-Rank models based on additive ensembles of regression trees have been proven to be very effective for scoring query results returned by large-scale Web search engines. Unfortunately, the computational cost of scoring thousands of candidate documents by traversing large ensembles of trees is high. Thus, several works have investigated solutions aimed at improving the efficiency of document scoring by exploiting advanced features of modern CPUs and memory hierarchies. In this article, we present QuickScorer, a new algorithm that adopts...

10.1145/2987380 article EN ACM Transactions on Information Systems 2016-12-12

Learning-to-Rank models based on additive ensembles of regression trees have proven to be very effective for ranking query results returned by Web search engines, a scenario where quality and efficiency requirements are demanding. Unfortunately, the computational cost of these models is high. Thus, several works have already proposed solutions aiming at improving the efficiency of the scoring process by dealing with the features and peculiarities of modern CPUs and memory hierarchies. In this paper, we present QuickScorer, a new algorithm that...

10.1145/2766462.2767733 article EN Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval 2015-08-04

In this paper we analyze the efficiency of various search results diversification methods. While the efficacy of diversification approaches has been deeply investigated in the past, response time and scalability issues have rarely been addressed. A unified framework for studying the performance and feasibility of result diversification solutions is thus proposed. First we define a new methodology for detecting when, and how, query results need to be diversified. To this purpose, we rely on the concept of "query refinement" to estimate the probability of a query being ambiguous. Then, relying on this novel...

10.14778/1988776.1988781 article EN Proceedings of the VLDB Endowment 2011-04-01

Learning to Rank (LtR) is the machine learning method of choice for producing high quality document ranking functions from a ground-truth of training examples. In practice, efficiency and effectiveness are intertwined concepts, and trading off effectiveness for meeting the efficiency constraints typically existing in large-scale systems is one of the most urgent issues. In this paper we propose a new framework, named CLEaVER, for optimizing machine-learned ranking models based on ensembles of regression trees. The goal is to improve efficiency at document scoring time without affecting...

10.1145/2911451.2914763 article EN Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016-07-07

Learned sparse representations form an attractive class of contextual embeddings for text retrieval. That is so because they are effective models of relevance and are interpretable by design. Despite their apparent compatibility with inverted indexes, however, retrieval over sparse embeddings remains challenging. That is due to the distributional differences between learned embeddings and term frequency-based lexical models of relevance such as BM25. Recognizing this challenge, a great deal of research has gone into, among other things, designing retrieval algorithms...

10.1145/3626772.3657769 preprint EN arXiv (Cornell University) 2024-04-29

This monograph takes a step towards promoting the study of efficiency in the era of neural information retrieval by offering a comprehensive survey of the literature on efficiency and effectiveness in ranking and, to a limited extent, retrieval. This monograph was inspired by the parallels that exist between the challenges in neural network-based ranking solutions and their predecessors, decision forest-based learning to rank models, as well as the connections between the solutions the literature to date has to offer. We believe that by understanding the fundamentals underpinning these algorithmic and data structure solutions for containing...

10.1561/1500000071 article EN Foundations and Trends® in Information Retrieval 2023-01-01

In this article, we tackle the problem of predicting the “next” geographical position of a tourist, given her history (i.e., the prediction is done according to the tourist’s current trail), by means of supervised learning techniques, namely Gradient Boosted Regression Trees and Ranking SVM. The learning is done on the basis of an object space represented by a 68-dimension feature vector specifically designed for tourism-related data. Furthermore, we propose a thorough comparison of several methods that are considered state-of-the-art in...

10.1145/2766459 article EN ACM Transactions on Intelligent Systems and Technology 2015-10-09

Maximum inner product search (MIPS) over dense and sparse vectors has progressed independently in a bifurcated literature for decades; the latter is better known as top-k retrieval in Information Retrieval. This duality exists because sparse and dense vectors serve different end goals. That is despite the fact that they are manifestations of the same mathematical problem. In this work, we ask if algorithms for dense vectors could be applied effectively to sparse vectors, particularly those that violate the assumptions underlying top-k retrieval methods. We study...

10.1145/3665324 article EN ACM Transactions on Information Systems 2024-08-19

In this paper, we tackle the problem of predicting the "next" geographical position of a tourist given her history (i.e., the prediction is done according to the tourist's current trail) by means of supervised learning techniques, namely Gradient Boosted Regression Trees and Ranking SVM. The learning is done on the basis of an object space represented by a 68-dimension feature vector, specifically designed for tourism-related data. Furthermore, we propose a thorough comparison of several methods that are considered state-of-the-art in...

10.1145/2505515.2505656 article EN 2013-10-27

Scoring documents with learning-to-rank (LtR) models based on large ensembles of regression trees is currently deemed one of the best solutions to effectively rank query results to be returned by large-scale Information Retrieval systems. This paper investigates the opportunities given by the SIMD capabilities of modern CPUs to the end of efficiently evaluating regression tree ensembles. We propose V-QuickScorer (vQS), which exploits SIMD extensions to vectorize document scoring, i.e., to perform the ensemble traversal by evaluating multiple documents simultaneously. We provide a...

10.1145/2911451.2914758 article EN Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016-07-07

In a conversational context, a user expresses her multi-faceted information need as a sequence of natural-language questions, i.e., utterances. Starting from a given topic, the conversation evolves through user utterances and system replies. The retrieval of documents relevant to a given utterance in a conversation is challenging due to the ambiguity of natural language and to the difficulty of detecting possible topic shifts and semantic relationships among utterances. We adopt the 2019 TREC Conversational Assistant Track (CAsT) framework to experiment with a modular...

10.1145/3397271.3401268 preprint EN 2020-07-25

Recent studies in Learning to Rank have shown the possibility to effectively distill a neural network from an ensemble of regression trees. This result leads neural networks to become a natural competitor of tree-based ensembles on the ranking task. Nevertheless, ensembles of regression trees outperform neural models both in terms of efficiency and effectiveness, particularly when scoring on CPU. In this paper, we propose an approach for speeding up neural scoring time by applying a combination of Distillation, Pruning and Fast Matrix multiplication. We employ knowledge...

10.1109/tkde.2022.3152585 article EN publisher-specific-oa IEEE Transactions on Knowledge and Data Engineering 2022-01-01

Approximate Nearest Neighbors (ANN) search is a crucial task in several applications like recommender systems and information retrieval. Current state-of-the-art ANN libraries, although performance-oriented, often lack modularity and ease of use. This makes them not fully suitable for easy prototyping and testing of research ideas, an important feature to enable. We address these limitations by introducing kANNolo, a novel research-oriented ANN library written in Rust and explicitly designed to combine...

10.48550/arxiv.2501.06121 preprint EN arXiv (Cornell University) 2025-01-10

Learned sparse text embeddings have gained popularity due to their effectiveness in top-k retrieval and inherent interpretability. Their distributional idiosyncrasies, however, have long hindered their use in real-world retrieval systems. That changed with the recent development of approximate algorithms that leverage the distributional properties of sparse embeddings to speed up retrieval. Nonetheless, in much of the existing literature, evaluation has been limited to datasets with only a few million documents such as MSMARCO. It remains unclear how these systems behave on...

10.48550/arxiv.2501.11628 preprint EN arXiv (Cornell University) 2025-01-20