NFDI4DS | UHH-SEMS - Publication Details

Gianmaria Silvello

ORCID: 0000-0003-4970-4554

About

Contact & Profiles

OpenAlex author ID: A5078254809

Research Areas

Semantic Web and Ontologies
Scientific Computing and Data Management
Data Quality and Management
Topic Modeling
Research Data Management Practices
Biomedical Text Mining and Ontologies
Information Retrieval and Search Behavior
Advanced Database Systems and Queries
Web Data Mining and Analysis
Natural Language Processing Techniques
Library Science and Information Systems
Digital and Traditional Archives Management
Data Visualization and Analytics
Digital Humanities and Scholarship
Advanced Text Analysis Techniques
Data Management and Algorithms
Bioinformatics and Genomic Networks
Advanced Data Storage Technologies
Genetics, Bioinformatics, and Biomedical Research
Image Retrieval and Classification Techniques
Advanced Image and Video Retrieval Techniques
Neural Networks and Applications
Data Mining Algorithms and Applications
Gene expression and cancer classification
AI in cancer detection

University of Padua
2016-2025

National Research Institute of Brewing
2021

Theory and practice of data citation

OPENALEX - Publications

Gianmaria Silvello

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming “data‐intensive,” where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated data sets. Yet, given a data set, there is no quantitative, consistent, and established way of knowing how it has been used over time, who contributed...

10.1002/asi.23917 article EN Journal of the Association for Information Science and Technology 2017-09-19

Algorithmic fairness datasets: the story so far

Alessandro Fabris S. Messina Gianmaria Silvello Gian Antonio Susto

Data-driven algorithms are studied in diverse domains to support critical decisions, directly impacting people's well-being. As a result, a growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of risks and opportunities of automated decision-making for historically disadvantaged populations. Progress in fair Machine Learning hinges on data, which can be appropriately used only if adequately documented. Unfortunately, the algorithmic...

10.1007/s10618-022-00854-z article EN cc-by Data Mining and Knowledge Discovery 2022-09-17

Unleashing the potential of digital pathology data by training computer-aided diagnosis models without human annotations

Niccolò Marini Stefano Marchesin Sebastian Otálora Marek Wodziński Alessandro Caputo and 15 more

The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for large datasets required to train robust computer-aided diagnosis methods that can target the high variability of cases and data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital...

10.1038/s41746-022-00635-4 article EN cc-by npj Digital Medicine 2022-07-22

The ESW of Wikidata: Exploratory search workflows on Knowledge Graphs

Matteo Lissandrini G. Prando Gianmaria Silvello

10.1016/j.websem.2024.100860 article EN cc-by Journal of Web Semantics 2025-01-05

Gender stereotype reinforcement: Measuring the gender bias conveyed by ranking algorithms

Alessandro Fabris Alberto Purpura Gianmaria Silvello Gian Antonio Susto

10.1016/j.ipm.2020.102377 article EN Information Processing & Management 2020-09-03

Semantic representation and enrichment of information retrieval experimental data

Gianmaria Silvello Georgeta Bordea Nicola Ferro Paul Buitelaar Toine Bogers

10.1007/s00799-016-0172-8 article EN International Journal on Digital Libraries 2016-05-28

A General Linear Mixed Models Approach to Study System Component Effects

Nicola Ferro Gianmaria Silvello

Topic variance has a greater effect on performances than system variance, but it cannot be controlled by system developers, who can only try to cope with it. On the other hand, system variance is important on its own, since it is what system developers may affect directly by changing system components and it determines the differences among systems. In this paper, we face the problem of studying system variance in order to better understand how much system components contribute to overall performances. To this end, we propose a methodology based on the General Linear Mixed Model (GLMM) to develop statistical models able to isolate...

10.1145/2911451.2911530 article EN Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016-07-07

3.5K runs, 5K topics, 3M assessments and 70M measures: What trends in 10 years of Adhoc-ish CLEF?

Nicola Ferro Gianmaria Silvello

10.1016/j.ipm.2016.08.001 article EN Information Processing & Management 2016-08-15

Testing software for non-discrimination: an updated and extended audit in the Italian car insurance domain

Marco Rondina Antonio Vetrò Riccardo Coppola Oumaima Regragrui Alessandro Fabris and 3 more

Context. As software systems become more integrated into society's infrastructure, the responsibility of software professionals to ensure compliance with various non-functional requirements increases. These requirements include security, safety, privacy, and, increasingly, non-discrimination. Motivation. Fairness in pricing algorithms grants equitable access to basic services without discriminating on the basis of protected attributes. Method. We replicate a previous empirical study that used black box testing to audit pricing algorithms by...

10.48550/arxiv.2502.06439 preprint EN arXiv (Cornell University) 2025-02-10

Experience: Bridging Data Measurement and Ethical Challenges with Extended Data Briefs

Marco Rondina Antonio Vetrò Alessandro Fabris Gianmaria Silvello Gian Antonio Susto and 2 more

To promote the responsible development and use of data-driven technologies (such as machine learning and artificial intelligence), principles of trustworthiness, accountability and fairness should be followed. The quality of the dataset on which these applications rely is crucial to achieve compliance with the required ethical principles. Quantitative approaches to measure data quality are abundant in the literature and among practitioners; however, they are not sufficient to cover all the ethical challenges involved. In this paper, we show that...

10.1145/3726872 article EN Journal of Data and Information Quality 2025-03-29

Can We Measure the Impact of a Database?

Peter Buneman Dennis Dosso Matteo Lissandrini Gianmaria Silvello He Sun

If we want to measure the impact of a database, can we use its organization to treat it the same way as any other publishing agent, such as a journal or an author?

10.1145/3704723 article EN Communications of the ACM 2025-04-16

Digital library interoperability at high level of abstraction

Maristella Agosti Nicola Ferro Gianmaria Silvello

10.1016/j.future.2015.09.020 article EN Future Generation Computer Systems 2015-10-09

TBGA: a large-scale Gene-Disease Association dataset for Biomedical Relation Extraction

Stefano Marchesin Gianmaria Silvello

Databases are fundamental to advance biomedical science. However, most of them are populated and updated with a great deal of human effort. Biomedical Relation Extraction (BioRE) aims to shift this burden to machines. Among its different applications, the discovery of Gene-Disease Associations (GDAs) is one of the most relevant BioRE tasks. Nevertheless, few resources have been developed to train models for GDA extraction. Besides, these resources are all limited in size, preventing models from scaling effectively to large amounts of data.

10.1186/s12859-022-04646-6 article EN cc-by BMC Bioinformatics 2022-03-31

Toward an anatomy of IR system component performances

Nicola Ferro Gianmaria Silvello

Information retrieval (IR) systems are the prominent means for searching and accessing huge amounts of unstructured information on the web and elsewhere. They are complex systems, constituted by many different components interacting together, and evaluation is crucial to both tune and improve them. Nevertheless, in the current evaluation methodology, there is still no way to determine how much each component contributes to the overall performances and how the components interact together. This hampers the possibility of a deep understanding of IR system behavior and,...

10.1002/asi.23910 article EN Journal of the Association for Information Science and Technology 2017-11-17

Focal elements of neural information retrieval models. An outlook through a reproducibility study

Stefano Marchesin Alberto Purpura Gianmaria Silvello

This paper analyzes two state-of-the-art Neural Information Retrieval (NeuIR) models: the Deep Relevance Matching Model (DRMM) and the Neural Vector Space Model (NVSM). Our contributions include: (i) a reproducibility study of supervised and unsupervised NeuIR models, where we present the issues encountered during their reproducibility; (ii) a performance comparison with other lexical and semantic models, showing that traditional lexical models are still highly competitive with DRMM and NVSM; (iii) an application of NVSM on collections from...

10.1016/j.ipm.2019.102109 article EN cc-by-nc-nd Information Processing & Management 2019-09-13

Learning Unsupervised Knowledge-Enhanced Representations to Reduce the Semantic Gap in Information Retrieval

Maristella Agosti Stefano Marchesin Gianmaria Silvello

The semantic mismatch between query and document terms (i.e., the semantic gap) is a long-standing problem in Information Retrieval (IR). Two main linguistic features related to the semantic gap that can be exploited to improve retrieval are synonymy and polysemy. Recent works integrate knowledge from curated external resources into the learning process of neural language models to reduce the effect of the semantic gap. However, these knowledge-enhanced language models have been used in IR mostly for re-ranking and not directly for document retrieval. We propose the Semantic-Aware Neural...

10.1145/3417996 article EN ACM Transactions on Information Systems 2020-09-11

Empowering digital pathology applications through explainable knowledge extraction tools

Stefano Marchesin Fabio Giachelle Niccolò Marini Manfredo Atzori Svetla Boytcheva and 9 more

Exa-scale volumes of medical data have been produced for decades. In most cases, the diagnosis is reported in free text, encoding medical knowledge that is still largely unexploited. In order to allow decoding the medical knowledge included in reports, we propose an unsupervised knowledge extraction system combining a rule-based expert system with pre-trained Machine Learning (ML) models, namely the Semantic Knowledge Extractor Tool (SKET). Combining rule-based techniques and pre-trained ML models provides high accuracy results for knowledge extraction. This work demonstrates the viability...

10.1016/j.jpi.2022.100139 article EN cc-by-nc-nd Journal of Pathology Informatics 2022-01-01

A Novel Curated Scholarly Graph Connecting Textual and Data Publications

Ornella Irrera Andrea Mannocci Paolo Manghi Gianmaria Silvello

In the last decade, scholarly graphs became fundamental to storing and managing scholarly knowledge in a structured and machine-readable way. Methods and tools for discovery and impact assessment of science rely on such graphs and their quality to serve scientists, policymakers, and publishers. Since research data became very important in scholarly communication, scholarly graphs started including dataset metadata and their relationships to publications. Such graphs are the foundations for Open Science investigations, data-article publishing workflows, discovery, and assessment indicators. However, due...

10.1145/3597310 article EN Journal of Data and Information Quality 2023-05-19

A Methodology for Citing Linked Open Data Subsets

Gianmaria Silvello

In this paper we discuss the problem of data citation with a specific focus on Linked Open Data. We outline the main requirements a data citation methodology must fulfill: (i) uniquely identify the cited objects; (ii) provide descriptive metadata; (iii) enable variable granularity citations; and (iv) produce both human- and machine-readable references. We propose a methodology based on named graphs and RDF quad semantics that allows us to create citation meta-graphs respecting the outlined requirements. We also present a compelling use case based on search...

10.1045/january2015-silvello article EN D-Lib Magazine 2015-01-01

VIRTUE: A visual tool for information retrieval performance evaluation and failure analysis

Marco Angelini Nicola Ferro Giuseppe Santucci Gianmaria Silvello

10.1016/j.jvlc.2013.12.003 article EN Journal of Visual Languages & Computing 2014-01-06