Gianmaria Silvello

ORCID: 0000-0003-4970-4554
Research Areas
  • Semantic Web and Ontologies
  • Scientific Computing and Data Management
  • Data Quality and Management
  • Topic Modeling
  • Research Data Management Practices
  • Biomedical Text Mining and Ontologies
  • Information Retrieval and Search Behavior
  • Advanced Database Systems and Queries
  • Web Data Mining and Analysis
  • Natural Language Processing Techniques
  • Library Science and Information Systems
  • Digital and Traditional Archives Management
  • Data Visualization and Analytics
  • Digital Humanities and Scholarship
  • Advanced Text Analysis Techniques
  • Data Management and Algorithms
  • Bioinformatics and Genomic Networks
  • Advanced Data Storage Technologies
  • Genetics, Bioinformatics, and Biomedical Research
  • Image Retrieval and Classification Techniques
  • Advanced Image and Video Retrieval Techniques
  • Neural Networks and Applications
  • Data Mining Algorithms and Applications
  • Gene expression and cancer classification
  • AI in cancer detection

University of Padua
2016-2025

National Research Institute of Brewing
2021

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming “data‐intensive,” where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated data sets. Yet, given a data set, there is no quantitative, consistent, and established way of knowing how it has been used over time, who contributed...

10.1002/asi.23917 article EN Journal of the Association for Information Science and Technology 2017-09-19

Data-driven algorithms are studied in diverse domains to support critical decisions, directly impacting people's well-being. As a result, a growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of the risks and opportunities of automated decision-making for historically disadvantaged populations. Progress in fair Machine Learning hinges on data, which can be appropriately used only if adequately documented. Unfortunately, algorithmic...

10.1007/s10618-022-00854-z article EN cc-by Data Mining and Knowledge Discovery 2022-09-17

The digitalization of clinical workflows and the increasing performance of deep learning algorithms are paving the way towards new methods for tackling cancer diagnosis. However, the availability of medical specialists to annotate digitized images and free-text diagnostic reports does not scale with the need for the large datasets required to train robust computer-aided diagnosis methods that can target the high variability of clinical cases and data produced. This work proposes and evaluates an approach to eliminate the need for manual annotations to train computer-aided diagnosis tools in digital...

10.1038/s41746-022-00635-4 article EN cc-by npj Digital Medicine 2022-07-22

Topic variance has a greater effect on performances than system variance, but it cannot be controlled by system developers, who can only try to cope with it. On the other hand, system variance is important on its own, since it is what system developers may affect directly by changing system components and it determines the differences among systems. In this paper, we face the problem of studying system variance in order to better understand how much system components contribute to overall performances. To this end, we propose a methodology based on the General Linear Mixed Model (GLMM) to develop statistical models able to isolate...

10.1145/2911451.2911530 article EN Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval 2016-07-07

Context. As software systems become more integrated into society's infrastructure, the responsibility of software professionals to ensure compliance with various non-functional requirements increases. These requirements include security, safety, privacy, and, increasingly, non-discrimination. Motivation. Fairness in pricing algorithms grants equitable access to basic services without discriminating on the basis of protected attributes. Method. We replicate a previous empirical study that used black box testing to audit pricing algorithms by...

10.48550/arxiv.2502.06439 preprint EN arXiv (Cornell University) 2025-02-10

To promote the responsible development and use of data-driven technologies, such as machine learning and artificial intelligence, principles of trustworthiness, accountability and fairness should be followed. The quality of the dataset on which these applications rely is crucial to achieve compliance with the required ethical principles. Quantitative approaches to measure data quality are abundant in the literature and among practitioners; however, they are not sufficient to cover all the challenges involved. In this paper, we show that...

10.1145/3726872 article EN Journal of Data and Information Quality 2025-03-29

If we want to measure the impact of a database, can we use its organization to treat it the same way as any other publishing agent, such as a journal or an author?

10.1145/3704723 article EN Communications of the ACM 2025-04-16

Databases are fundamental to advance biomedical science. However, most of them are populated and updated with a great deal of human effort. Biomedical Relation Extraction (BioRE) aims to shift this burden to machines. Among its different applications, the discovery of Gene-Disease Associations (GDAs) is one of the most relevant BioRE tasks. Nevertheless, few resources have been developed to train models for GDA extraction. Besides, these resources are all limited in size, preventing models from scaling effectively to large amounts of data.

10.1186/s12859-022-04646-6 article EN cc-by BMC Bioinformatics 2022-03-31

Information retrieval (IR) systems are the prominent means for searching and accessing huge amounts of unstructured information on the web and elsewhere. They are complex systems, constituted by many different components interacting together, and evaluation is crucial to both tune and improve them. Nevertheless, in the current evaluation methodology, there is still no way to determine how much each component contributes to the overall performances and how the components interact together. This hampers the possibility of a deep understanding of IR system behavior and,...

10.1002/asi.23910 article EN Journal of the Association for Information Science and Technology 2017-11-17

This paper analyzes two state-of-the-art Neural Information Retrieval (NeuIR) models: the Deep Relevance Matching Model (DRMM) and the Neural Vector Space Model (NVSM). Our contributions include: (i) a reproducibility study of two supervised and unsupervised NeuIR models, where we present the issues we encountered during their reproducibility; (ii) a performance comparison with other lexical and semantic models, showing that traditional lexical models are still highly competitive with DRMM and NVSM; (iii) an application of NVSM on collections from...

10.1016/j.ipm.2019.102109 article EN cc-by-nc-nd Information Processing & Management 2019-09-13

The semantic mismatch between query and document terms, i.e., the semantic gap, is a long-standing problem in Information Retrieval (IR). Two main linguistic features related to the semantic gap that can be exploited to improve retrieval are synonymy and polysemy. Recent works integrate knowledge from curated external resources into the learning process of neural language models to reduce the effect of the semantic gap. However, these knowledge-enhanced language models have been used in IR mostly for re-ranking and not directly for document retrieval. We propose Semantic-Aware Neural...

10.1145/3417996 article EN ACM Transactions on Information Systems 2020-09-11

Exa-scale volumes of medical data have been produced for decades. In most cases, the diagnosis is reported in free text, encoding medical knowledge that is still largely unexploited. In order to allow decoding the medical knowledge included in reports, we propose an unsupervised knowledge extraction system combining a rule-based expert system with pre-trained Machine Learning (ML) models, namely the Semantic Knowledge Extractor Tool (SKET). Combining rule-based techniques and pre-trained ML models provides high accuracy results for knowledge extraction. This work demonstrates the viability...

10.1016/j.jpi.2022.100139 article EN cc-by-nc-nd Journal of Pathology Informatics 2022-01-01

In the last decade, scholarly graphs became fundamental to storing and managing scholarly knowledge in a structured and machine-readable way. Methods and tools for discovery and impact assessment of science rely on such graphs and their quality to serve scientists, policymakers, and publishers. Since research data became very important in scholarly communication, scholarly graphs started including dataset metadata and their relationships to publications. Such graphs are the foundations for Open Science investigations, data-article publishing workflows, discovery, and assessment indicators. However, due...

10.1145/3597310 article EN Journal of Data and Information Quality 2023-05-19

In this paper we discuss the problem of data citation with a specific focus on Linked Open Data. We outline the main requirements a data citation methodology must fulfill: (i) uniquely identify the cited objects; (ii) provide descriptive metadata; (iii) enable variable granularity citations; and (iv) produce both human- and machine-readable references. We propose a methodology based on named graphs and RDF quad semantics that allows us to create citation meta-graphs respecting the outlined requirements. We also present a compelling use case based on search...

10.1045/january2015-silvello article EN D-Lib Magazine 2015-01-01