- Semantic Web and Ontologies
- Cultural Insights and Digital Impacts
- Scientometrics and Bibliometrics Research
- Advanced Text Analysis Techniques
- Topic Modeling
- Social Sciences and Governance
- Academic Publishing and Open Access
- Web Data Mining and Analysis
- Complex Network Analysis Techniques
- Data Management and Algorithms
- Social Media and Politics
- Academic Integrity and Plagiarism
- Geographic Information Systems Studies
- Research Data Management Practices
- Web Visibility and Informetrics
- Sentiment Analysis and Opinion Mining
- Education, Sociology, and Vocational Training
- Scientific Computing and Data Management
- Misinformation and Its Impacts
- Data Quality and Management
- Biomedical Text Mining and Ontologies
- Linguistics and Discourse Analysis
- Information Retrieval and Search Behavior
- Expert Finding and Q&A Systems
- Big Data and Business Intelligence
Institut de Recherche en Informatique de Toulouse
2016-2025
Université de Toulouse
2016-2025
Université Toulouse III - Paul Sabatier
2016-2025
Université Toulouse - Jean Jaurès
2016-2025
Centre National de la Recherche Scientifique
2015-2025
Université Toulouse-I-Capitole
2016-2025
Institut Polytechnique de Bordeaux
2016-2025
Institut Universitaire de France
2017-2025
Laboratoire d'Informatique de Paris-Nord
2013-2024
Linköping University
2023
Untargeted metabolomics using liquid chromatography-mass spectrometry (LC-MS) is currently the gold-standard technique to determine the full chemical diversity in biological samples. However, this approach still has many limitations; notably, the difficulty of accurately estimating the number of unique metabolites profiled among the thousands of MS ion signals arising from chromatograms. Here, we describe a new workflow, MS-CleanR, based on the MS-DIAL/MS-FINDER suite, which tackles feature degeneracy and improves...
In 2014 leading publishers withdrew more than 120 nonsensical publications automatically generated with the SCIgen program. Casual observations suggested that similar problematic papers are still published and sold, without follow‐up retractions. No systematic screening has been performed and the prevalence of such papers in the scientific literature is unknown. Our contribution is 2‐fold. First, we designed a detector that combs the scientific literature for grammar‐based computer‐generated papers. Applied to SCIgen, it has 83.6%...
Preprints promote the open and fast communication of non-peer reviewed work. Once a preprint is published in a peer-reviewed venue, the preprint server updates its web page: a prominent hyperlink leading to the newly published work is added. Linking preprints to publications is of utmost importance as it provides readers with the latest version of a now certified work. Yet preprint servers fail to identify all existing preprint-publication links. This limitation calls for a more thorough approach to this critical information retrieval task: overlooking evidence...
Nucleotide sequence reagents underpin molecular techniques that have been applied across hundreds of thousands of publications. We previously reported wrongly identified nucleotide sequence reagents in human research publications and described a semi-automated screening tool, Seek & Blastn, to fact-check their claimed status. We applied Seek & Blastn to screen >11,700 publications across five literature corpora, including all original publications in Gene from 2007 to 2018 and all original open-access publications in Oncology Reports from 2014 to 2018. After manually checking the Seek & Blastn outputs for >3,400 articles, we...
Reproducible laboratory research relies on correctly identified reagents. We have previously described gene research papers with wrongly identified nucleotide sequence(s), including papers studying miR‐145. Manually verifying reagent identities in 36 recent miR‐145 papers found that 56% and 17% of papers described misidentified nucleotide sequences and cell lines, respectively. We also found 5 cell line identifiers in these papers, and 18 cell line identifiers published elsewhere, that did not represent indexed human cell lines. These 23 cell line identifiers were described as non‐verifiable (NV), as their identities were unclear. Studying 420 papers that mentioned 8 NV...
We report evidence of an undocumented method to manipulate citation counts involving “sneaked” references. Sneaked references are registered as metadata for published scientific articles in which they do not appear. This manipulation exploits trusted relationships between various actors: publishers, the Crossref metadata registration agency, digital libraries, and bibliometric platforms. By collecting metadata from various sources, we show that extra undue references are actually sneaked in at Digital Object Identifier (DOI)...
Open mass spectral libraries (OMSLs) are critical for metabolite annotation and machine learning, especially given the rising volume of untargeted metabolomic studies and the development of annotation pipelines. Despite their importance, the practical application of OMSLs is hampered by the lack of standardized file formats, metadata fields, and supporting ontology. Current libraries, often restricted to specific topics or matrices, such as natural products, lipids, or the human metabolome, may limit the discovery potential of untargeted studies. The goal...
We report evidence of a new set of sneaked references discovered in the scientific literature. Sneaked references are references registered in the metadata of publications without being listed in the reference section or in the full text of the actual publications where they ought to be found. We document here 80,205 references sneaked in the metadata of the International Journal of Innovative Science and Research Technology (IJISRT). These sneaked references are registered with Crossref and all cite, and thus benefit, this same journal. Using this dataset, we evaluate three different methods to automatically identify sneaked references. These methods compare lists of references against...
A small number of unscrupulous people, concerned by their career prospects, resort to paper mill services to publish articles in renowned journals and conference proceedings. These include patchworks of synonymized contents using paraphrasing tools, featuring tortured phrases, increasingly polluting the scientific literature. The Problematic Paper Screener (PPS) has been developed to allow (re)assessment of articles on PubPeer. Since most of the known tortured phrases are found in publications in science, technology, engineering,...
One of the cornerstones of publication integrity is the thorough maintenance of the scientific record to ensure the trustworthiness of its content. This includes strict and transparent record-keeping when implementing post-publication changes through a clearly visible corrigendum or erratum, which provides details of the changes and the reasons for them (ICMJE 2024). However, such record-keeping is not always practised, as stealth changes, changes to the published literature without any accompanying note, have been observed. A notable kind of stealth change is retraction: published papers...
In this article, the authors identify the disciplines that have taken an interest in masks over time, as well as how, in what proportions, according to what concerns, with what developments, and possibly with what effects. They ask whether the multiplicity of disciplinary perspectives is likely to lead to the emergence and sharing of new concerns, especially environmental ones, or whether the balkanization and juxtaposition of perspectives may leave certain aspects in the dark and thus contribute to the persistent production of a kind of ignorance. Based on a bibliometric and textometric study of more than 6000...
Importance: Retractions are rising in the scientific literature, increasing the risk of reusing unreliable results. Objectives: To identify reports of systematic reviews that included retracted studies in their meta-analyses, and to assess the impact of these retracted studies on the results. Design, Setting, and Participants: In this systematic review and meta-analysis, the Feet of Clay Detector tool was searched to identify all systematic reviews that reported at least 1 meta-analysis including at least 1 retracted study and were published in the 25 highest impact factor journals in medicine, general and internal, from January 2013 to April 2024. All effect...
Research articles disseminate the knowledge produced by the scientific community. Access to this literature is crucial for researchers and the general public. Apparently, “bibliogifts” are available online for free from text‐sharing platforms. However, little is known about such platforms. What is the size of the underlying digital libraries? What are the topics covered? Where do these documents originally come from? This article reports on a study of the Library Genesis platform (LibGen). The 25 million documents (42 terabytes) it hosts and distributes...