- Scientific Computing and Data Management
- Research Data Management Practices
- Semantic Web and Ontologies
- Distributed and Parallel Computing Systems
- Data Quality and Management
- Biomedical Text Mining and Ontologies
- Software Engineering Research
- Natural Language Processing Techniques
- Software System Performance and Reliability
- Service-Oriented Architecture and Web Services
- Advanced Database Systems and Queries
- Business Process Modeling and Analysis
- Advanced Graph Neural Networks
- Topic Modeling
- Advanced Data Storage Technologies
- Genetic Associations and Epidemiology
- Web Data Mining and Analysis
- Simulation Techniques and Applications
- Species Distribution and Climate Change
- Cell Image Analysis Techniques
- Wikis in Education and Collaboration
- Manufacturing Process and Optimization
- Image Processing and 3D Reconstruction
- Bioinformatics and Genomic Networks
- Genetics and Neurodevelopmental Disorders
Universidad Politécnica de Madrid (2013-2025)
University of Southern California (2016-2025)
Southern California University for Professional Studies (2017-2023)
European Telecommunications Standards Institute (2023)
University of Manchester (2022)
University of Amsterdam (2022)
The University of Queensland (2022)
ZB MED - Information Centre for Life Sciences (2022)
Vlaams Instituut voor Biotechnologie (2022)
VIB-UGent Center for Plant Systems Biology (2022)
An increasing number of researchers support reproducibility by incorporating pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine-readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices...
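A minimal sketch of what a package's ro-crate-metadata.json descriptor can look like, written with plain Python against the RO-Crate 1.1 context; the dataset name and the results.csv file entry are illustrative, not taken from the paper:

```python
import json

# Minimal RO-Crate 1.1 descriptor: a JSON-LD graph with a metadata
# descriptor entity, a root Dataset, and one data file.
# The names and "results.csv" entry are illustrative placeholders.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "Example research object",
            "description": "Data and methods packaged with their metadata.",
            "hasPart": [{"@id": "results.csv"}],
        },
        {
            "@id": "results.csv",
            "@type": "File",
            "name": "Tabular results",
        },
    ],
}

with open("ro-crate-metadata.json", "w") as f:
    json.dump(crate, f, indent=2)
```

Because the annotations are plain Schema.org terms in JSON-LD, a crate remains readable both by humans and by generic linked-data tooling.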
Computational workflows describe the complex multi-step methods that are used for data collection, data preparation, analytics, predictive modelling, and simulation that lead to new data products. They can inherently contribute to the FAIR data principles: by processing data according to established metadata; by creating metadata themselves during the processing of data; and by tracking and recording data provenance. These properties aid data quality assessment and secondary data usage. Moreover, workflows are digital objects in their own right. This paper argues that FAIR principles need to address...
How easy is it to reproduce the results found in a typical computational biology paper? Either through experience or intuition the reader will already know that the answer is: with difficulty or not at all. In this paper we attempt to quantify this by reproducing a previously published paper for different classes of users (ranging from users with little expertise to domain experts) and suggest ways in which the situation might be improved. Quantification is achieved by estimating the time required to reproduce each of the steps of the method described in the original publication and make them part of an...
Scientific workflows are a popular mechanism for specifying and automating data-driven in silico experiments. A significant aspect of their value lies in their potential to be reused. Once shared, workflows become useful building blocks that can be combined or modified when developing new ones. However, previous studies have shown that storing workflow specifications alone is not sufficient to ensure they can be successfully reused, without being able to understand what the workflows aim to achieve or to re-enact them. To gain an understanding of a workflow, and how it...
Recent trends within computational and data sciences show an increasing recognition and adoption of workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with computational workflows across disciplines...
Automated Machine Learning (AutoML) systems are emerging that automatically search for possible solutions from a large space of possible kinds of models. Although fully automated machine learning is appropriate for many applications, users often have knowledge that supplements and constrains the available data and solutions. This paper proposes human-guided machine learning (HGML) as a hybrid approach where a user interacts with an AutoML system and tasks it to explore different problem settings that reflect the user's knowledge about the data available. We present:...
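The paper's HGML system itself is not reproduced here; as a generic illustration of the underlying idea, the sketch below restricts a model search to a user-constrained space using scikit-learn's GridSearchCV. The constraint values and the choice of model family are hypothetical:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# A fully automated search might sweep many model families and
# hyperparameter ranges; here the user's domain knowledge prunes the
# space (hypothetical constraint: shallow, interpretable forests only).
user_constraints = {"max_depth": [2, 3, 4], "n_estimators": [50, 100]}

X, y = load_breast_cancer(return_X_y=True)
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid=user_constraints, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The hybrid element is that the constrained space encodes user knowledge while the system still performs the search automatically within it.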
Major societal and environmental challenges involve complex systems that have diverse multi-scale interacting processes. Consider, for example, how droughts and water reserves affect crop production, and how agriculture and industrial needs affect water quality and availability. Preventive measures, such as delaying planting dates or adopting new agricultural practices in response to changing weather patterns, can reduce the damage caused by natural processes. Understanding how these natural and human processes affect one another allows forecasting the effects of...
Recording the provenance of scientific computation results is key to supporting traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object...
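As a rough illustration of the kind of run provenance such a profile captures, here is a sketch of a schema.org CreateAction entity linking an executed workflow to its inputs and outputs; all identifiers and timestamps are illustrative, and the actual profile defines additional terms:

```python
# Sketch of a provenance entity of the kind Workflow Run RO-Crate
# builds on: a schema.org CreateAction tying a workflow execution to
# the data it consumed and produced. All @id values are placeholders.
run_entity = {
    "@id": "#run-1",
    "@type": "CreateAction",
    "name": "Run of the analysis workflow",
    "instrument": {"@id": "workflow.cwl"},   # the executed workflow
    "object": [{"@id": "input.fastq"}],      # inputs consumed
    "result": [{"@id": "output.vcf"}],       # outputs produced
    "startTime": "2023-01-15T10:00:00Z",
    "endTime": "2023-01-15T10:42:00Z",
}
```

Because the entity lives inside an ordinary RO-Crate graph, the same package carries the workflow, its run record, and the referenced data files.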
In recent years, a variety of systems have been developed that export the workflows used to analyze data and make them part of published articles. We argue that the workflows in current approaches are dependent on the specific codes used for execution, the workflow system used, and the catalogs where they are published. In this paper, we describe a new approach that addresses these shortcomings and makes workflows more reusable through: 1) the use of abstract workflows to complement executable ones when the execution environment is different, 2) the publication of both using standards such as...
We describe the AEMET meteorological dataset, which makes available some data sources from the Agencia Estatal de Meteorología (AEMET, Spanish Meteorological Office) as Linked Data. The data selected for publication are generated every ten minutes by...
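A hedged sketch of querying such a Linked Data deployment with the SPARQLWrapper library; the endpoint URL and the query shape are assumptions for illustration, not taken from the paper:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint URL; adjust to the actual AEMET Linked Data
# deployment and its vocabulary.
sparql = SPARQLWrapper("http://aemet.linkeddata.es/sparql")
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?station ?label WHERE {
        ?station rdfs:label ?label .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)

# Each binding row maps variable names to value dictionaries.
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["station"]["value"], row["label"]["value"])
```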
This review summarizes the last decade of work by the ENIGMA (Enhancing NeuroImaging Genetics through Meta Analysis) Consortium, a global alliance of over 1,400 scientists across 43 countries, studying the human brain in health and disease. Building on large-scale genetic studies that discovered the first robustly replicated genetic loci associated with brain metrics, ENIGMA has diversified into over 50 working groups (WGs), pooling worldwide data and expertise to answer fundamental questions in neuroscience, psychiatry, neurology,...
The progress of science is tied to the standardization of measurements, instruments, and data. This is especially true in the Big Data age, where analyzing large data volumes critically hinges on the data being standardized. Accordingly, the lack of community-sanctioned data standards in paleoclimatology has largely precluded the benefits of Big Data advances in the field. Building upon recent efforts to standardize the format and terminology of paleoclimate data, this article describes the Paleoclimate Community reporTing Standard (PaCTS), a crowdsourced...
RDF-star has been proposed as an extension of RDF to make statements about statements. Libraries and graph stores have started adopting RDF-star, but the generation of RDF-star data remains largely unexplored. To allow generating RDF-star data from heterogeneous data sources, RML-star was proposed as an extension of RML. However, no system has been developed so far that implements the RML-star specification. In this work, we present Morph-KGCstar, which extends the Morph-KGC materialization engine to generate RDF-star datasets. We validate Morph-KGCstar by running test cases derived...
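A minimal sketch of driving the engine from Python, assuming the documented morph_kgc.materialize entry point; the mapping file path is a placeholder, and the exact entry point for RDF-star output may differ from plain materialization:

```python
import morph_kgc

# Illustrative config: "mapping.rml.ttl" is a placeholder for an
# RML (or RML-star) mapping describing how source data becomes RDF.
config = """
[DataSource1]
mappings: mapping.rml.ttl
"""

# materialize() applies the mapping rules and returns a graph of the
# generated triples; RDF-star output in Morph-KGCstar may require a
# different entry point or store.
graph = morph_kgc.materialize(config)
print(len(graph), "triples generated")
```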