- Scientific Computing and Data Management
- Genetics, Bioinformatics, and Biomedical Research
- Bioinformatics and Genomic Networks
- Research Data Management Practices
- Genomics and Phylogenetic Studies
- Machine Learning in Bioinformatics
- Gene Regulatory Network Analysis
- Cancer Genomics and Diagnostics
- Gene expression and cancer classification
- Microbial Metabolic Engineering and Bioproduction
- Single-cell and spatial transcriptomics
- Protist diversity and phylogeny
- CRISPR and Genetic Engineering
- Distributed and Parallel Computing Systems
- Cell Image Analysis Techniques
- Molecular Biology Techniques and Applications
- Genetic Mapping and Diversity in Plants and Animals
- Advanced Data Storage Technologies
- Microbial Community Ecology and Physiology
- Computational Drug Discovery Methods
- Fungal and yeast genetics research
- Cancer-related molecular mechanisms research
- Biomedical Text Mining and Ontologies
- Extracellular vesicles in disease
- Computational Physics and Python Applications
Earlham Institute
2016-2025
Norwich Research Park
2016-2024
Center for Advanced Studies Research and Development in Sardinia
2010-2014
Environment Park
2014
University of Cagliari
2014
Scuola Internazionale Superiore di Studi Avanzati
2007-2011
Virginia Tech
2011
Max Planck Institute for Informatics
2011
Istituto Universitario di Studi Superiori di Pavia
2007
Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be...
High-throughput data production technologies, particularly 'next-generation' DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in the life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address...
Galaxy is a mature, browser-accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow...
The NCBI BLAST suite has become ubiquitous in modern molecular biology and is used for small tasks such as checking capillary sequencing results of single PCR products, genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy. Appropriate datatypes were defined as needed. The integration has the goal of making...
The primary problem with the explosion of biomedical datasets is not the data, the computational resources, or the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires the development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in the life sciences and facilitates the development of training materials. The key feature of our system is that it is not static but continuously improved...
There is an ongoing explosion of scientific datasets being generated, brought on by recent technological advances in many areas of the natural sciences. As a result, the life sciences have become increasingly computational in nature, and bioinformatics has taken on a central role in research studies. However, basic skills, data analysis, and data stewardship are still rarely taught in life science educational programs, resulting in a skills gap among researchers tasked with analysing these big datasets. In order to address this skills gap and empower...
Summary: End-to-end next-generation sequencing microbiology data analysis requires a diversity of tools covering bacterial resequencing, de novo assembly, scaffolding, RNA-Seq, gene annotation and metagenomics. However, the construction of computational pipelines that use different software packages is difficult owing to a lack of interoperability, reproducibility and transparency. To overcome these limitations we present Orione, a Galaxy-based framework consisting of publicly available research...
Obesity is linked to type 2 diabetes (T2D) and cardiovascular diseases; however, the underlying molecular mechanisms remain unclear. We aimed to identify obesity-associated features that may contribute to obesity-related diseases. Using circulating monocytes from 1,264 Multi-Ethnic Study of Atherosclerosis (MESA) participants, we quantified the transcriptome and epigenome. We discovered that alterations in a network of coexpressed cholesterol metabolism genes are a signature feature of obesity and inflammatory stress. This...
Background: Amplicon sequencing is an established and cost-efficient method for profiling microbiomes. However, many available tools to process this data require both bioinformatics skills and high computational power to process big datasets. Furthermore, there are only a few tools that allow for long read amplicon data analysis. To bridge this gap, we developed the LotuS2 (less OTU scripts 2) pipeline, enabling user-friendly, resource-friendly, and versatile analysis of raw amplicon sequences. Results: In LotuS2, six different...
Recent trends within computational and data sciences show an increasing recognition and adoption of workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with workflows across disciplines...
Inferring a gene regulatory network exclusively from microarray expression profiles is a difficult but important task. The aim of this work is to compare the predictive power of some of the most popular algorithms in different conditions (like data taken at equilibrium or time courses) and on both synthetic and real data. We are in particular interested in comparing similarity measures of linear type (like correlations and partial correlations) and of non-linear type (mutual information and conditional mutual information), and in investigating...
Background: Reverse-engineering gene networks from expression profiles is a difficult problem for which a multitude of techniques have been developed over the last decade. The yearly organized DREAM challenges allow for a fair evaluation and unbiased comparison of these methods. Results: We propose an inference algorithm that combines confidence matrices, computed as the standard scores from single-gene knockout data, with the down-ranking of feed-forward edges. Substantial improvements on the predictions can be obtained...
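As a rough illustration of what "standard scores from single-gene knockout data" can look like in practice, the following minimal sketch computes a z-score confidence matrix in pure Python. The function name `knockout_zscores`, the data layout (row i holds expression after knocking out gene i), and the toy numbers are assumptions made for this example; this is not the authors' implementation, and the down-ranking of feed-forward edges is not shown.

```python
from statistics import mean, pstdev

def knockout_zscores(ko):
    """Return an n x n matrix of |z|-scores.

    ko[i][j] is the expression of gene j after knocking out gene i
    (hypothetical layout chosen for this sketch). A large z[i][j]
    suggests that knocking out gene i perturbs gene j, i.e. a
    candidate edge i -> j.
    """
    n = len(ko)
    z = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Response of gene j across all knockouts except its own.
        column = [ko[i][j] for i in range(n) if i != j]
        mu, sd = mean(column), pstdev(column)
        if sd == 0:
            continue  # gene j never changes: no evidence for any edge
        for i in range(n):
            if i != j:
                z[i][j] = abs(ko[i][j] - mu) / sd
    return z

# Toy data (made up): only the knockout of gene 0 changes gene 1.
ko = [
    [0.0, 0.1, 0.5, 0.5, 0.5],
    [1.0, 0.0, 0.5, 0.5, 0.5],
    [1.0, 0.9, 0.0, 0.5, 0.5],
    [1.0, 0.9, 0.5, 0.0, 0.5],
    [1.0, 0.9, 0.5, 0.5, 0.0],
]
scores = knockout_zscores(ko)

# Rank candidate edges by confidence, highest first.
ranked = sorted(
    ((scores[i][j], i, j) for i in range(5) for j in range(5) if i != j),
    reverse=True,
)
top_score, src, dst = ranked[0]
print(f"top candidate edge: {src} -> {dst}")
```

In this toy example the highest-ranked candidate is the edge from gene 0 to gene 1, because gene 1 deviates from its typical level only when gene 0 is knocked out.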
Transcriptomic studies hold great potential towards understanding the human aging process. Previous transcriptomic studies have identified many genes with age-associated expression levels; however, small sample sizes and mixed cell types often make these results difficult to interpret. Using transcriptomic profiles in CD14+ monocytes from 1,264 participants of the Multi-Ethnic Study of Atherosclerosis (aged 55–94 years), we identified 2,704 genes differentially expressed with chronological age (false discovery rate, FDR ≤ 0.001). We further...
We present Bioconda (https://bioconda.github.io), a distribution of bioinformatics software for the lightweight, multiplatform and language-agnostic package manager Conda. Currently, Bioconda offers a collection of over 3000 software packages, which is continuously maintained, updated, and extended by a growing global community of more than 200 contributors. Bioconda improves analysis reproducibility by allowing users to define isolated environments with defined software versions, all of which are easily installed and managed without...
The authors use ideas from graph theory in order to determine how distant a given biological network is from being monotone. On the signed graph representing the system, the minimal number of sign inconsistencies (i.e. the distance to monotonicity) is shown to be equal to the minimal number of fundamental cycles having negative sign. Suitable operations aiming at computing such a number are also proposed and are shown to outperform all algorithms that are so far existing for this task. [Includes supplementary material]
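To make the link between sign inconsistencies and negative fundamental cycles more concrete, here is a minimal sketch in Python. It labels the nodes of a signed undirected graph along one breadth-first spanning tree and counts the edges that contradict that labeling; each such edge closes a fundamental cycle of negative sign with respect to that tree. The function name `sign_inconsistencies` and the edge-list format are assumptions for illustration. The count is exact only when it is zero (the graph is sign-consistent); otherwise it is an upper bound for one particular tree, not the minimum that the paper's algorithms aim to compute.

```python
from collections import deque

def sign_inconsistencies(n, edges):
    """Count edges that are inconsistent with a BFS node labeling.

    n     -- number of nodes, labeled 0..n-1
    edges -- list of (u, v, sign) with sign +1 (activation) or
             -1 (inhibition), treated as undirected (assumed format)
    """
    adj = [[] for _ in range(n)]
    for u, v, s in edges:
        adj[u].append((v, s))
        adj[v].append((u, s))

    label = [0] * n  # 0 = unvisited, otherwise +1 or -1
    for start in range(n):
        if label[start]:
            continue
        label[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, s in adj[u]:
                if label[v] == 0:
                    # Tree edge: choose the label that satisfies it.
                    label[v] = label[u] * s
                    queue.append(v)

    # Tree edges are consistent by construction, so every edge counted
    # here is a non-tree edge closing a negative fundamental cycle.
    return sum(1 for u, v, s in edges if label[u] * label[v] != s)

# Triangle whose cycle sign is positive: sign-consistent.
balanced = [(0, 1, +1), (1, 2, -1), (0, 2, -1)]
# Triangle whose cycle sign is negative: one inconsistency.
frustrated = [(0, 1, +1), (1, 2, +1), (0, 2, -1)]

print(sign_inconsistencies(3, balanced))
print(sign_inconsistencies(3, frustrated))
```

Different spanning trees can give different counts on larger graphs, which is why finding the minimum requires more than a single traversal.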
There are thousands of well-maintained high-quality open-source software utilities for all aspects of scientific data analysis. For more than a decade, the Galaxy Project has been providing computational infrastructure and a unified user interface for these tools to make them accessible to a wide range of researchers. To streamline the process of integrating tools and constructing workflows as much as possible, we have developed Planemo, a software development kit for tool and workflow developers and Galaxy power users. Here we outline Planemo's implementation...
Summary: SysGenSIM is a software package to simulate Systems Genetics (SG) experiments in model organisms, for the purpose of evaluating and comparing statistical and computational methods and their implementations for analyses of SG data [e.g. methods for expression quantitative trait loci (eQTL) mapping and network inference]. SysGenSIM allows the user to select a variety of network topologies and genetic and kinetic parameters to simulate SG data (genotyping, gene expression and phenotyping) with large gene networks of thousands of nodes. The software is encoded in MATLAB, and a user-friendly graphical...
In the past years devising methods for discovering gene regulatory mechanisms at a genome-wide level has become a fundamental topic in the field of systems biology. The aim is to infer gene-gene interactions in an increasingly sophisticated and reliable way through the continuous improvement of reverse engineering algorithms exploiting microarray data. This work is inspired by several studies suggesting that coexpression is mostly related to 'static' stable binding relationships, like belonging to the same protein complex,...
The scale and diversity of available software options in the Galaxy ecosystem can make domain or community specific discovery challenging. Here, we present a semi-automated and reusable pipeline for creating tailored and interactive tables that list the identity and metadata (e.g. bio.tools, EDAM) of tools for a specific community (e.g. microGalaxy, imaging). In addition, we also describe an annotation framework to improve the quality of the table contents, and training material to support the reuse of both by additional communities. The sum of these contributions is...
ERNEST Reaction Network Equilibria Study Toolbox is a MATLAB package which, by checking various different criteria on the structure of a chemical reaction network, can exclude the multistationarity of the corresponding reaction system. The results obtained are independent of the rate constants of the reactions, and can be used for model discrimination. The software, implemented in MATLAB, is available under the GNU GPL free software license from http://people.sissa.it/~altafini/papers/SoAl09/. It requires the Optimization...
Background: The NCBI BLAST suite has become ubiquitous in modern molecular biology, used for small tasks like checking capillary sequencing results of single PCR products through to genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. Findings: The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy, defining appropriate datatypes as needed, with...
Background: The vast ecosystem of single-cell RNA-sequencing tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets. Results: Here we outline several...
Background: Bioinformaticians routinely use multiple software tools and data sources in their day-to-day work and have been guided in their choices by a number of cataloguing initiatives. The ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point, independent of any specific scientific scope within bioinformatics or technological implementation. Meanwhile, efforts to integrate bioinformatics software in workbench and workflow environments have accelerated to enable the design, automation, and reproducibility...