Nicola Soranzo

ORCID: 0000-0003-3627-5340
Research Areas
  • Scientific Computing and Data Management
  • Genetics, Bioinformatics, and Biomedical Research
  • Bioinformatics and Genomic Networks
  • Research Data Management Practices
  • Genomics and Phylogenetic Studies
  • Machine Learning in Bioinformatics
  • Gene Regulatory Network Analysis
  • Cancer Genomics and Diagnostics
  • Gene expression and cancer classification
  • Microbial Metabolic Engineering and Bioproduction
  • Single-cell and spatial transcriptomics
  • Protist diversity and phylogeny
  • CRISPR and Genetic Engineering
  • Distributed and Parallel Computing Systems
  • Cell Image Analysis Techniques
  • Molecular Biology Techniques and Applications
  • Genetic Mapping and Diversity in Plants and Animals
  • Advanced Data Storage Technologies
  • Microbial Community Ecology and Physiology
  • Computational Drug Discovery Methods
  • Fungal and yeast genetics research
  • Cancer-related molecular mechanisms research
  • Biomedical Text Mining and Ontologies
  • Extracellular vesicles in disease
  • Computational Physics and Python Applications

Earlham Institute
2016-2025

Norwich Research Park
2016-2024

Center for Advanced Studies Research and Development in Sardinia
2010-2014

Environment Park
2014

University of Cagliari
2014

Scuola Internazionale Superiore di Studi Avanzati
2007-2011

Virginia Tech
2011

Max Planck Institute for Informatics
2011

Istituto Universitario di Studi Superiori di Pavia
2007

Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be...

10.1093/nar/gky379 article EN cc-by Nucleic Acids Research 2018-05-03

High-throughput data production technologies, particularly 'next-generation' DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address...

10.1093/nar/gkw343 article EN cc-by Nucleic Acids Research 2016-05-02

Enis Afgan, Anton Nekrutenko, Björn Grüning, Daniel Blankenberg, Jeremy Goecks, Michael C. Schatz, Alexander Ostrovsky, Alexandru Mahmoud, Andrew Lonie, Anna Syme, Anne Fouilloux, Anthony Bretaudeau, Anton Nekrutenko, Anup Kumar, Arthur C. Eschenlauer, Assunta D DeSanto, Aysam Guerler, Beatriz Serrano‐Solano, Bérénice Batut, Björn Grüning, Bradley W. Langhorst, Bridget Carr, Bryan Raubenolt, Cameron Hyde, Catherine J. Bromhead, Christopher B. Barnett, Coline Royaux, Cristóbal Gallardo, Daniel Blankenberg, Daniel Fornika, Dannon Baker, Dave Bouvier, Dave Clements, David Anderson de Lima Morais, David López Tabernero, Delphine Larivière, Engy Nasr, Enis Afgan, Federico Zambelli, Florian Heyl, Fotis Psomopoulos, Frederik Coppens, Gareth Price, Gianmauro Cuccuru, Gildas Le Corguillé, Greg Von Kuster, Gulsum Gudukbay Akbulut, Helena Rasche, Hans-Rudolf Hotz, Ignacio Eguinoa, Igor V. Makunin, Isuru Ranawaka, James Taylor, Jayadev Joshi, Jennifer Hillman‐Jackson, Jeremy Goecks, John Chilton, Kaivan Kamali, Keith Suderman, Krzysztof Poterlowicz, Le Bras Yvan, Lucille Lopez‐Delisle, Luke Sargent, Madeline E. Bassetti, M. A. Tangaro, Marius van den Beek, Martin Čech, Matthias Bernt, Matthias Fahrner, Mehmet Tekman, Melanie Christine Föll, Michael C. Schatz, Michael R. Crusoe, Miguel Roncoroni, Natalie Kucher, Nate Coraor, Nicholas Stoler, Nick Rhodes, Nicola Soranzo, Niko Pinter, Nuwan Goonasekera, Pablo Moreno, Pavankumar Videm, Mélanie Pétéra, Pietro Mandreoli, Pratik Jagtap, Qiang Gu, Ralf J. M. Weber, Ross Lazarus, Ruben H.P. Vorderman, Saskia Hiltemann, Sergey Golitsynskiy, Shilpa Garg, Simon Bray, Simon Gladman, Simone Leo, Subina Mehta, Timothy J. Griffin, Vahid Jalili, Yves Vandenbrouck

Galaxy is a mature, browser accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow...

10.1093/nar/gkac247 article EN cc-by Nucleic Acids Research 2022-04-14

The NCBI BLAST suite has become ubiquitous in modern molecular biology and is used for small tasks such as checking capillary sequencing results of single PCR products, genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy. Appropriate datatypes were defined as needed. The integration has the goal of making...

10.1186/s13742-015-0080-7 article EN cc-by GigaScience 2015-08-25

The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved...

10.1016/j.cels.2018.05.012 article EN publisher-specific-oa Cell Systems 2018-06-01

There is an ongoing explosion of scientific datasets being generated, brought on by recent technological advances in many areas of the natural sciences. As a result, the life sciences have become increasingly computational in nature, and bioinformatics has taken on a central role in research studies. However, basic computational skills, data analysis, and stewardship are still rarely taught in life science educational programs, resulting in a skills gap in many of the researchers tasked with analysing these big datasets. In order to address this skills gap and empower...

10.1371/journal.pcbi.1010752 article EN public-domain PLoS Computational Biology 2023-01-09

Summary: End-to-end next-generation sequencing microbiology data analysis requires a diversity of tools covering bacterial resequencing, de novo assembly, scaffolding, bacterial RNA-Seq, gene annotation and metagenomics. However, the construction of computational pipelines that use different software packages is difficult owing to a lack of interoperability, reproducibility and transparency. To overcome these limitations we present Orione, a Galaxy-based framework consisting of publicly available research...

10.1093/bioinformatics/btu135 article EN cc-by-nc Bioinformatics 2014-03-10

Obesity is linked to type 2 diabetes (T2D) and cardiovascular diseases; however, the underlying molecular mechanisms remain unclear. We aimed to identify obesity-associated molecular features that may contribute to obesity-related diseases. Using circulating monocytes from 1,264 Multi-Ethnic Study of Atherosclerosis (MESA) participants, we quantified the transcriptome and epigenome. We discovered that alterations in a network of coexpressed cholesterol metabolism genes are a signature feature of obesity and inflammatory stress. This...

10.2337/db14-1314 article EN Diabetes 2015-07-07

Background: Amplicon sequencing is an established and cost-efficient method for profiling microbiomes. However, many available tools to process this data require both bioinformatics skills and high computational power to process big datasets. Furthermore, there are only few tools that allow for long read amplicon data analysis. To bridge this gap, we developed the LotuS2 (less OTU scripts 2) pipeline, enabling user-friendly, resource friendly, and versatile analysis of raw amplicon sequences. Results: In LotuS2, six different...

10.1186/s40168-022-01365-1 article EN cc-by Microbiome 2022-10-19

Recent trends within computational and data sciences show an increasing recognition and adoption of computational workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, computational workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with computational workflows across disciplines...

10.1038/s41597-025-04451-9 article EN cc-by-nc-nd Scientific Data 2025-02-24

Inferring a gene regulatory network exclusively from microarray expression profiles is a difficult but important task. The aim of this work is to compare the predictive power of some of the most popular algorithms in different conditions (like data taken at equilibrium or time courses) and on both synthetic and real microarray data. We are in particular interested in comparing similarity measures both of linear type (like correlations and partial correlations) and of non-linear type (mutual information and conditional mutual information), and in investigating...

10.1093/bioinformatics/btm163 article EN cc-by-nc Bioinformatics 2007-05-07

Background: Reverse-engineering gene networks from expression profiles is a difficult problem for which a multitude of techniques have been developed over the last decade. The yearly organized DREAM challenges allow for a fair evaluation and unbiased comparison of these methods. Results: We propose an inference algorithm that combines confidence matrices, computed as the standard scores from single-gene knockout data, with the down-ranking of feed-forward edges. Substantial improvements on the predictions can be obtained...

10.1371/journal.pone.0012912 article EN cc-by PLoS ONE 2010-10-11

10.1007/978-1-0716-1307-8_20 article EN cc-by Methods in molecular biology 2021-01-01

Transcriptomic studies hold great potential towards understanding the human aging process. Previous transcriptomic studies have identified many genes with age-associated expression levels; however, small sample sizes and mixed cell types often make these results difficult to interpret. Using transcriptomic profiles in CD14+ monocytes from 1,264 participants of the Multi-Ethnic Study of Atherosclerosis (aged 55–94 years), we identified 2,704 genes differentially expressed with chronological age (false discovery rate, FDR ≤ 0.001). We further...

10.1186/s12864-015-1522-4 article EN cc-by BMC Genomics 2015-04-21

Björn Grüning, Ryan Dale, Andreas Sjödin, Brad Chapman, Jillian Rowe, Christopher H. Tomkins-Tinch, Renan Valieris, Adam Caprez, Bérénice Batut, Mathias Haudgaard, Thomas Cokelaer, Kyle A. Beauchamp, Brent S. Pedersen, Youri Hoogstrate, Anthony Bretaudeau, Devon Ryan, Gildas Le Corguillé, Dilmurat Yusuf, Sebastián Luna-Valero, Rory Kirchner, Karel Břinda, Thomas Wollmann, Martin Raden, Simon J. van Heeringen, Nicola Soranzo, Lorena Pantano, Zachary Charlop–Powers, Per Unneberg, Matthias De Smet, Marcel Martin, Greg Von Kuster, Tiago Antão, Milad Miladi, Kevin Thornton, Christian Brueffer, Marius van den Beek, Daniel Maticzka, Clemens Blank, Sebastian Will, Kévin Gravouil, Joachim Wolff, Manuel Holtgrewe, Jörg Fallmann, Vitor C. Piro, Ilya Shlyakhter, Ayman Yousif, Philip Mabon, Xiao‐Ou Zhang, Wei Shen, Jennifer Cabral, Cristel G. Thomas, Eric Enns, Joseph Brown, Jorrit Boekel, Mattias de Hollander, Jerome Kelleher, Nitesh Turaga, Julian R. de Ruiter, Dave Bouvier, Simon Gladman, Saket Choudhary, Nicholas Harding, Florian Eggenhofer, Arne Kratz, Zhuoqing Fang, Robert Kleinkauf, Henning Timm, Peter Cock, Enrico Seiler, Colin Brislawn, Thi Hong Hai Nguyen, Endre Bakken Stovner, Philip Ewels, Matt Chambers, James E. Johnson, Emil Hägglund, Simon Ye, Roman Valls Guimerà, Elmar Pruesse, Walter Dunn, Lance Parsons, Rob Patro, David Koppstein, Elena Grassi, Inken Wohlers, Alex Reynolds, MacIntosh Cornwell, Nicholas Stoler, Daniel Blankenberg, He Guowei, Marcel Bargull, Alexander Junge, Rick Farouni, Mallory Freeberg, Sourav Singh, Daniel Bogema, Fabio Cumbo, Liang-Bo Wang, David E. Larson, Matthew L. Workentine

We present Bioconda (https://bioconda.github.io), a distribution of bioinformatics software for the lightweight, multiplatform and language-agnostic package manager Conda. Currently, Bioconda offers a collection of over 3000 software packages, which is continuously maintained, updated, and extended by a growing global community of more than 200 contributors. Bioconda improves analysis reproducibility by allowing users to define isolated environments with defined software versions, all of which are easily installed and managed without...

10.1101/207092 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2017-10-21

The authors use ideas from graph theory in order to determine how distant a given biological network is from being monotone. On the signed graph representing the system, the minimal number of sign inconsistencies (i.e. the distance to monotonicity) is shown to be equal to the minimal number of fundamental cycles having negative sign. Suitable operations aiming at computing such a number are also proposed and shown to outperform all algorithms that are so far existing for this task. [Includes supplementary material]

10.1049/iet-syb.2009.0040 article EN IET Systems Biology 2010-05-13

There are thousands of well-maintained high-quality open-source software utilities for all aspects of scientific data analysis. For more than a decade, the Galaxy Project has been providing computational infrastructure and a unified user interface for these tools to make them accessible to a wide range of researchers. To streamline the process of integrating tools and constructing workflows as much as possible, we have developed Planemo, a software development kit for tool and workflow developers and Galaxy power users. Here we outline Planemo's implementation...

10.1101/gr.276963.122 article EN cc-by-nc Genome Research 2023-02-01

Summary: SysGenSIM is a software package to simulate Systems Genetics (SG) experiments in model organisms, for the purpose of evaluating and comparing statistical and computational methods and their implementations for analyses of SG data [e.g. methods for expression quantitative trait loci (eQTL) mapping and network inference]. SysGenSIM allows the user to select a variety of network topologies, genetic and kinetic parameters to simulate SG data (genotyping, gene expression and phenotyping) with large gene networks with thousands of nodes. The software is encoded in MATLAB, and a user-friendly graphical...

10.1093/bioinformatics/btr407 article EN Bioinformatics 2011-07-06

In the past years devising methods for discovering gene regulatory mechanisms at a genome-wide level has become a fundamental topic in the field of systems biology. The aim is to infer gene-gene interactions in an increasingly sophisticated and reliable way through the continuous improvement of reverse engineering algorithms exploiting microarray data. This work is inspired by several studies suggesting that coexpression is mostly related to 'static' stable binding relationships, like belonging to the same protein complex,...

10.1093/bioinformatics/btn220 article EN Bioinformatics 2008-05-08

The scale and diversity of available software options in the Galaxy ecosystem can make domain or community specific discovery challenging. Here, we present a semi-automated reusable pipeline for creating tailored interactive tables that list identity metadata (e.g. bio.tools, EDAM) for the tools of a specific community (e.g. microGalaxy, imaging). In addition, we also describe an annotation framework to improve the quality of the table contents, and training material to support the reuse of both by additional communities. The sum of these contributions is...

10.37044/osf.io/qjbxc preprint EN 2024-04-02

ERNEST Reaction Network Equilibria Study Toolbox is a MATLAB package which, by checking various different criteria on the structure of a chemical reaction network, can exclude the multistationarity of the corresponding reaction system. The results obtained are independent of the rate constants of the reactions, and can be used for model discrimination. The software, implemented in MATLAB, is available under the GNU GPL free software license from http://people.sissa.it/~altafini/papers/SoAl09/. It requires the MATLAB Optimization...

10.1093/bioinformatics/btp513 article EN Bioinformatics 2009-08-27

Background: The NCBI BLAST suite has become ubiquitous in modern molecular biology, used for small tasks like checking capillary sequencing results of single PCR products through to genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. Findings: The command line NCBI BLAST+ tool suite was wrapped for use within Galaxy, defining appropriate datatypes as needed, with...

10.1101/014043 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2015-01-21

Background: The vast ecosystem of single-cell RNA-sequencing tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets. Results: Here we outline several...

10.1093/gigascience/giaa102 article EN cc-by GigaScience 2020-10-01

Background: Bioinformaticians routinely use multiple software tools and data sources in their day-to-day work and have been guided in their choices by a number of cataloguing initiatives. The ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point, independent of any specific scientific scope within bioinformatics or technological implementation. Meanwhile, efforts to integrate bioinformatics software in workbench and workflow environments have accelerated to enable the design, automation, and reproducibility...

10.1093/gigascience/gix022 article EN cc-by GigaScience 2017-04-10