NFDI4DS | UHH-SEMS - Publication Details

Paolo Di Tommaso

ORCID: 0000-0003-3220-0253

About

Contact & Profiles

A5044258205

Research Areas

Genomics and Phylogenetic Studies
Scientific Computing and Data Management
Gene expression and cancer classification
Research Data Management Practices
RNA and protein synthesis mechanisms
Algorithms and Data Compression
Machine Learning in Bioinformatics
Distributed and Parallel Computing Systems
Genetic Mapping and Diversity in Plants and Animals
Genetics and Plant Breeding
Bioinformatics and Genomic Networks
Advanced Software Engineering Methodologies
Genetics, Bioinformatics, and Biomedical Research
Advanced Data Storage Technologies
Glycosylation and Glycoproteins Research
Genetic diversity and population structure
Parallel Computing and Optimization Techniques
Model-Driven Software Engineering Techniques
Advanced Proteomics Techniques and Applications
Data Visualization and Analytics
Analog and Mixed-Signal Circuit Design
Usability and User Interface Design
Neuroscience and Neural Engineering
Artificial Intelligence in Healthcare
Mycorrhizal Fungi and Plant Interactions

Centre for Genomic Regulation
2011-2020

Barcelona Institute for Science and Technology
2017-2020

Universitat Pompeu Fabra
2010-2018

Institut thématique Génétique, génomique et bioinformatique
2015

Barcelona Biomedical Research Park
2011

Universitat Autònoma de Barcelona
2011

Universitat de Lleida
2010

Sapienza University of Rome
2004

T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension

OPENALEX - Publications

Paolo Di Tommaso Sébastien Moretti Ioannis Xénarios Miquel Orobitg A. Montanyola and 3 more

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. The interface provides easy and intuitive access to the most popular functionality of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes that deliver high accuracy alignments while using structural or homology derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D-Structure,...

10.1093/nar/gkr245 article EN cc-by-nc Nucleic Acids Research 2011-05-09

TCS: A New Multiple Sequence Alignment Reliability Measure to Estimate Alignment Accuracy and Improve Phylogenetic Tree Reconstruction

Jia‐Ming Chang Paolo Di Tommaso Cédric Notredame

Multiple sequence alignment (MSA) is a key modeling procedure when analyzing biological sequences. Homology and evolutionary modeling are the most common applications of MSAs. Both are known to be sensitive to the underlying MSA accuracy. In this work, we show how this problem can be partly overcome using the transitive consistency score (TCS), an extended version of the T-Coffee scoring scheme. Using this local evaluation function, we show that one can identify the most reliable portions of an MSA, as judged from BAliBASE and PREFAB structure-based reference...

10.1093/molbev/msu117 article EN Molecular Biology and Evolution 2014-04-01

Accurate multiple sequence alignment of transmembrane proteins with PSI-Coffee

Jia‐Ming Chang Paolo Di Tommaso Jean-François Taly Cédric Notredame

Background: Transmembrane proteins (TMPs) constitute about 20~30% of all protein coding genes. The relative lack of experimental structure has so far made it hard to develop specific alignment methods, and the current state of the art (PRALINE™) only manages to recapitulate 50% of the positions in the reference alignments available from BAliBASE2-ref7. Methods: We show how homology extension can be adapted and combined with a consistency based approach in order to significantly improve the multiple sequence alignment of alpha-helical...

10.1186/1471-2105-13-s4-s1 article EN cc-by BMC Bioinformatics 2012-03-28

AMPA: an automated web server for prediction of protein antimicrobial regions

Marc Torrent Paolo Di Tommaso David Pulido M. Victòria Nogués Cédric Notredame and 2 more

Summary: AMPA is a web application for assessing the antimicrobial domains of proteins, with a focus on the design of new drugs. The application provides fast discovery of patterns in proteins that can be used to develop peptide-based drugs against pathogens. Results are shown in a user-friendly graphical interface and can be downloaded as raw data for later examination. Availability: AMPA is freely available at http://tcoffee.crg.cat/apps/ampa. The source code is also available on the web. Contact: marc.torrent@upf.edu; david.andreu@upf.edu...

10.1093/bioinformatics/btr604 article EN Bioinformatics 2011-11-03

The impact of Docker containers on the performance of genomic pipelines

Paolo Di Tommaso Emilio Palumbo Maria Chatzou Pablo Prieto Michael Heuer and 1 more

Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary, thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use...

10.7717/peerj.1273 article EN cc-by PeerJ 2015-09-24

Empowering bioinformatics communities with Nextflow and nf-core

Björn E. Langer Andreia J. Amaral Marie-Odile Baudement Franziska Bonath Mathieu Charles and 38 more

Standardised analysis pipelines are an important part of FAIR bioinformatics research. Over the last decade, there has been a notable shift from point-and-click pipeline solutions such as Galaxy towards command-line solutions such as Nextflow and Snakemake. We report on recent developments in the nf-core and Nextflow frameworks that have led to widespread adoption across many scientific communities. We describe how adopting nf-core standards enables faster development, improved interoperability, and collaboration with the >8,000...

10.1101/2024.05.10.592912 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-05-14

Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures

Jean-François Taly Cedrik Magis Giovanni Bussotti Jia‐Ming Chang Paolo Di Tommaso and 4 more

10.1038/nprot.2011.393 article EN Nature Protocols 2011-10-06

T-Coffee: Tree-Based Consistency Objective Function for Alignment Evaluation

Cedrik Magis Jean-François Taly Giovanni Bussotti Jia‐Ming Chang Paolo Di Tommaso and 3 more

10.1007/978-1-62703-646-7_7 article EN Methods in molecular biology 2013-08-23

PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases

Evan Floden Paolo Di Tommaso Maria Chatzou Cedrik Magis Cédric Notredame and 1 more

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This also gives the possibility to use transmembrane protein (TMP) reference databases for even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs...

10.1093/nar/gkw300 article EN cc-by-nc Nucleic Acids Research 2016-04-22

TCS: a web server for multiple sequence alignment evaluation and phylogenetic reconstruction

Jia‐Ming Chang Paolo Di Tommaso Vincent Lefort Olivier Gascuel Cédric Notredame

This article introduces the Transitive Consistency Score (TCS) web server, a service making it possible to estimate the local reliability of protein multiple sequence alignments (MSAs) using the TCS index. The evaluation can be used to identify the aligned positions most likely to contain structurally analogous residues and also most likely to support an accurate phylogenetic reconstruction. The TCS scoring scheme has been shown to be an accurate predictor of structural alignment correctness among commonly used methods. It has also been shown to outperform common filtering...

10.1093/nar/gkv310 article EN cc-by Nucleic Acids Research 2015-04-08

nf-core: Community curated bioinformatics pipelines

Philip Ewels Alexander Peltzer Sven Fillinger Johannes Alneberg Harshil Patel and 4 more

The standardization, portability, and reproducibility of analysis pipelines is a renowned problem within the bioinformatics community. Most pipelines are designed for execution on-premise, and the associated software dependencies are tightly coupled with the local compute environment. This leads to poor pipeline portability and reproducibility of the ensuing results - both of which are fundamental requirements for the validation of scientific findings. Here, we introduce nf-core: a framework that provides a community-driven, peer-reviewed platform...

10.1101/610741 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2019-04-16

Scalable Workflows and Reproducible Data Analysis for Genomics

Francesco Strozzi Roel Janssen Ricardo Wurmus Michael R. Crusoe George Githinji and 6 more

Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to large volumes, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems and how these follow different approaches to tackle execution...

10.1007/978-1-4939-9074-0_24 article EN cc-by Methods in molecular biology 2019-01-01

Workflows Community Summit: Bringing the Scientific Workflows Community Together

Rafael Ferreira da Silva Henri Casanova Kyle Chard Dan Laney Dong H. Ahn and 40 more

Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using some software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs)...

10.48550/arxiv.2103.09181 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

An nf-core framework for the systematic comparison of alternative modeling tools: the multiple sequence alignment case study

Luisa Santus Jose Espinosa‐Carrasco Leon Rauschning Júlia Mir-Pedrol Igor Trujnara and 11 more

The computational complexity of many key bioinformatics problems has resulted in numerous alternative heuristic solutions, where no single approach consistently outperforms all others. This creates difficulties for users trying to identify the most suitable tool for their dataset and for developers managing and evaluating methods. As data volumes grow, deploying these methods becomes increasingly difficult, highlighting the need for standardized frameworks for seamless deployment and comparison in HPC environments....

10.1101/2025.03.14.642603 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-03-17

Large multiple sequence alignments with a root-to-leaf regressive method

Edgar Garriga Nogales Paolo Di Tommaso Cedrik Magis Ionas Erb Leila Mansouri and 5 more

10.1038/s41587-019-0333-6 article EN Nature Biotechnology 2019-12-01

Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud

Paolo Di Tommaso Miquel Orobitg Fernando Guirado Fernando Cores Antonio Espinosa and 1 more

Summary: We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. Availability: T-Coffee is a freeware open source package available from http://www.tcoffee.org/homepage.html Contact: cedric.notredame@crg.es

10.1093/bioinformatics/btq304 article EN Bioinformatics 2010-07-06

Generalized Bootstrap Supports for Phylogenetic Analyses of Protein Sequences Incorporating Alignment Uncertainty

Maria Chatzou Evan Floden Paolo Di Tommaso Olivier Gascuel Cédric Notredame

Phylogenetic reconstructions are essential in genomics data analyses and depend on accurate multiple sequence alignment (MSA) models. We show that all currently available large-scale progressive methods are numerically unstable when dealing with amino-acid sequences. They produce significantly different output when changing the input order. We used the HOMFAM protein sequences dataset to show that on datasets larger than 100 sequences, this instability affects on average 21.5% of the aligned residues. The resulting Maximum...

10.1093/sysbio/syx096 article EN Systematic Biology 2018-03-20

A novel tool for highly scalable computational pipelines

Paolo Di Tommaso Maria Chatzou Pablo Prieto Baraja Cédric Notredame

Nextflow is a data-driven framework for computational pipelines that simplifies writing parallel and scalable pipelines in a portable manner.

10.6084/m9.figshare.1254958.v2 article EN 2014-12-01

Incorporating alignment uncertainty into Felsenstein’s phylogenetic bootstrap to improve its reliability

Jia‐Ming Chang Evan Floden Javier Herrero Olivier Gascuel Paolo Di Tommaso and 1 more

Most evolutionary analyses are based on a pre-estimated multiple sequence alignment. Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing phylogenies. They were able to show that in many cases different aligners produce different phylogenies, with no simple objective criterion sufficient to distinguish among these alternatives. We demonstrate that incorporating MSA uncertainty into bootstrap sampling can significantly increase the correlation between clade correctness and its...

10.1093/bioinformatics/btz082 article EN cc-by Bioinformatics 2019-02-05

nf-core/rnaseq: nf-core/rnaseq v3.0 - Silver Shark

Harshil Patel Phil Ewels Alexander Peltzer Rickard Hammarén Olga Botvinnik and 25 more

10.5281/zenodo.4323183 article EN 2020-12-15

Workflows Community Summit 2022: A Roadmap Revolution

Rafael Ferreira da Silva Rosa M. Badía Venkat Bala Debbie Bard Peer‐Timo Bremer and 95 more

Scientific workflows have become integral tools in broad scientific computing use cases. Science discovery is increasingly dependent on workflows to orchestrate large and complex experiments that range from the execution of a cloud-based data preprocessing pipeline to multi-facility instrument-to-edge-to-HPC computational workflows. Given the changing landscape of scientific computing and the evolving needs of emerging applications, it is paramount that the development of novel workflows and system functionalities seek to increase the efficiency, resilience, and pervasiveness...

10.48550/arxiv.2304.00019 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Definition of visual processes in a language for expressing transitions

Paolo Bottoni Maria De Marsico Paolo Di Tommaso S. Levialdi Domenico Ventriglia

10.1016/j.jvlc.2004.01.002 article EN Journal of Visual Languages & Computing 2004-05-13
