Paolo Di Tommaso

ORCID: 0000-0003-3220-0253
Research Areas
  • Genomics and Phylogenetic Studies
  • Scientific Computing and Data Management
  • Gene expression and cancer classification
  • Research Data Management Practices
  • RNA and protein synthesis mechanisms
  • Algorithms and Data Compression
  • Machine Learning in Bioinformatics
  • Distributed and Parallel Computing Systems
  • Genetic Mapping and Diversity in Plants and Animals
  • Genetics and Plant Breeding
  • Bioinformatics and Genomic Networks
  • Advanced Software Engineering Methodologies
  • Genetics, Bioinformatics, and Biomedical Research
  • Advanced Data Storage Technologies
  • Glycosylation and Glycoproteins Research
  • Genetic diversity and population structure
  • Parallel Computing and Optimization Techniques
  • Model-Driven Software Engineering Techniques
  • Advanced Proteomics Techniques and Applications
  • Data Visualization and Analytics
  • Analog and Mixed-Signal Circuit Design
  • Usability and User Interface Design
  • Neuroscience and Neural Engineering
  • Artificial Intelligence in Healthcare
  • Mycorrhizal Fungi and Plant Interactions

Centre for Genomic Regulation
2011-2020

Barcelona Institute for Science and Technology
2017-2020

Universitat Pompeu Fabra
2010-2018

Institut thématique Génétique, génomique et bioinformatique
2015

Barcelona Biomedical Research Park
2011

Universitat Autònoma de Barcelona
2011

Universitat de Lleida
2010

Sapienza University of Rome
2004

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. The interface provides easy and intuitive access to the most popular functionality of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes that deliver high accuracy alignments while using structural or homology derived templates. The three available template modes are Expresso for proteins with a known 3D-Structure,...

10.1093/nar/gkr245 article EN cc-by-nc Nucleic Acids Research 2011-05-09

Multiple sequence alignment (MSA) is a key modeling procedure when analyzing biological sequences. Homology and evolutionary modeling are the most common applications of MSAs. Both are known to be sensitive to the underlying MSA accuracy. In this work, we show how this problem can be partly overcome using the transitive consistency score (TCS), an extended version of the T-Coffee scoring scheme. Using a local evaluation function, we show that one can identify reliable portions of an MSA, as judged from BAliBASE and PREFAB structure-based reference...

10.1093/molbev/msu117 article EN Molecular Biology and Evolution 2014-04-01

Background: Transmembrane proteins (TMPs) constitute about 20~30% of all protein coding genes. The relative lack of experimental structures has so far made it hard to develop specific alignment methods, and the current state of the art (PRALINE™) only manages to recapitulate 50% of the positions in the reference alignments available from BAliBASE2-ref7. Methods: We show how homology extension can be adapted and combined with a consistency based approach in order to significantly improve the multiple sequence alignment of alpha-helical...

10.1186/1471-2105-13-s4-s1 article EN cc-by BMC Bioinformatics 2012-03-28

Summary: AMPA is a web application for assessing the antimicrobial domains of proteins, with a focus on the design of new drugs. The application provides fast discovery of patterns in proteins that can be used to develop peptide-based drugs against pathogens. Results are shown in a user-friendly graphical interface and can be downloaded as raw data for later examination. Availability: AMPA is freely available at http://tcoffee.crg.cat/apps/ampa. The source code is also available on the web. Contact: marc.torrent@upf.edu; david.andreu@upf.edu...

10.1093/bioinformatics/btr604 article EN Bioinformatics 2011-11-03

Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary, thus raising serious deployment and reproducibility issues. Docker containers are emerging as a possible solution for many of these problems, as they allow the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus, the question that arises is to what extent the use...

10.7717/peerj.1273 article EN cc-by PeerJ 2015-09-24

Standardised analysis pipelines are an important part of FAIR bioinformatics research. Over the last decade, there has been a notable shift from point-and-click pipeline solutions such as Galaxy towards command-line solutions such as Nextflow and Snakemake. We report on recent developments in the nf-core and Nextflow frameworks that have led to widespread adoption across many scientific communities. We describe how adopting nf-core standards enables faster development, improved interoperability, and collaboration with the >8,000...

10.1101/2024.05.10.592912 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-05-14

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This also gives the possibility to use transmembrane protein (TMP) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs...

10.1093/nar/gkw300 article EN cc-by-nc Nucleic Acids Research 2016-04-22

This article introduces the Transitive Consistency Score (TCS) web server, a service making it possible to estimate the local reliability of protein multiple sequence alignments (MSAs) using the TCS index. The evaluation can be used to identify the aligned positions most likely to contain structurally analogous residues and also most likely to support an accurate phylogenetic reconstruction. The TCS scoring scheme has been shown to be an accurate predictor of structural alignment correctness among commonly used methods. It has also been shown to outperform common filtering...

10.1093/nar/gkv310 article EN cc-by Nucleic Acids Research 2015-04-08

The standardization, portability, and reproducibility of analysis pipelines is a renowned problem within the bioinformatics community. Most pipelines are designed for execution on-premise, and the associated software dependencies are tightly coupled with the local compute environment. This leads to poor pipeline portability and reproducibility of the ensuing results - both of which are fundamental requirements for the validation of scientific findings. Here, we introduce nf-core: a framework that provides a community-driven, peer-reviewed platform...

10.1101/610741 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2019-04-16

Biological, clinical, and pharmacological research now often involves analyses of genomes, transcriptomes, proteomes, and interactomes, within and between individuals and across species. Due to large volumes, the analysis and integration of data generated by such high-throughput technologies have become computationally intensive, and analysis can no longer happen on a typical desktop computer. In this chapter we show how to describe and execute the same analysis using a number of workflow systems, and how these follow different approaches to tackle execution...

10.1007/978-1-4939-9074-0_24 article EN cc-by Methods in molecular biology 2019-01-01

Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using some software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs)...

10.48550/arxiv.2103.09181 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

The computational complexity of many key bioinformatics problems has resulted in numerous alternative heuristic solutions, where no single approach consistently outperforms all others. This creates difficulties for users trying to identify the most suitable tool for their dataset and for developers managing and evaluating methods. As data volumes grow, deploying these methods becomes increasingly difficult, highlighting the need for standardized frameworks for seamless deployment and comparison in HPC environments....

10.1101/2025.03.14.642603 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-03-17

Summary: We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably effective. We also conclude that for a web server with moderate usage (10K hits/month) the cloud provides a cost-effective alternative to in-house deployment. Availability: The freeware, open source package is available from http://www.tcoffee.org/homepage.html Contact: cedric.notredame@crg.es

10.1093/bioinformatics/btq304 article EN Bioinformatics 2010-07-06

Phylogenetic reconstructions are essential in genomics data analyses and depend on accurate multiple sequence alignment (MSA) models. We show that all currently available large-scale progressive methods are numerically unstable when dealing with amino-acid sequences. They produce significantly different output when changing the input order. We used the HOMFAM protein sequences dataset to show that on datasets larger than 100 sequences, this instability affects on average 21.5% of the aligned residues. The resulting Maximum...

10.1093/sysbio/syx096 article EN Systematic Biology 2018-03-20

Nextflow is a data-driven framework for computational pipelines that simplifies writing parallel and scalable pipelines in a portable manner.

10.6084/m9.figshare.1254958.v2 article EN 2014-12-01

Most evolutionary analyses are based on a pre-estimated multiple sequence alignment. Wong et al. established the existence of an uncertainty induced by multiple sequence alignment when reconstructing phylogenies. They were able to show that in many cases different aligners produce different phylogenies, with no simple objective criterion sufficient to distinguish among these alternatives. We demonstrate that incorporating MSA-induced uncertainty into bootstrap sampling can significantly increase the correlation between clade correctness and its...

10.1093/bioinformatics/btz082 article EN cc-by Bioinformatics 2019-02-05
Rafael Ferreira da Silva Rosa M. Badía Venkat Bala Debbie Bard Peer‐Timo Bremer Ian K. Buckley Silvina Caíno‐Lores Kyle Chard Carole Goble Shantenu Jha Daniel S. Katz Daniel Laney Manish Parashar Frédéric Suter Nick Tyler Thomas D. Uram İlkay Altıntaş Stefan Andersson William Arndt Juan Pedro Aznar Jonathan Bader Bartosz Baliś Chris Blanton Kelly Rosa Braghetto Aharon Brodutch Paul Brunk Henri Casanova Alba Cervera Lierta Justin Chigu Tainã Coleman Nick Collier Iacopo Colonnelli Frederik Coppens Michael R. Crusoe W. S. Cunningham Bruno de Paula Kinoshita Paolo Di Tommaso Charles Doutriaux Matthew T. Downton Wael Elwasif Bjoern Enders Christopher Erdmann Thomas Fahringer Ludmilla Figueiredo Rosa Filgueira Martin Foltín Anne Fouilloux Luiz Gadelha Andy Gallo Artur Garcia Saez Daniel Garijo Roman G. Gerlach Ryan E. Grant Samuel Grayson Patricia Grubel Johan E. Gustafsson Valérie Hayot‐Sasson Óscar Hernández Marcus Hilbrich Annmary Justine I. Laflotte Fabian Lehmann André Luckow Jakob Luettgau Ketan Maheshwari Motohiko Matsuda Doriana Medić Peter Mendygral Marek T. Michalewicz Jorji Nonaka Maciej Pawlik Loïc Pottier Line Pouchard Mathias Pütz Santosh Kumar Radha Lavanya Ramakrishnan Sasko Ristov Paul Romano Daniel Rosendo Martin Ruefenacht Katarzyna Rycerz Nishant Saurabh V. Savchenko Martin Schulz Christine M. Simpson Raúl Sirvent Tyler J. Skluzacek Stian Soiland‐Reyes Renan P. Souza Sreenivas R. Sukumar Ziheng Sun Alan Sussman Douglas Thain Mikhail Titov Benjamín Tovar Aalap Tripathy Matteo Turilli Bartosz Tużnik Hubertus J. J. van Dam Aurelio Vivas

Scientific workflows have become integral tools in broad scientific computing use cases. Science discovery is increasingly dependent on workflows to orchestrate large and complex scientific experiments that range from the execution of a cloud-based data preprocessing pipeline to multi-facility instrument-to-edge-to-HPC computational workflows. Given the changing landscape of scientific computing and the evolving needs of emerging scientific applications, it is paramount that the development of novel scientific workflows and system functionalities seek to increase the efficiency, resilience, and pervasiveness...

10.48550/arxiv.2304.00019 preprint EN cc-by arXiv (Cornell University) 2023-01-01