NFDI4DS | UHH-SEMS - Publication Details

Gabriele Corso

ORCID: 0000-0002-1963-8755

About

Contact & Profiles

A5057871229

Research Areas

Protein Structure and Dynamics
Bioinformatics and Genomic Networks
Machine Learning in Bioinformatics
Computational Drug Discovery Methods
Advanced Graph Neural Networks
Machine Learning in Materials Science
Generative Adversarial Networks and Image Synthesis
Gene expression and cancer classification
Domain Adaptation and Few-Shot Learning
RNA and protein synthesis mechanisms
Cell Image Analysis Techniques
Advanced Neuroimaging Techniques and Applications
Bayesian Methods and Mixture Models
Bacterial Genetics and Biotechnology
Neural Networks and Applications
Natural Language Processing Techniques
Topological and Geometric Data Analysis
Genomics and Chromatin Dynamics
Gene Regulatory Network Analysis
Advanced Graph Theory Research
Graph Theory and Algorithms
Machine Learning and Algorithms
Data Management and Algorithms
Genetics, Bioinformatics, and Biomedical Research
Mathematical Biology Tumor Growth

Massachusetts Institute of Technology
2021-2024

Moscow Institute of Thermal Technology
2022

University of Cambridge
2020-2021

Principal Neighbourhood Aggregation for Graph Nets

OPENALEX - Publications

Gabriele Corso, Luca Cavalleri, Dominique Beaini, Pietro Liò, Petar Veličković

Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features - which occur regularly in real-world input domains and within the hidden layers of GNNs - and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a...

10.48550/arxiv.2004.05718 preprint EN other-oa arXiv (Cornell University) 2020-01-01
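
The PNA abstract above states that multiple aggregation functions are required once node features are continuous. The following is a minimal sketch of that intuition only, not the PNA architecture itself: the toy neighbourhoods `a`, `b`, `c` and the helper names `aggregate` and `distinguishes` are hypothetical and chosen for illustration.

```python
from statistics import mean

# Candidate aggregators applied to a multiset of scalar neighbour features.
AGGREGATORS = {"mean": mean, "max": max, "min": min}

def aggregate(neighbours, names):
    """Concatenate the outputs of the chosen aggregators into one tuple."""
    return tuple(AGGREGATORS[name](neighbours) for name in names)

def distinguishes(first, second, names):
    """True if the chosen aggregators map the two neighbourhoods to different outputs."""
    return aggregate(first, names) != aggregate(second, names)

# Three different toy neighbourhoods (illustrative values, not from the paper).
a = [1.0, 3.0]
b = [2.0, 2.0]
c = [3.0, 3.0]

# A single aggregator confuses some pair of neighbourhoods...
mean_separates_a_b = distinguishes(a, b, ["mean"])  # both means equal 2.0
max_separates_a_c = distinguishes(a, c, ["max"])    # both maxima equal 3.0

# ...while combining aggregators separates both pairs.
combined_separates_a_b = distinguishes(a, b, ["mean", "max"])
combined_separates_a_c = distinguishes(a, c, ["mean", "max"])
```

In this toy setting each single aggregator collapses one pair of distinct neighbourhoods, whereas the concatenated (mean, max) output keeps all three apart.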

DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

Gabriele Corso, H. Stärk, Bowen Jing, Regina Barzilay, Tommi Jaakkola

Predicting the binding structure of a small molecule ligand to a protein -- a task known as molecular docking -- is critical to drug design. Recent deep learning methods that treat docking as a regression problem have decreased runtime compared to traditional search-based methods but have yet to offer substantial improvements in accuracy. We instead frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses. To do so, we map this manifold to the product space of the degrees of freedom (translational, rotational,...

10.48550/arxiv.2210.01776 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Graph neural networks

Gabriele Corso, H. Stärk, Stefanie Jegelka, Tommi Jaakkola, Regina Barzilay

10.1038/s43586-024-00294-7 article EN Nature Reviews Methods Primers 2024-03-07

Boltz-1: Democratizing Biomolecular Interaction Modeling

Jeremy Wohlwend, Gabriele Corso, Saro Passaro, Mateo Reveiz, Ken Leidal, and 6 more

Understanding biomolecular interactions is fundamental to advancing fields like drug discovery and protein design. In this paper, we introduce Boltz-1, an open-source deep learning model incorporating innovations in model architecture, speed optimization, and data processing, achieving AlphaFold3-level accuracy in predicting the 3D structures of biomolecular complexes. Boltz-1 demonstrates a performance on par with state-of-the-art commercial models on a range of diverse benchmarks, setting a new benchmark for...

10.1101/2024.11.19.624167 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-11-20

Torsional Diffusion for Molecular Conformer Generation

Bowen Jing, Gabriele Corso, Jeffrey T. Chang, Regina Barzilay, Tommi Jaakkola

Molecular conformer generation is a fundamental task in computational chemistry. Several machine learning approaches have been developed, but none have outperformed state-of-the-art cheminformatics methods. We propose torsional diffusion, a novel diffusion framework that operates on the space of torsion angles via a diffusion process on the hypertorus and an extrinsic-to-intrinsic score model. On a standard benchmark of drug-like molecules, torsional diffusion generates superior conformer ensembles compared to machine learning and cheminformatics methods in terms of both RMSD and chemical...

10.48550/arxiv.2206.01729 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Diffusion models in protein structure and docking

Jason Yim, H. Stärk, Gabriele Corso, Bowen Jing, Regina Barzilay, and 1 more

Generative AI is rapidly transforming the frontier of research in computational structural biology. Indeed, recent successes have substantially advanced protein design and drug discovery. One of the key methodologies underlying these advances is diffusion models (DM). Diffusion models originated in computer vision, taking over image generation and offering superior quality and performance. These models were subsequently extended and modified for uses in other areas including computational structural biology. DMs are well equipped to model high dimensional,...

10.1002/wcms.1711 article EN cc-by-nc Wiley Interdisciplinary Reviews Computational Molecular Science 2024-03-01

PLINDER: The protein-ligand interactions dataset and evaluation resource

Janani Durairaj, Yusuf Adeshina, Zhonglin Cao, Xuejin Zhang, Vladas Oleinikovas, and 20 more

Protein-ligand interactions (PLI) are foundational to small molecule drug design. With computational methods striving towards experimental accuracy, there is a critical demand for a well-curated and diverse PLI dataset. Existing datasets are often limited in size and diversity, and commonly used evaluation sets suffer from training information leakage, hindering the realistic assessment of method generalization capabilities. To address these shortcomings, we present PLINDER, the largest and most...

10.1101/2024.07.17.603955 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2024-07-17

EigenFold: Generative Protein Structure Prediction with Diffusion Models

Bowen Jing, Ezra Erives, Peter Pao-Huang, Gabriele Corso, Bonnie Berger, and 1 more

Protein structure prediction has reached revolutionary levels of accuracy on single structures, yet distributional modeling paradigms are needed to capture the conformational ensembles and flexibility that underlie biological function. Towards this goal, we develop EigenFold, a diffusion generative modeling framework for sampling a distribution of structures from a given protein sequence. We define a diffusion process that models the structure as a system of harmonic oscillators and which naturally induces a cascading-resolution generative process along the eigenmodes...

10.48550/arxiv.2304.02198 preprint EN other-oa arXiv (Cornell University) 2023-01-01

PINDER: The protein interaction dataset and evaluation resource

Daniel Kovtun, Mehmet Akdel, Alexander Goncearenco, Guoqing Zhou, Graham T. Holt, and 15 more

Protein-protein interactions (PPIs) are fundamental to understanding biological processes and play a key role in therapeutic advancements. As deep-learning docking methods for PPIs gain traction, benchmarking protocols and datasets tailored for effective training and evaluation of their generalization capabilities and performance across real-world scenarios become imperative. Aiming to overcome limitations of existing approaches, we introduce PINDER, a comprehensive annotated dataset that uses structural...

10.1101/2024.07.17.603980 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-07-19

Blind protein-ligand docking with diffusion-based deep generative models

Gabriele Corso, Bowen Jing, Hannes Stark, Regina Barzilay, Tommi Jaakkola

10.1016/j.bpj.2022.11.937 article EN publisher-specific-oa Biophysical Journal 2023-02-01

DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models

Mohamed Amine Ketata, Cedrik Laue, Ruslan Mammadov, H. Stärk, Menghua Wu, and 4 more

Understanding how proteins structurally interact is crucial to modern biology, with applications in drug discovery and protein design. Recent machine learning methods have formulated protein-small molecule docking as a generative problem with significant performance boosts over both traditional and deep learning baselines. In this work, we propose a similar approach for rigid protein-protein docking: DiffDock-PP is a diffusion generative model that learns to translate and rotate unbound protein structures into their bound conformations. We...

10.48550/arxiv.2304.03889 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Neural Distance Embeddings for Biological Sequences

Gabriele Corso, Rex Ying, Michal Pándy, Petar Veličković, Jure Leskovec, and 1 more

The development of data-dependent heuristics and representations for biological sequences that reflect their evolutionary distance is critical for large-scale biological research. However, popular machine learning approaches, based on continuous Euclidean spaces, have struggled with the discrete combinatorial formulation of the edit distance that models evolution and the hierarchical relationship that characterises real-world datasets. We present Neural Distance Embeddings (NeuroSEED), a general framework to embed sequences in geometric vector...

10.48550/arxiv.2109.09740 preprint EN other-oa arXiv (Cornell University) 2021-01-01

DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Yilun Xu, Gabriele Corso, Tommi Jaakkola, Arash Vahdat, Karsten Kreis

Diffusion models (DMs) have revolutionized generative learning. They utilize a diffusion process to encode data into a simple Gaussian distribution. However, encoding a complex, potentially multimodal data distribution into a single continuous Gaussian distribution arguably represents an unnecessarily challenging learning problem. We propose Discrete-Continuous Latent Variable Diffusion Models (DisCo-Diff) to simplify this task by introducing complementary discrete latent variables. We augment DMs with learnable discrete latents, inferred with an encoder, and...

10.48550/arxiv.2407.03300 preprint EN arXiv (Cornell University) 2024-07-03

Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models

Gabriele Corso, Yilun Xu, Valentin De Bortoli, Regina Barzilay, Tommi Jaakkola

In light of the widespread success of generative models, a significant amount of research has gone into speeding up their sampling time. However, generative models are often sampled multiple times to obtain a diverse set, incurring a cost that is orthogonal to sampling time. We tackle the question of how to improve diversity and sample efficiency by moving beyond the common assumption of independent samples. We propose particle guidance, an extension of diffusion-based generative sampling where a joint-particle time-evolving potential enforces diversity. We analyze...

10.48550/arxiv.2310.13102 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Subspace Diffusion Generative Models

Bowen Jing, Gabriele Corso, Renato Berlinghieri, Tommi Jaakkola

Score-based models generate samples by mapping noise to data (and vice versa) via a high-dimensional diffusion process. We question whether it is necessary to run this entire process at high dimensionality and incur all the inconveniences thereof. Instead, we restrict the diffusion via projections onto subspaces as the data distribution evolves toward noise. When applied to state-of-the-art models, our framework simultaneously improves sample quality -- reaching an FID of 2.17 on unconditional CIFAR-10 -- and reduces the computational...

10.48550/arxiv.2205.01490 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Learning Graph Search Heuristics

Michal Pándy, Weikang Qiu, Gabriele Corso, Petar Veličković, Rex Ying, and 2 more

Searching for a path between two nodes in a graph is one of the most well-studied and fundamental problems in computer science. In numerous domains such as robotics, AI, or biology, practitioners develop search heuristics to accelerate their pathfinding algorithms. However, it is a laborious and complex process to hand-design heuristics based on the problem and the structure of a given use case. Here we present PHIL (Path Heuristic with Imitation Learning), a novel neural architecture and a training algorithm for discovering graph search and navigation heuristics from data...

10.48550/arxiv.2212.03978 preprint EN other-oa arXiv (Cornell University) 2022-01-01
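
The abstract above notes that search heuristics are used to accelerate pathfinding. As a rough illustration of that general idea, and not of PHIL or any learned heuristic, the sketch below runs A* on a small grid with a pluggable heuristic and counts node expansions. The grid, the function names, and the hand-written Manhattan heuristic are all assumptions made for this example.

```python
import heapq

def a_star(neighbours, start, goal, heuristic):
    """Return (path, expanded). `neighbours(n)` yields (next_node, edge_cost) pairs."""
    frontier = [(heuristic(start, goal), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    expanded = 0
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route was found later
        expanded += 1
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1], expanded
        for nxt, weight in neighbours(node):
            new_cost = cost + weight
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                priority = new_cost + heuristic(nxt, goal)
                heapq.heappush(frontier, (priority, new_cost, nxt))
    return None, expanded

def grid_neighbours(size):
    """4-connected moves with unit cost on a size x size grid (hypothetical test graph)."""
    def neighbours(node):
        x, y = node
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbours

def zero(node, goal):
    return 0  # no guidance: A* degenerates to uniform-cost search

def manhattan(node, goal):
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

nbrs = grid_neighbours(10)
path_blind, expanded_blind = a_star(nbrs, (0, 0), (9, 0), zero)
path_guided, expanded_guided = a_star(nbrs, (0, 0), (9, 0), manhattan)
```

Both calls return a shortest path of the same length, but the guided search expands fewer nodes than the uninformed one; that saving in expansions is what a good heuristic, hand-designed or learned, is meant to provide.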

Graph Anisotropic Diffusion

Ahmed A. A. Elhag, Gabriele Corso, H. Stärk, Michael M. Bronstein

Traditional Graph Neural Networks (GNNs) rely on message passing, which amounts to permutation-invariant local aggregation of neighbour features. Such a process is isotropic and there is no notion of `direction' on the graph. We present a new GNN architecture called Graph Anisotropic Diffusion. Our model alternates between linear diffusion, for which a closed-form solution is available, and local anisotropic filters to obtain efficient multi-hop anisotropic kernels. We test our model on two common molecular property prediction benchmarks (ZINC and QM9) and show...

10.48550/arxiv.2205.00354 preprint EN other-oa arXiv (Cornell University) 2022-01-01

ICLR 2021 Challenge for Computational Geometry & Topology: Design and Results

Nina Miolane, Matteo Caorsi, Umberto Lupo, Marius Guerard, Nicolas Guigui, and 28 more

This paper presents the computational challenge on differential geometry and topology that happened within the ICLR 2021 workshop "Geometric and Topological Representation Learning". The competition asked participants to provide creative contributions to the fields of computational geometry and topology through the open-source repositories Geomstats and Giotto-TDA. The challenge attracted 16 teams in its two month duration. This paper describes the design of the challenge and summarizes its main findings.

10.48550/arxiv.2108.09810 preprint EN cc-by arXiv (Cornell University) 2021-01-01