Minkai Xu

ORCID: 0009-0007-9735-3767
Research Areas
  • Machine Learning in Materials Science
  • Computational Drug Discovery Methods
  • Protein Structure and Dynamics
  • Topic Modeling
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Graph Neural Networks
  • Advanced Neuroimaging Techniques and Applications
  • Reinforcement Learning in Robotics
  • Natural Language Processing Techniques
  • Model Reduction and Neural Networks
  • Multimodal Machine Learning Applications
  • Wireless Signal Modulation Classification
  • Digital Media Forensic Detection
  • Adversarial Robustness in Machine Learning
  • Speech and Audio Processing
  • RNA and protein synthesis mechanisms
  • Asymmetric Hydrogenation and Catalysis
  • Process Optimization and Integration
  • Graph Theory and Algorithms
  • Multi-Criteria Decision Making
  • Mathematical Biology Tumor Growth
  • Explainable Artificial Intelligence (XAI)
  • Opinion Dynamics and Social Influence
  • Image Retrieval and Classification Techniques
  • Human Motion and Animation

Stanford University
2023-2025

Palo Alto University
2023

Tongji University
2022

State Key Laboratory of Pollution Control and Resource Reuse
2022

Université de Montréal
2021

Microsoft (United States)
2020

Shanghai Jiao Tong University
2020

Microsoft Research (United Kingdom)
2020

Molecular graph generation is a fundamental problem for drug discovery and has been attracting growing attention. The problem is challenging since it requires not only generating chemically valid molecular structures but also optimizing their chemical properties in the meantime. Inspired by recent progress in deep generative models, in this paper we propose a flow-based autoregressive model called GraphAF. GraphAF combines the advantages of both autoregressive and flow-based approaches and enjoys: (1) high flexibility for data density estimation; (2)...

10.48550/arxiv.2001.09382 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Predicting molecular conformations from molecular graphs is a fundamental problem in cheminformatics and drug discovery. Recently, significant progress has been achieved with machine learning approaches, especially deep generative models. Inspired by the diffusion process in classical non-equilibrium thermodynamics, where heated particles diffuse from their original states to a noise distribution, in this paper we propose a novel generative model named GeoDiff for molecular conformation prediction. GeoDiff treats each atom as a particle and learns...

10.48550/arxiv.2203.02923 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which represents all sidechain states at once as a “superposition” state; superpositions defining a protein are collapsed into...

10.1073/pnas.2311500121 article EN cc-by Proceedings of the National Academy of Sciences 2024-06-25

A fundamental problem in computational chemistry is to find a set of reactants to synthesize a target molecule, a.k.a. retrosynthesis prediction. Existing state-of-the-art methods rely on matching the target molecule with a large set of reaction templates, which is very computationally expensive and also suffers from the problem of coverage. In this paper, we propose a novel template-free approach called G2Gs that transforms a target molecular graph into a set of reactant graphs. G2Gs first splits the target molecular graph into a set of synthons by identifying the reaction centers, then translates the synthons to the final...

10.48550/arxiv.2003.12725 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in the natural sciences. Today, AI has started to advance the natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims...

10.48550/arxiv.2307.08423 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Generative models, especially diffusion models (DMs), have achieved promising results for generating feature-rich geometries and advancing foundational science problems such as molecule design. Inspired by the recent huge success of Stable (latent) Diffusion models, we propose a novel and principled method for 3D molecule generation named Geometric Latent Diffusion Models (GeoLDM). GeoLDM is the first latent DM model for the molecular geometry domain, composed of autoencoders encoding structures into continuous latent codes and DMs operating in the latent space....

10.48550/arxiv.2305.01140 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Proteins mediate their functions through chemical interactions; modeling these interactions, which are typically through sidechains, is an important need in protein design. However, constructing an all-atom generative model requires an appropriate scheme for managing the jointly continuous and discrete nature of proteins encoded in the structure and sequence. We describe an all-atom diffusion model of protein structure, Protpardelle, which instantiates a “superposition” over the possible sidechain states, and collapses it to conduct reverse diffusion for sample...

10.1101/2023.05.24.542194 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-05-25

We study a fundamental problem in computational chemistry known as molecular conformation generation, trying to predict stable 3D structures from 2D molecular graphs. Existing machine learning approaches usually first predict distances between atoms and then generate a 3D structure satisfying the distances, where noise in the predicted distances may induce extra errors during 3D coordinate generation. Inspired by traditional force field methods for molecular dynamics simulation, in this paper we propose a novel approach called ConfGF by directly...

10.48550/arxiv.2105.03902 preprint EN other-oa arXiv (Cornell University) 2021-01-01

The homophily principle, i.e., that nodes with the same labels are more likely to be connected, has been believed to be the main reason for the performance superiority of Graph Neural Networks (GNNs) over Neural Networks on node classification tasks. Recent research suggests that, even in the absence of homophily, the advantage of GNNs still exists as long as nodes from the same class share similar neighborhood patterns. However, this argument only considers intra-class Node Distinguishability (ND) but neglects inter-class ND, which provides incomplete...

10.48550/arxiv.2304.14274 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Molecule generation is a very important practical problem, with uses in drug discovery and material design, and AI methods promise to provide useful solutions. However, existing methods for molecule generation focus either on 2D graph structure or on 3D geometric structure, which is not sufficient to represent a complete molecule, as the 2D graph captures mainly topology while the 3D geometry captures mainly spatial atom arrangements. Combining these representations is essential to better represent a molecule. In this paper, we present a new model for generating a comprehensive representation of...

10.48550/arxiv.2304.14621 preprint EN public-domain arXiv (Cornell University) 2023-01-01

We introduce Buffer of Thoughts (BoT), a novel and versatile thought-augmented reasoning approach for enhancing the accuracy, efficiency and robustness of large language models (LLMs). Specifically, we propose a meta-buffer to store a series of informative high-level thoughts, namely thought-templates, distilled from the problem-solving processes across various tasks. Then, for each problem, we retrieve a relevant thought-template and adaptively instantiate it with specific reasoning structures to conduct efficient reasoning. To...

10.48550/arxiv.2406.04271 preprint EN arXiv (Cornell University) 2024-06-06

Powder X-ray diffraction (PXRD) is a cornerstone technique in materials characterization. However, complete structure determination from PXRD patterns alone remains time-consuming and is often intractable, especially for novel materials. Current machine learning (ML) approaches to PXRD analysis predict only a subset of the total information that comprises a crystal structure. We developed a pioneering generative ML model designed to solve crystal structures from real-world experimental PXRD data. In addition to strong...

10.1021/jacs.4c10244 article EN Journal of the American Chemical Society 2024-09-19

We study how to generate molecule conformations (i.e., 3D structures) from a molecular graph. Traditional methods, such as molecular dynamics, sample conformations via computationally expensive simulations. Recently, machine learning methods have shown great potential by training on a large collection of conformation data. Challenges arise from the limited model capacity for capturing complex distributions of conformations and the difficulty in modeling long-range dependencies between atoms. Inspired by the recent progress in deep generative models,...

10.48550/arxiv.2102.10240 preprint EN cc-by arXiv (Cornell University) 2021-01-01

The homophily principle, i.e., that nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to be the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets where GNNs' performance compared to NNs' is not satisfactory. Heterophily, i.e., low homophily, has been considered the main cause of this empirical observation. People have begun to revisit...

10.48550/arxiv.2407.09618 preprint EN arXiv (Cornell University) 2024-07-12

Coarse-graining (CG) of molecular simulations simplifies the particle representation by grouping selected atoms into pseudo-beads and drastically accelerates simulation. However, such a CG procedure induces information losses, which makes accurate backmapping, i.e., restoring fine-grained (FG) coordinates from CG coordinates, a long-standing challenge. Inspired by the recent progress in generative models and equivariant networks, we propose a novel model that rigorously embeds the vital probabilistic nature...

10.48550/arxiv.2201.12176 preprint EN cc-by arXiv (Cornell University) 2022-01-01

We tackle a common scenario in imitation learning (IL), where agents try to recover the optimal policy from expert demonstrations without further access to the expert or environment reward signals. Apart from simple Behavior Cloning (BC), which adopts supervised learning and suffers from the problem of compounding error, previous solutions like inverse reinforcement learning (IRL) and recent generative adversarial methods involve bi-level or alternating optimization for updating the reward function and the policy, suffering from high computational cost and training...

10.48550/arxiv.2004.09395 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Modeling the complex three-dimensional (3D) dynamics of relational systems is an important problem in the natural sciences, with applications ranging from molecular simulations to particle mechanics. Machine learning methods have achieved good success by learning graph neural networks to model spatial interactions. However, these approaches do not faithfully capture temporal correlations since they only model next-step predictions. In this work, we propose Equivariant Graph Neural Operator (EGNO), a novel and...

10.48550/arxiv.2401.11037 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Generative models have shown great promise in generating 3D geometric systems, which is a fundamental problem in many natural science domains such as molecule and protein design. However, existing approaches only operate on static structures, neglecting the fact that physical systems are always dynamic in nature. In this work, we propose geometric trajectory diffusion models (GeoTDM), the first diffusion model for modeling the temporal distribution of 3D geometric trajectories. Modeling such a distribution is challenging as it requires capturing both the complex spatial...

10.48550/arxiv.2410.13027 preprint EN arXiv (Cornell University) 2024-10-16