Bridging Molecular Graphs and Large Language Models
DOI: 10.1609/aaai.v39i20.35422
Publication Date: 2025-04-11
ABSTRACT
While Large Language Models (LLMs) have shown exceptional generalization capabilities, their ability to process graph data, such as molecular structures, remains limited. To bridge this gap, this paper proposes Graph2Token, an efficient solution that aligns graph tokens to LLM tokens. The key idea is to represent a graph token with the LLM vocabulary, without fine-tuning the LLM backbone. To achieve this goal, we first construct a molecule-text paired dataset from multiple sources, including CHEBI and HMDB, to train a graph structure encoder, which reduces the distance between graph and text representations in the feature space. Then, we propose a novel alignment strategy that associates a graph token with LLM tokens. To further unleash the potential of LLMs, we collect molecular IUPAC name identifiers, which are incorporated into the prompts. By aligning molecular graphs as special text tokens, we can activate the LLMs' few-shot learning ability. Extensive experiments on molecular classification and regression tasks demonstrate the effectiveness of our proposed Graph2Token.
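The alignment step described in the abstract lends itself to a short illustration. Below is a minimal PyTorch sketch, under stated assumptions, of the core idea: a pooled graph embedding is projected into the LLM embedding space and re-expressed as a softmax-weighted mixture of the frozen LLM's own token embeddings, yielding a "graph token" the backbone can consume without any fine-tuning. All names here (TokenAligner, llm_embed, graph_dim) are hypothetical illustrations, not the authors' released code.

import torch
import torch.nn as nn

class TokenAligner(nn.Module):
    """Map a graph embedding to a soft mixture of frozen LLM token embeddings.

    Hypothetical sketch of the alignment idea, not the paper's implementation.
    """

    def __init__(self, graph_dim: int, llm_embed: torch.Tensor):
        super().__init__()
        # llm_embed: (vocab_size, llm_dim), the frozen LLM input-embedding table.
        # Registered as a buffer so it is never updated during training.
        self.register_buffer("llm_embed", llm_embed)
        self.proj = nn.Linear(graph_dim, llm_embed.size(1))

    def forward(self, g: torch.Tensor) -> torch.Tensor:
        # g: (batch, graph_dim) pooled output of the graph structure encoder.
        q = self.proj(g)                    # (batch, llm_dim)
        scores = q @ self.llm_embed.t()     # (batch, vocab): similarity to every token
        weights = scores.softmax(dim=-1)    # attention weights over the vocabulary
        # The "graph token": a convex combination of existing LLM token
        # embeddings, so it lives in the space the frozen backbone expects.
        return weights @ self.llm_embed     # (batch, llm_dim)

# Usage with toy dimensions (assumed, for illustration only):
vocab, llm_dim, graph_dim = 32000, 4096, 300
aligner = TokenAligner(graph_dim, torch.randn(vocab, llm_dim))
graph_token = aligner(torch.randn(2, graph_dim))  # (2, llm_dim)
# The resulting vector would be spliced into the embedded prompt sequence at a
# reserved placeholder position, in a prompt that also carries the molecule's
# IUPAC name, e.g. "Molecule <GRAPH>, IUPAC name: {iupac} ...".

The design choice worth noting is that the graph token is constrained to the span of the existing vocabulary embeddings, which is one plausible reading of "representing a graph token with the LLM vocabulary" and is what lets the backbone stay frozen.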