ComposerX: Multi-Agent Symbolic Music Composition With LLMs

Subjects: Sound (cs.SD); Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
DOI: 10.5281/zenodo.14877129
Publication Date: 2024-01-01
ABSTRACT
Music composition embodies the creative side of humanity, and it is itself a complex task that requires the ability to understand and generate information under long-range dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs readily fail at this task, generating ill-formed music even when equipped with modern techniques such as In-Context Learning and Chain-of-Thought prompting. To further explore and enhance LLMs' potential in music composition by leveraging their reasoning ability and their large knowledge base of music history and theory, we propose ComposerX, an agent-based symbolic music generation framework. We find that a multi-agent approach significantly improves the quality of music composed with GPT-4. The results demonstrate that ComposerX can produce coherent polyphonic compositions with captivating melodies while adhering to user instructions.
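The abstract describes an agent-based pipeline in which multiple LLM roles collaborate on a symbolic music draft. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's actual design: the role names, prompt templates, and the `llm` callable are all assumptions, and the symbolic format (ABC notation) is a common choice for LLM-based symbolic music work.

```python
# Hypothetical sketch of a multi-agent composition loop in the spirit of
# ComposerX. Role names, prompts, and the `llm` interface are illustrative
# assumptions, not the paper's actual implementation.
from typing import Callable

# Each agent is a prompting role that refines a shared symbolic draft.
ROLES = [
    ("melody",   "Write a melody in ABC notation for: {req}\nCurrent draft:\n{draft}"),
    ("harmony",  "Add harmony to this ABC notation draft:\n{draft}"),
    ("reviewer", "Fix any music-theory errors and return corrected ABC:\n{draft}"),
]

def compose(req: str, llm: Callable[[str], str], rounds: int = 1) -> str:
    """Pass a shared draft through each agent role, `rounds` times."""
    draft = ""
    for _ in range(rounds):
        for _role, template in ROLES:
            prompt = template.format(req=req, draft=draft)
            draft = llm(prompt)  # each agent rewrites the shared draft
    return draft

# Stub LLM so the sketch runs without an API; a real system would call
# a chat model here.
def stub_llm(prompt: str) -> str:
    return "X:1\nT:Demo\nK:C\nCDEF GABc|"

print(compose("a cheerful folk tune in C major", stub_llm))
```

The design point is simply that decomposing composition into specialized roles (melody, harmony, review) lets each prompt stay focused, which the paper reports improves output quality over a single monolithic prompt.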