GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs

DOI: 10.48550/arxiv.2311.04901 Publication Date: 2023-01-01
ABSTRACT
Recent works have shown that Large Language Models (LLMs) can empower traditional neuro-symbolic models via their programming capabilities, translating language into module descriptions and thereby achieving strong visual reasoning results while maintaining the model's transparency and efficiency. However, these models usually generate an entire code snippet exhaustively for each new instance of a task, which is extremely inefficient. We propose generative neuro-symbolic visual reasoning by growing and reusing modules. Specifically, our model consists of three unique stages: module initialization, module generation, and module execution. First, given a vision-language task, we adopt LLMs to examine whether we can reuse and grow over established modules to handle this task. If not, we initialize a new module needed by the task and specify the inputs and outputs of this module. After that, the new module is created by querying LLMs to generate corresponding code snippets that match the requirements. To get a better sense of the new module's ability, we treat few-shot training examples as test cases and check whether the module can pass them. If yes, the module is added to the library for future reuse. Finally, we evaluate performance on the testing set by executing the parsed programs with the newly made modules to obtain the results. We find the proposed model possesses several advantages. First, it performs competitively on standard tasks like visual question answering and referring expression comprehension; second, modules learned from one task can be seamlessly transferred to new tasks; last but not least, it is able to adapt to new visual reasoning tasks by observing only a few training examples and reusing modules.
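The grow-or-reuse loop described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names (`ModuleLibrary`, `make_module`, `grow_or_reuse`) are hypothetical, and the LLM query step is replaced by compiling a provided code snippet.

```python
# Hypothetical sketch of GENOME's grow-and-reuse loop. All names are
# illustrative; the real system queries an LLM to produce the snippet.

class ModuleLibrary:
    """Stores verified modules keyed by name for future reuse."""
    def __init__(self):
        self.modules = {}

    def find(self, name):
        return self.modules.get(name)

    def add(self, name, fn):
        self.modules[name] = fn


def make_module(name, code_str):
    """Compile a generated code snippet into a callable.

    Stands in for the LLM generation stage: here we simply exec the
    provided source and pull out the named function.
    """
    namespace = {}
    exec(code_str, namespace)
    return namespace[name]


def grow_or_reuse(library, name, code_str, test_cases):
    """Reuse an existing module if present; otherwise create one and
    keep it only if it passes every few-shot test case."""
    existing = library.find(name)
    if existing is not None:
        return existing                       # module reuse
    candidate = make_module(name, code_str)   # module generation
    if all(candidate(x) == y for x, y in test_cases):
        library.add(name, candidate)          # verified: grow library
        return candidate
    return None                               # failed validation; discard


# Usage: "generate" a counting module and validate it on few-shot cases.
lib = ModuleLibrary()
snippet = "def count_objects(objs):\n    return len(objs)"
mod = grow_or_reuse(lib, "count_objects", snippet,
                    [(["cat", "dog"], 2), ([], 0)])
```

After validation succeeds, a second call to `grow_or_reuse` with the same name returns the stored module without regenerating it, which is the efficiency gain the paper argues for over per-instance code generation.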