Zero-shot counting with a dual-stream neural network model

DOI: 10.1016/j.neuron.2024.10.008 Publication Date: 2024-11-01T14:38:06Z
ABSTRACT
To understand a visual scene, observers need to both recognize objects and encode relational structure. For example, a scene comprising three apples requires the observer to encode the concepts of "apple" and "three." In the primate brain, these functions rely on dual (ventral and dorsal) processing streams. Object recognition in primates has been successfully modeled with deep neural networks, but how scene structure (including numerosity) is encoded remains poorly understood. Here, we built a learning model, based on a dual-stream architecture, which is able to count items "zero-shot," that is, even if the items themselves are unfamiliar. Our network forms spatial response fields and lognormal number codes that resemble those observed in the macaque posterior parietal cortex. The network also makes successful predictions about human counting behavior. Our results provide evidence for an enactive theory of the role of the posterior parietal cortex in visual scene understanding.
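To make the phrase "lognormal number codes" concrete, here is a minimal sketch of one common way such a code is described: a unit whose response is Gaussian in the logarithm of numerosity. The abstract does not give the tuning function or its parameters, so the function names, the `sigma` value, and the preferred numerosity below are illustrative assumptions, not the authors' model.

```python
import math


def lognormal_tuning(n, preferred, sigma=0.5):
    """Response of a hypothetical unit to numerosity n.

    Assumption (not from the paper): the response is a Gaussian on a
    log-numerosity axis, peaking at 1.0 when n == preferred.
    """
    if n <= 0 or preferred <= 0:
        raise ValueError("numerosities must be positive")
    d = math.log(n) - math.log(preferred)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def tuning_curve(preferred, numerosities, sigma=0.5):
    """Evaluate the hypothetical tuning function over several numerosities."""
    return [lognormal_tuning(n, preferred, sigma) for n in numerosities]


if __name__ == "__main__":
    # Print the response of a unit that prefers 4 items to displays of 1..9 items.
    for n, r in zip(range(1, 10), tuning_curve(4, range(1, 10))):
        print(n, round(r, 3))
```

Under this assumption the curve is symmetric on a log axis (2 and 8 items drive a unit preferring 4 equally) but skewed on a linear axis, with a longer tail toward larger numbers.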