Learning to communicate using a communication critic and counterfactual reasoning

Mass communications Mathematics
DOI: 10.1007/s00521-024-10598-0 Publication Date: 2025-01-10T08:01:30Z
ABSTRACT
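Learning to communicate in order to share state information is an active problem in the area of multi-agent reinforcement learning. The credit assignment problem, the non-stationarity of the communication environment, and encouraging agents to be influenced by incoming messages are major challenges within this research field which need to be overcome in order to learn a valid communication protocol. This paper introduces the novel multi-agent counterfactual communication learning (MACC) method, which adapts counterfactual reasoning in order to overcome the credit assignment problem for communicating agents. Next, the non-stationarity of the communication environment, while learning the communication Q-function, is overcome by creating the communication Q-function using the action policy of the other agents and the Q-function of the action environment. As the exact method to create the communication Q-function can be computationally intensive for a large number of agents, two approximation methods are proposed. Additionally, a social loss function is introduced in order to create influenceable agents, which is required to learn a valid communication protocol. Our experiments show that MACC is able to outperform the state-of-the-art baselines in four different scenarios in the particle environment. Finally, we demonstrate the scalability of MACC in a matrix environment.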
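The idea of building a communication Q-function from the receiver's action policy and the action Q-function, and then applying a counterfactual baseline to it, can be illustrated with a toy example. The sketch below is only one possible reading of the abstract, not the paper's implementation: the function names, the two-message and two-action setup, and all numbers are hypothetical. It assumes a single receiver whose action distribution depends on the message it gets, and it uses a COMA-style baseline (the expected value over the sender's own message policy) as a stand-in for the counterfactual reasoning step.

```python
def communication_q(receiver_policy, action_q):
    """Value of each message, assuming (hypothetically) a single receiver.

    receiver_policy[m][a]: probability that the receiver takes action a
                           after receiving message m.
    action_q[a]:           value of receiver action a in the action environment.
    Returns a list with one expected value per message.
    """
    return [
        sum(p * q for p, q in zip(probs, action_q))
        for probs in receiver_policy
    ]


def counterfactual_advantage(comm_q, message_policy, sent):
    """Advantage of the sent message over a counterfactual baseline.

    The baseline marginalises over the messages the sender could have
    sent instead, weighted by its own message policy (an assumption
    borrowed from COMA-style baselines).
    """
    baseline = sum(p * q for p, q in zip(message_policy, comm_q))
    return comm_q[sent] - baseline


# Hypothetical toy numbers: two messages, two receiver actions.
receiver_policy = [[0.9, 0.1],   # receiver behaviour after message 0
                   [0.2, 0.8]]   # receiver behaviour after message 1
action_q = [1.0, 0.0]            # action 0 is the useful one
message_policy = [0.5, 0.5]      # sender currently picks messages uniformly

comm_q = communication_q(receiver_policy, action_q)
advantage = counterfactual_advantage(comm_q, message_policy, sent=0)
print(comm_q, advantage)
```

In this toy setting message 0 steers the receiver toward the more valuable action, so its advantage is positive. Note that the message values are computed from the receiver's current policy rather than learned separately, which is the property the abstract attributes to the communication Q-function; summing over the joint actions of many receivers is where the exact computation would become expensive.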