DTPPO: Dual-Transformer Encoder-Based Proximal Policy Optimization for Multi-UAV Navigation in Unseen Complex Environments

Keywords: obstacle avoidance; retraining; transferability
DOI: 10.3390/drones8120720 | Publication Date: 2024-12-03
ABSTRACT
Existing multi-agent deep reinforcement learning (MADRL) methods for multi-UAV navigation face challenges in generalization, particularly when applied to unseen complex environments. To address these limitations, we propose a Dual-Transformer Encoder-Based Proximal Policy Optimization (DTPPO) method. DTPPO enhances collaboration through a Spatial Transformer, which models inter-agent dynamics, and a Temporal Transformer, which captures temporal dependencies to improve generalization across diverse environments. This architecture allows UAVs to navigate new, unseen environments without retraining. Extensive simulations demonstrate that DTPPO outperforms current MADRL methods in terms of transferability, obstacle avoidance, and navigation efficiency in environments with varying obstacle densities. The results confirm DTPPO’s effectiveness as a robust solution for multi-UAV navigation in both known and unseen scenarios.
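To make the described architecture concrete, the PyTorch sketch below shows one plausible way to wire a spatial encoder (attention across agents at each timestep) and a temporal encoder (attention over each agent's recent history) in front of PPO actor-critic heads. This is an illustration only, not the authors' implementation: the class names, layer counts, and dimensions are all assumptions.

```python
import torch
import torch.nn as nn

class DualTransformerEncoder(nn.Module):
    """Illustrative dual-transformer encoder: a spatial encoder attends
    across agents at each timestep; a temporal encoder attends across
    each agent's observation history. Hyperparameters are assumed."""

    def __init__(self, obs_dim: int, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Linear(obs_dim, d_model)
        spatial_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        temporal_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.spatial = nn.TransformerEncoder(spatial_layer, num_layers=2)
        self.temporal = nn.TransformerEncoder(temporal_layer, num_layers=2)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, T, N, obs_dim) -- T timesteps, N agents
        B, T, N, _ = obs.shape
        x = self.embed(obs)                                    # (B, T, N, d)
        # Spatial attention: agents attend to one another per timestep.
        x = self.spatial(x.reshape(B * T, N, -1)).reshape(B, T, N, -1)
        # Temporal attention: each agent attends over its own history.
        x = x.permute(0, 2, 1, 3).reshape(B * N, T, -1)
        x = self.temporal(x).reshape(B, N, T, -1)
        return x[:, :, -1, :]                                  # latest-step features, (B, N, d)


class ActorCritic(nn.Module):
    """PPO policy and value heads on top of the fused per-agent features."""

    def __init__(self, obs_dim: int, act_dim: int, d_model: int = 64):
        super().__init__()
        self.encoder = DualTransformerEncoder(obs_dim, d_model)
        self.actor = nn.Linear(d_model, act_dim)   # action logits (or means)
        self.critic = nn.Linear(d_model, 1)        # per-agent state value

    def forward(self, obs: torch.Tensor):
        z = self.encoder(obs)                      # (B, N, d)
        return self.actor(z), self.critic(z).squeeze(-1)


# Usage with made-up dimensions: 8 rollout slices, 10 timesteps, 3 UAVs, 12-dim observations.
model = ActorCritic(obs_dim=12, act_dim=4)
logits, values = model(torch.randn(8, 10, 3, 12))
```

Factoring the encoder this way means the PPO update itself is unchanged; only the shared feature extractor differs, which is consistent with the paper's framing of DTPPO as a drop-in encoder for generalization.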