Scalable Decision-Focused Learning in Restless Multi-Armed Bandits with Application to Maternal and Child Health
FOS: Computer and information sciences
Artificial Intelligence (cs.AI)
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
Machine Learning (cs.LG)
DOI: 10.1609/aaai.v37i10.26431
Publication Date: 2023-06-27T18:01:54Z
AUTHORS (8)
ABSTRACT
This paper studies restless multi-armed bandit (RMAB) problems with unknown arm transition dynamics but with known correlated arm features. The goal is to learn a model to predict transition dynamics given features, where the Whittle index policy solves the RMAB problems using predicted transitions. However, prior works often learn the model by maximizing the predictive accuracy instead of final RMAB solution quality, causing a mismatch between training and evaluation objectives. To address this shortcoming, we propose a novel approach for decision-focused learning in RMAB that directly trains the predictive model to maximize the Whittle index solution quality. We present three key contributions: (i) we establish differentiability of the Whittle index policy to support decision-focused learning; (ii) we significantly improve the scalability of decision-focused learning approaches in sequential problems, specifically RMAB problems; (iii) we apply our algorithm to a previously collected dataset of maternal and child health to demonstrate its performance. Indeed, our algorithm is the first for decision-focused learning in RMAB that scales to real-world problem sizes.
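To make the Whittle index policy mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each arm is a small, indexable MDP with known (or predicted) transition matrices, and it uses the textbook approach of a binary search over a passive-action subsidy combined with value iteration. All names (`q_values`, `whittle_index`, `top_k_policy`), the discount factor, and the example matrices are hypothetical choices for illustration. The differentiable training step that the paper contributes is not shown here.

```python
import numpy as np


def q_values(P, R, subsidy, gamma=0.9, iters=500):
    """Value iteration for one arm when the passive action earns a subsidy.

    P[a, s, s2] are transition probabilities (a=0 passive, a=1 active),
    R[s] is the state reward. Returns (Q_passive, Q_active) per state.
    """
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        q_passive = R + subsidy + gamma * P[0] @ V
        q_active = R + gamma * P[1] @ V
        V = np.maximum(q_passive, q_active)
    q_passive = R + subsidy + gamma * P[0] @ V
    q_active = R + gamma * P[1] @ V
    return q_passive, q_active


def whittle_index(P, R, state, gamma=0.9, tol=1e-4):
    """Binary search for the subsidy at which both actions are equally good.

    Assumes the arm is indexable, so the preference flips only once.
    """
    bound = (R.max() - R.min()) / (1.0 - gamma)  # loose bound on the index
    lo, hi = -bound, bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        q_passive, q_active = q_values(P, R, mid, gamma)
        if q_active[state] > q_passive[state]:
            lo = mid  # subsidy too small: acting is still preferred
        else:
            hi = mid  # subsidy large enough: passive is at least as good
    return 0.5 * (lo + hi)


def top_k_policy(indices, budget):
    """Act on the `budget` arms with the largest Whittle indices."""
    return np.argsort(-np.asarray(indices))[:budget]


# Hypothetical two-state arm: state 1 is rewarding, acting helps reach it.
R = np.array([0.0, 1.0])
P_helpful = np.array([[[0.9, 0.1], [0.5, 0.5]],   # passive
                      [[0.3, 0.7], [0.1, 0.9]]])  # active
# Hypothetical arm on which acting changes nothing.
P_no_effect = np.array([P_helpful[0], P_helpful[0]])

idx_helpful = whittle_index(P_helpful, R, state=0)
idx_no_effect = whittle_index(P_no_effect, R, state=0)
chosen = top_k_policy([idx_no_effect, idx_helpful], budget=1)
```

In this sketch the arm whose dynamics do not depend on the action gets an index near zero, while the arm that benefits from intervention gets a positive index and is selected under a budget of one. A decision-focused approach, as described in the abstract, would train the model that predicts `P` against the quality of the resulting policy rather than against prediction error alone.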