Policy Evaluation in Distributional LQR (Extended Version)

DOI: 10.48550/arxiv.2401.10240
Publication Date: 2024-01-01
ABSTRACT
Distributional reinforcement learning (DRL) enhances the understanding of the effects of randomness in the environment by letting agents learn the distribution of the random return, rather than only its expected value as in standard reinforcement learning. A key challenge in DRL, however, is that policy evaluation typically relies on a representation of the return distribution, which needs to be carefully designed. In this paper, we address this challenge for a special class of DRL problems based on the discounted linear quadratic regulator (LQR), which we call distributional LQR. Specifically, we provide a closed-form expression for the distribution of the random return that is applicable to all types of exogenous disturbance as long as it is independent and identically distributed (i.i.d.). We show that the variance of the random return is bounded if the fourth moment of the disturbance is bounded. Furthermore, we investigate the sensitivity of the return distribution to model perturbations. While the proposed exact return distribution consists of infinitely many random variables, we show that it can be well approximated by a finite number of random variables, and we bound the associated approximation error analytically under mild assumptions. When the model is unknown, we propose a model-free approach for estimating the return distribution, supported by sample complexity guarantees. Finally, we extend our approach to partially observable linear systems. Numerical experiments are provided to illustrate the theoretical results.
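To make the setting concrete, the following minimal sketch (not code from the paper) simulates the random return of a discounted LQR under a fixed linear policy and an i.i.d. disturbance, and estimates the return distribution by Monte Carlo. All matrices, the feedback gain K, the noise scale, and the discount factor gamma are illustrative assumptions.

```python
import numpy as np

# Illustrative discounted LQR setup (all values below are assumptions for
# this sketch, not parameters from the paper).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # state dynamics
B = np.array([[0.0], [0.1]])             # input matrix
Q = np.eye(2)                            # state cost weight
R = np.array([[1.0]])                    # input cost weight
K = np.array([[-1.0, -1.5]])             # fixed stabilizing feedback gain, u = K x
gamma = 0.95                             # discount factor

def sample_return(x0, horizon=500):
    """One Monte Carlo sample of the truncated random return
    G = sum_t gamma^t (x_t' Q x_t + u_t' R u_t) under i.i.d. Gaussian noise."""
    x, G = x0.copy(), 0.0
    for t in range(horizon):
        u = K @ x
        G += gamma**t * (x @ Q @ x + u @ R @ u)
        w = rng.normal(scale=0.1, size=2)  # i.i.d. exogenous disturbance
        x = A @ x + B @ u + w
    return G

# Empirical return distribution from a fixed initial state.
x0 = np.array([1.0, 0.0])
returns = np.array([sample_return(x0) for _ in range(2000)])
print(f"mean return: {returns.mean():.3f}, variance: {returns.var():.3f}")
```

Truncating the discounted sum at a finite horizon is loosely analogous to the paper's finite-variable approximation of the exact return distribution: for a stabilizing gain, the neglected tail of the sum is geometrically small in the horizon length.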