Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

DOI: 10.48550/arxiv.2301.12601

Publication Date: 2023-01-01

AUTHORS (3)

W. Xu

Xuefeng Gao

Xuedong He

ABSTRACT

The optimized certainty equivalent (OCE) is a family of risk measures that covers important examples such as entropic risk, conditional value-at-risk, and mean-variance models. In this paper, we propose a new episodic risk-sensitive reinforcement learning formulation based on tabular Markov decision processes with recursive OCEs. We design an efficient learning algorithm for this problem based on value iteration and upper confidence bound. We derive an upper bound on the regret of the proposed algorithm, and also establish a minimax lower bound. Our bounds show that the regret rate achieved by our proposed algorithm has optimal dependence on the number of episodes and the number of actions.
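
To make the OCE family concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the standard definition OCE_u(X) = sup over lambda of { lambda + E[u(X - lambda)] } for a concave, nondecreasing utility u, which the abstract itself does not spell out. The function names (`oce`, `entropic_utility`, `cvar_utility`) and the example reward distribution are hypothetical, chosen for illustration only. The sketch evaluates a single (non-recursive) OCE of a finite reward distribution and shows how two of the named examples arise from different utilities.

```python
import math


def oce(values, probs, utility, iters=200):
    """Approximate sup over lam of lam + E[utility(X - lam)] for a finite distribution.

    Assumes utility is concave and nondecreasing, so the objective is concave
    in lam and ternary search over [min(values), max(values)] applies.
    """
    def objective(lam):
        return lam + sum(p * utility(x - lam) for x, p in zip(values, probs))

    lo, hi = min(values), max(values)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1  # a maximizer lies to the right of m1
        else:
            hi = m2  # a maximizer lies to the left of m2
    return objective((lo + hi) / 2)


def entropic_utility(gamma):
    # u(t) = (1 - exp(-gamma * t)) / gamma yields the entropic risk measure.
    return lambda t: (1.0 - math.exp(-gamma * t)) / gamma


def cvar_utility(alpha):
    # u(t) = min(t, 0) / alpha yields CVaR at level alpha
    # (the mean of the worst alpha-fraction of rewards).
    return lambda t: min(t, 0.0) / alpha


# Hypothetical reward distribution: four equally likely outcomes.
rewards = [0.0, 1.0, 2.0, 3.0]
probs = [0.25, 0.25, 0.25, 0.25]

entropic_value = oce(rewards, probs, entropic_utility(1.0))
cvar_value = oce(rewards, probs, cvar_utility(0.5))

# Closed form for the entropic case with gamma = 1: -log E[exp(-X)].
entropic_closed_form = -math.log(
    sum(p * math.exp(-x) for x, p in zip(rewards, probs))
)
```

In this example the CVaR at level 0.5 equals 0.5, the mean of the two lowest rewards, and the numerically computed entropic value matches its closed form. The recursive formulation described in the abstract would apply such an OCE at each step of the Markov decision process; that recursion is not shown here.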
