
Approximate Policy Iteration with Linear Action Models


DOI: 10.1609/aaai.v26i1.8319
Publication Date: 2022-06-01T20:35:27Z


AUTHORS (2)

Hengshuai Yao

Csaba Szepesvari

ABSTRACT

In this paper we consider the problem of finding a good policy given some batch data. We propose a new approach, LAM-API, that first builds a so-called linear action model (LAM) from the data and then uses the learned model and the collected data in approximate policy iteration (API) to find a good policy. A natural choice for the policy evaluation step in this algorithm is the least-squares temporal difference (LSTD) learning algorithm. Empirical results on three benchmark problems show that this particular instance of LAM-API performs competitively with LSPI in terms of both data efficiency and computational efficiency.
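The two stages named in the abstract (fit a linear action model from batch data, then run approximate policy iteration with an LSTD-style evaluation step) can be illustrated with a minimal NumPy sketch. The abstract does not give the model's form, so everything below is an assumption made for illustration: one matrix `F[a]` per action that predicts expected next features, one vector `f[a]` per action that predicts expected reward, a small ridge term for numerical stability, and the hypothetical function names `fit_linear_action_model`, `greedy_actions`, and `lam_api`. It should not be read as the authors' exact algorithm.

```python
import numpy as np


def fit_linear_action_model(phi, actions, rewards, phi_next, n_actions, ridge=1e-6):
    """Fit one linear model per action by ridge-regularised least squares.

    Assumed model form (not taken from the abstract):
      F[a] @ x approximates the expected next feature vector after action a,
      f[a] @ x approximates the expected immediate reward for action a.
    """
    d = phi.shape[1]
    F = np.zeros((n_actions, d, d))
    f = np.zeros((n_actions, d))
    for a in range(n_actions):
        mask = actions == a
        X, Y, r = phi[mask], phi_next[mask], rewards[mask]
        G = X.T @ X + ridge * np.eye(d)
        F[a] = np.linalg.solve(G, X.T @ Y).T
        f[a] = np.linalg.solve(G, X.T @ r)
    return F, f


def greedy_actions(phi, F, f, theta, gamma):
    """Pick, for each feature row, the action maximising the model-based value.

    q[i, a] = f[a] . phi_i + gamma * theta . (F[a] @ phi_i)
    """
    q = phi @ f.T + gamma * np.einsum("adk,d,ik->ia", F, theta, phi)
    return q.argmax(axis=1)


def lam_api(phi, F, f, gamma=0.9, n_iters=20, ridge=1e-6):
    """Approximate policy iteration with an LSTD-style evaluation step.

    Next features and rewards come from the learned model rather than
    from the sampled transitions (an assumption of this sketch).
    """
    d = phi.shape[1]
    theta = np.zeros(d)
    for _ in range(n_iters):
        # Policy improvement: greedy action at every sampled state.
        a_star = greedy_actions(phi, F, f, theta, gamma)
        # Model-predicted next features and rewards under the greedy policy.
        phi_pred = np.einsum("idk,ik->id", F[a_star], phi)
        r_pred = np.einsum("id,id->i", f[a_star], phi)
        # Policy evaluation: solve the LSTD linear system A theta = b.
        A = phi.T @ (phi - gamma * phi_pred) + ridge * np.eye(d)
        b = phi.T @ r_pred
        new_theta = np.linalg.solve(A, b)
        if np.allclose(new_theta, theta):
            break
        theta = new_theta
    return theta
```

As a usage illustration, a batch of `(phi, actions, rewards, phi_next)` arrays would first be passed to `fit_linear_action_model`, and the resulting `F` and `f` to `lam_api`; the returned weight vector `theta` defines the value estimate `theta @ x`, from which `greedy_actions` reads off a policy.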


