Dynamic preference inference network: Improving sample efficiency for multi-objective reinforcement learning by preference estimation

Sample (material); Preference learning
DOI: 10.1016/j.knosys.2024.112512
Publication Date: 2024-09-11T21:00:30Z