- Reinforcement Learning in Robotics
- Advanced Multi-Objective Optimization Algorithms
- Advanced Bandit Algorithms Research
- Game Theory and Applications
- Auction Theory and Applications
- Artificial Intelligence in Games
- Evolutionary Algorithms and Applications
- Adaptive Dynamic Programming Control
- Smart Grid Energy Management
- Experimental Behavioral Economics Studies
- COVID-19 epidemiological studies
- Optimization and Search Problems
- Bayesian Modeling and Causal Inference
- Machine Learning and Algorithms
- Metaheuristic Optimization Algorithms Research
- Economic theories and models
- Water resources management and optimization
- Influenza Virus Research Studies
- Advanced Control Systems Optimization
- Modular Robots and Swarm Intelligence
- Adversarial Robustness in Machine Learning
- Digital Games and Media
- Simulation Techniques and Applications
- Gaussian Processes and Bayesian Inference
- Process Optimization and Integration
- Vrije Universiteit Brussel (2017-2024)
- University of Applied Sciences Utrecht (2019-2023)
- Amsterdam University of Applied Sciences (2023)
- Vrije Universiteit Amsterdam (2018-2021)
- University of Amsterdam (2013-2021)
- Utrecht University (2012-2020)
- Amsterdam UMC Location Vrije Universiteit Amsterdam (2018)
- University of Surrey (2018)
- University of Oxford (2016-2017)
- Amsterdam University of the Arts (2014)
Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct...
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective...
Many real-world decision problems have multiple objectives. For example, when choosing a medical treatment plan, we want to maximize the efficacy of the treatment, but also minimize its side effects.
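A minimal sketch of the trade-off this example describes: with hypothetical (efficacy, negated side-effect) scores for a few invented treatment plans, a Pareto-dominance check identifies the plans worth presenting to a decision maker. All names and numbers below are illustrative, not from the paper.

```python
# Illustrative sketch: hypothetical treatment plans scored as
# (efficacy, -side_effects), so that higher is better in both objectives.

def dominates(a, b):
    """True if a Pareto-dominates b: at least as good everywhere, better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(vectors):
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors if u != v)]

plans = {
    "aggressive": (0.9, -0.7),  # high efficacy, severe side effects
    "mild":       (0.6, -0.2),  # lower efficacy, mild side effects
    "weak":       (0.5, -0.5),  # dominated by "mild" on both objectives
}
front = pareto_front(list(plans.values()))  # "aggressive" and "mild" survive
```

Note that neither surviving plan dominates the other: which one to execute depends on the user's preferences over the two objectives.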
The number of Electric Vehicle (EV) owners is expected to increase significantly in the near future, since EVs are regarded as valuable assets both for transportation and for energy storage purposes. However, recharging a large fleet during peak hours may overload the transformers of the distribution grid. Although several methods have been proposed to flatten peak-hour loads and recharge as fairly as possible in the available time, these typically focus either on a single type of tariff or make strong assumptions regarding...
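One simple baseline for the peak-flattening problem mentioned above is greedy valley filling: assign each EV's required charging slots to the currently least-loaded time slots. This is a hedged sketch, not the paper's method; the load profile, slot counts and charging rates are invented for illustration.

```python
# Hedged sketch (not the paper's method): flatten peak load by greedy
# valley filling, assigning each EV's required charging slots to the
# currently least-loaded time slots. All numbers are illustrative.

def schedule(base_load, evs):
    """base_load: kW per time slot; evs: list of (slots_needed, charge_kw)."""
    load = list(base_load)
    plan = []
    for slots_needed, charge_kw in evs:
        # pick the currently least-loaded slots for this EV
        chosen = sorted(range(len(load)), key=lambda t: load[t])[:slots_needed]
        for t in chosen:
            load[t] += charge_kw
        plan.append(sorted(chosen))
    return load, plan

# One EV needing two slots at 3 kW fills the overnight valley,
# leaving the original 5 kW peak untouched.
load, plan = schedule([5, 1, 1, 5], [(2, 3.0)])
```

A real scheduler would additionally handle tariffs, fairness across EVs and arrival/departure constraints, which is exactly where the assumptions criticised in the abstract come in.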
In this paper we propose an approach for personalising the space in which a game is played (i.e., its levels) depending on classifications of the user's facial expression — to the end of tailoring the affective experience to the individual user. Our approach is aimed at online personalisation, i.e., the game is personalised during actual play. A key insight is that personalisation techniques can leverage novel computer vision-based techniques to unobtrusively infer player experiences automatically, based on facial expression analysis. Specifically, for each user, (1) the proven InSight...
We propose Deep Optimistic Linear Support Learning (DOL) to solve high-dimensional multi-objective decision problems where the relative importances of the objectives are not known a priori. Using features from the high-dimensional inputs, DOL computes the convex coverage set containing all potential optimal solutions of the convex combinations of the objectives. To our knowledge, this is the first time that deep reinforcement learning has succeeded in learning multi-objective policies. In addition, we provide a testbed with two experiments to be used as a benchmark for multi-objective deep learning.
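The convex-coverage idea can be sketched without deep RL: a value vector belongs to the convex coverage set (CCS) exactly when it maximises the linear scalarisation w·V for some weight vector w. The toy below approximates a two-objective CCS by sweeping weights; the value vectors are invented, and DOL itself selects weights optimistically and learns the vectors with deep RL rather than enumerating them.

```python
# Sketch of the convex-coverage idea behind optimistic linear support:
# a value vector is in the CCS iff it maximises the linear scalarisation
# w.V for some weight w. Here we approximate a two-objective CCS by a
# weight sweep over hypothetical value vectors.

def scalarise(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def approx_ccs(vectors, steps=101):
    """Approximate the CCS of 2-objective value vectors by sweeping w1 in [0, 1]."""
    ccs = set()
    for i in range(steps):
        w1 = i / (steps - 1)
        ccs.add(max(vectors, key=lambda v: scalarise((w1, 1 - w1), v)))
    return ccs

# (1, 1) is Pareto-dominated; (2, 2) lies on the convex upper surface.
vectors = [(3, 0), (0, 3), (2, 2), (1, 1)]
ccs = approx_ccs(vectors)
```

Each weight picks out the vector a user with those linear preferences would want, so the union over weights is exactly the set worth retaining.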
Reinforcement learning (RL) aims at building a policy that maximizes a task-related reward within a given domain. When the domain is known, i.e., when its states, actions and rewards are defined, Markov Decision Processes (MDPs) provide a convenient theoretical framework to formalize RL. But in an open-ended learning process, an agent or robot must solve an unbounded sequence of tasks that are not known in advance, so the corresponding MDPs cannot be built at design time. This defines the main challenges of open-ended learning: how can the agent learn to behave...
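When the domain is known, the MDP framework mentioned above can be solved with standard dynamic programming. Below is a minimal value-iteration sketch; the two-state chain is hypothetical, purely to make the formalism concrete.

```python
# Minimal value iteration for the MDP formalism described above.
# The two-state chain is invented for illustration only.

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """P[s][a]: list of (prob, next_state); R[s][a]: immediate reward."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {s: max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                        for a in actions)
                 for s in states}
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new

states, actions = ["s0", "s1"], ["stay", "go"]
P = {"s0": {"stay": [(1.0, "s0")], "go": [(1.0, "s1")]},
     "s1": {"stay": [(1.0, "s1")], "go": [(1.0, "s0")]}}
R = {"s0": {"stay": 0.0, "go": 0.0},
     "s1": {"stay": 1.0, "go": 0.0}}
V = value_iteration(states, actions, P, R)  # optimal: reach s1, then stay
```

The point of the abstract is precisely that this computation presumes P and R are given at design time, which open-ended learning cannot assume.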
Energy disaggregation, a.k.a. Non-Intrusive Load Monitoring, aims to separate the energy consumption of individual appliances from the readings of a mains power meter measuring the total consumption of, e.g., a whole house. Energy disaggregation can be useful in many applications, e.g., providing appliance-level feedback to end users to help them understand their energy consumption and ultimately save energy. Recently, with the availability of large-scale datasets, various neural network models such as convolutional neural networks and recurrent neural networks have been investigated to solve...
The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial, and provides a suitable basis for the creation of artificial general intelligence. We contest the underlying assumption of Silver et al. that such a reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to account for some aspects of biological and computational intelligence, and argue in favour of explicitly multi-objective models...
In this paper, we introduce multi-objective deep centralized multi-agent actor-critic (MO-DCMAC), a multi-objective reinforcement learning method for infrastructural maintenance optimization, an area traditionally dominated by single-objective reinforcement learning (RL) approaches. Previous RL methods combine multiple objectives, such as probability of collapse and cost, into a singular reward signal through reward-shaping. In contrast, MO-DCMAC can optimize a policy for multiple objectives directly, even when the utility function is...
In this article, we propose new algorithms for multi-objective coordination graphs (MO-CoGs). Key to the efficiency of these algorithms is that they compute a convex coverage set (CCS) instead of a Pareto coverage set (PCS). Not only is the CCS a sufficient solution set for a large class of problems, it also has important characteristics that facilitate more efficient solutions. We propose two main algorithms for computing a CCS in MO-CoGs. Convex multi-objective variable elimination (CMOVE) computes the CCS by performing a series of agent eliminations, which can be seen as solving a series of local subproblems....
In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap. We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation...
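The pairwise-comparison idea can be sketched with a simpler preference model than the paper's Gaussian processes: under a Bradley-Terry/logistic model (an assumption here, not the paper's choice), the probability that a user prefers one option over another grows with the difference of their latent utilities, and those utilities can be fit from observed comparisons.

```python
import math

# Hedged sketch with a simpler model than the paper's Gaussian processes:
# a Bradley-Terry / logistic preference model over latent utilities.

def pref_prob(u_a, u_b):
    """P(user prefers a over b) grows with the utility difference."""
    return 1.0 / (1.0 + math.exp(-(u_a - u_b)))

def fit_utilities(n_items, comparisons, lr=0.1, steps=2000):
    """Maximum-likelihood fit; comparisons is a list of (winner, loser) pairs."""
    u = [0.0] * n_items
    for _ in range(steps):
        for w, l in comparisons:
            g = 1.0 - pref_prob(u[w], u[l])  # gradient of the log-likelihood
            u[w] += lr * g
            u[l] -= lr * g
    return u

# Item 0 beat item 1, and item 1 beat item 2.
u = fit_utilities(3, [(0, 1), (1, 2)])
```

In the decision-support setting, the "items" would be policies from the solution set, and the fitted utilities guide which comparison to elicit next.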
Many real-world decision problems require making trade-offs among multiple objectives. However, in some cases, the relative importance of these objectives is not known when the problem is solved, precluding the use of single-objective methods. Instead, multi-objective methods, which compute the set of all potentially useful solutions, are required. This paper proposes variable elimination linear support (VELS), a new algorithm for multi-objective multi-agent coordination that exploits loose couplings to compute the convex coverage set (CCS):...
In multi-objective multi-agent systems (MOMASs), agents explicitly consider the possible trade-offs between conflicting objective functions. We argue that compromises between competing objectives in a MOMAS should be analyzed on the basis of the utility that these compromises have for the users of the system, where an agent's utility function maps their payoff vectors to scalar utility values. This utility-based approach naturally leads to two different optimization criteria for MOMASs: expected scalarized returns (ESRs) and scalarized expected returns (SERs). In this article, we...
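The ESR/SER distinction described above is easy to make concrete: with a nonlinear utility, applying it inside the expectation (ESR) versus to the expected return (SER) can give very different answers. The utility u(x, y) = x·y and the two equiprobable outcomes below are hypothetical.

```python
# Concrete ESR-vs-SER toy: a hypothetical nonlinear utility u(x, y) = x * y
# over two objectives, and a policy with two equiprobable vector returns.

def utility(ret):
    x, y = ret
    return x * y

outcomes = [(4.0, 0.0), (0.0, 4.0)]  # equiprobable vector-valued returns

# ESR: expected scalarised return, E[u(R)] -- utility inside the expectation.
esr = sum(utility(r) for r in outcomes) / len(outcomes)

# SER: scalarised expected return, u(E[R]) -- utility of the mean return.
mean_return = tuple(sum(col) / len(outcomes) for col in zip(*outcomes))
ser = utility(mean_return)
```

Here every single episode yields zero utility (esr is 0.0), yet the average return (2, 2) looks excellent under the same utility (ser is 4.0), which is why the choice of criterion matters for MOMAS design.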