- Reinforcement Learning in Robotics
- Topic Modeling
- Advanced Bandit Algorithms Research
- Adversarial Robustness in Machine Learning
- Recommender Systems and Techniques
- Experimental Behavioral Economics Studies
- Multi-Agent Systems and Negotiation
- Multimodal Machine Learning Applications
- Distributed Control Multi-Agent Systems
- Machine Learning and Data Classification
- Evolutionary Game Theory and Cooperation
- Data Stream Mining Techniques
- Artificial Intelligence in Games
- Adaptive Dynamic Programming Control
- Natural Language Processing Techniques
- Game Theory and Applications
- Domain Adaptation and Few-Shot Learning
- Auction Theory and Applications
- Sports Analytics and Performance
- Bayesian Modeling and Causal Inference
- Advanced Text Analysis Techniques
- Complex Network Analysis Techniques
- Decision-Making and Behavioral Economics
- Logic, Reasoning, and Knowledge
- Image and Video Quality Assessment
King's College London
2021-2025
University College London
2021-2022
University of Technology Sydney
2017-2020
Dongbei University of Finance and Economics
2017
Central University of Finance and Economics
2017
In this article, we study reinforcement learning (RL) for vehicle routing problems (VRPs). Recent works have shown that attention-based RL models outperform recurrent neural network-based methods on these problems in terms of both effectiveness and efficiency. However, existing RL models simply aggregate node embeddings to generate the context embedding without taking into account the dynamic network structures, making them incapable of modeling the state transition and action selection dynamics. In this work, we develop a new model...
Recommendation systems have become ubiquitous in online shopping in recent decades due to their power in reducing excessive choices for customers and industries. Recent collaborative filtering methods based on deep neural networks have been studied and show promising results in learning hidden representations for users and items. However, this approach has revealed its vulnerabilities under malicious user attacks. With knowledge of the algorithm and its parameters, the performance of this recommendation system can be easily downgraded....
As autonomous agents become more prevalent, understanding their collective behaviour in strategic interactions is crucial. This study investigates the emergent cooperative tendencies of systems of Large Language Model (LLM) agents in a social dilemma. Unlike previous research where LLMs output individual actions, we prompt state-of-the-art LLMs to generate complete strategies for the iterated Prisoner's Dilemma. Using evolutionary game theory, we simulate populations with different dispositions (aggressive,...
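To make the setting above concrete, here is a minimal sketch of an iterated Prisoner's Dilemma match between complete strategies, followed by one discrete replicator update over a population. Everything in it is an illustrative assumption rather than the study's setup: the payoff values (the conventional 5/3/1/0), the three hand-written strategies, and the names `play_match`, `payoff_matrix`, and `replicator_step` are all hypothetical.

```python
from typing import Callable, List, Tuple

# A strategy maps (own history, opponent history) to a move: "C" or "D".
Strategy = Callable[[List[str], List[str]], str]

# ASSUMPTION: conventional textbook payoffs, keyed by (my_move, their_move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(own: List[str], opp: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp else opp[-1]


def always_defect(own: List[str], opp: List[str]) -> str:
    return "D"


def always_cooperate(own: List[str], opp: List[str]) -> str:
    return "C"


def play_match(a: Strategy, b: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Play a fixed-length match and return the total payoff of each side."""
    hist_a: List[str] = []
    hist_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def payoff_matrix(strategies: List[Strategy], rounds: int = 10) -> List[List[int]]:
    """Entry [i][j] is the payoff strategy i earns against strategy j."""
    return [[play_match(s, t, rounds)[0] for t in strategies] for s in strategies]


def replicator_step(shares: List[float], matrix: List[List[int]]) -> List[float]:
    """One discrete replicator update: a share grows in proportion to its fitness."""
    fitness = [sum(p * m for p, m in zip(shares, row)) for row in matrix]
    mean_fitness = sum(s * f for s, f in zip(shares, fitness))
    return [s * f / mean_fitness for s, f in zip(shares, fitness)]
```

Under these assumed payoffs, a population split evenly between `tit_for_tat` and `always_defect` shifts toward `tit_for_tat` after one update, because the reciprocal strategy earns far more against its own kind than the defector earns against anyone.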
This paper introduces a novel method for estimating the self-interest level of computationally intractable Markov social dilemmas. We extend the concept from normal-form games to Markov games, providing a quantitative measure of the minimum reward exchange required to incentivize cooperation by aligning individual and collective interests. We demonstrate our method on three environments from the Melting Pot suite, which represent either common-pool resources or public goods. Our results show that the proposed method successfully identifies...
Reward engineering is one of the key challenges in Reinforcement Learning (RL). Preference-based RL effectively addresses this issue by learning from human feedback. However, it is both time-consuming and expensive to collect preference labels. In this paper, we propose a novel Vision-Language Preference framework, named VLP, which learns a vision-language model to provide feedback for embodied manipulation tasks. To achieve this, we define three types...
It is a long-standing question to discover causal relations among a set of variables in many empirical sciences. Recently, Reinforcement Learning (RL) has achieved promising results in causal discovery from observational data. However, searching the space of directed graphs and enforcing acyclicity by implicit penalties tend to be inefficient and restrict the existing RL-based method to small-scale problems. In this work, we propose a novel approach for causal discovery, incorporating RL into the ordering-based paradigm....
The primary challenge in the development of large-scale artificial intelligence (AI) systems lies in achieving scalable decision-making: extending AI models while maintaining sufficient performance. Existing research indicates that distributed AI can improve scalability by decomposing complex tasks and distributing them across collaborative nodes. However, previous technologies suffered from compromised real-world applicability due to the massive requirement of communication and sampled data. Here we develop...
Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment. Safe Robot RL (SRRL) is a crucial step toward achieving human-robot coexistence. In this paper, we envision a human-centered SRRL framework consisting of three stages: safe exploration, value alignment, and collaboration. We examine the research gaps in these areas and propose to leverage interactive behaviors for SRRL. Interactive behaviors enable bi-directional information...
Formulating recommender systems with reinforcement learning (RL) frameworks has attracted increasing attention from both academic and industry communities. While many promising results have been achieved, existing models mostly simulate the environment reward with a unified value, which may hinder the understanding of users' complex preferences and limit the model performance. In this paper, we consider how to model user multi-aspect preferences in the context of an RL-based recommender system. More specifically, we base our model on the framework of deterministic...
Recent studies have highlighted that deep neural networks (DNNs) are vulnerable to adversarial attacks, even in a black-box scenario. However, most of the existing attack algorithms need to make a huge amount of queries to perform attacks, which is not practical in the real world. We note that one of the main reasons for the massive queries is that the adversarial example is required to be visually similar to the original image, but in many cases, how adversarial examples look does not matter much. It inspires us to introduce a new attack called input-free attack, under which an adversary can choose an arbitrary...
Solving multi-goal reinforcement learning (RL) problems with sparse rewards is generally challenging. Existing approaches have utilized goal relabeling on collected experiences to alleviate issues raised from sparse rewards. However, these methods are still limited in efficiency and cannot make full use of experiences. In this paper, we propose Model-based Hindsight Experience Replay (MHER), which exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals....
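The goal relabeling that the abstract attributes to existing approaches can be sketched as follows. This shows only plain hindsight relabeling with the episode's final achieved goal, not MHER itself, which additionally relies on environmental dynamics to generate virtual goals. The `Transition` layout, the 0/-1 sparse reward, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Goal = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    state: Tuple[float, ...]
    action: int
    achieved_goal: Goal  # goal actually reached after taking the action
    desired_goal: Goal   # goal the agent was asked to reach
    reward: float


def sparse_reward(achieved: Goal, desired: Goal, tol: float = 1e-6) -> float:
    """ASSUMED reward convention: 0 if the goal is reached, -1 otherwise."""
    distance = max(abs(a - d) for a, d in zip(achieved, desired))
    return 0.0 if distance <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Pretend the goal reached at the end of the episode was the target all along.

    Every transition receives the final achieved goal as its desired goal and a
    recomputed reward, so a failed episode still yields a non-trivial signal.
    """
    if not episode:
        return []
    new_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            desired_goal=new_goal,
            reward=sparse_reward(t.achieved_goal, new_goal),
        )
        for t in episode
    ]
```

A replay buffer would typically store both the original and the relabeled transitions, so the agent learns from the goal it was given as well as the goal it happened to reach.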
Deep reinforcement learning provides a promising approach for text-based games in studying natural language communication between humans and artificial agents. However, generalization still remains a big challenge, as the agents depend critically on the complexity and variety of training tasks. In this paper, we address this problem by introducing a hierarchical framework built upon a knowledge graph-based RL agent. At the high level, a meta-policy is executed to decompose the whole game into a set of subtasks specified by textual...
Text-based games provide an interactive way to study natural language processing. While deep reinforcement learning (DRL) has shown effectiveness in developing the game playing agent, the low sample efficiency and the large action space remain the two major challenges that hinder DRL from being applied in the real world. In this paper, we address these challenges by introducing world-perceiving modules, which automatically decompose tasks and prune actions by answering questions about the environment. We then propose a two-phase training...
Safe reinforcement learning (RL) agents accomplish given tasks while adhering to specific constraints. Employing constraints expressed via easily-understandable human language offers considerable potential for real-world applications due to its accessibility and non-reliance on domain expertise. Previous safe RL methods with natural language constraints typically adopt a recurrent neural network, which leads to limited capabilities when dealing with various forms of input. Furthermore, these methods often require a ground-truth cost...
Graph-level clustering, which is essential in medical, biomedical, and social network data analysis, aims to group a set of graphs into various clusters. However, existing methods generally rely on a single clustering criterion, e.g., k-means, which limits their abilities to fully exploit the complex Euclidean and structural information inherent in graphs. To bridge this gap, we propose a dual contrastive graph-level clustering (DCGLC) method in this paper. DCGLC leverages graph contrastive learning and introduces the Euclidean-based...