- Reinforcement Learning in Robotics
- Formal Methods in Verification
- Topic Modeling
- Data Stream Mining Techniques
- Machine Learning and Algorithms
- Receptor Mechanisms and Signaling
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Decision-Making and Behavioral Economics
- Domain Adaptation and Few-Shot Learning
- Smart Grid Energy Management
- IoT and Edge/Fog Computing
- Advanced Bandit Algorithms Research
- Software Engineering Research
- AI-based Problem Solving and Planning
- Artificial Intelligence in Games
- Machine Learning in Healthcare
- Mobile Crowdsensing and Crowdsourcing
- Adversarial Robustness in Machine Learning
- Multi-Agent Systems and Negotiation
- IoT Networks and Protocols
- Smart Parking Systems Research
- Evolutionary Algorithms and Applications
- Advanced Software Engineering Methodologies
- Simulation Techniques and Applications
Pontificia Universidad Católica de Chile
2022-2024
Vector Institute
2018-2022
University of Toronto
2017-2022
Samsung (United States)
2021
Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is an opportunity to make the reward function visible -- to show the reward function's code to the RL agent so it can exploit the function's internal structure to learn optimal policies in a more sample-efficient manner. In this paper, we show how to accomplish this idea in two steps. First, we propose reward machines, a type of...
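The core idea above can be sketched in a few lines. The class below is a hypothetical, minimal API (not the authors' code): a reward machine is a finite-state machine whose transitions fire on propositional labels emitted by the environment, with each transition carrying a reward. The "coffee" task and state names `u0`/`u1`/`u2` are illustrative.

```python
class RewardMachine:
    def __init__(self, transitions, initial_state):
        # transitions: dict mapping (rm_state, label) -> (next_rm_state, reward)
        self.transitions = transitions
        self.state = initial_state

    def step(self, label):
        """Advance the machine on an observed label; unknown labels self-loop."""
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0)
        )
        self.state = next_state
        return reward

# Task: "get coffee, then deliver it to the office", rewarded on completion.
rm = RewardMachine(
    transitions={
        ("u0", "coffee"): ("u1", 0.0),   # picked up coffee
        ("u1", "office"): ("u2", 1.0),   # delivered: reward 1
    },
    initial_state="u0",
)

total = sum(rm.step(label) for label in ["office", "coffee", "office"])
```

Because the machine's structure is visible to the agent, an RL algorithm can, for example, run counterfactual updates for every machine state at once rather than treating the reward as an opaque signal.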
In Reinforcement Learning (RL), an agent is guided by the rewards it receives from the reward function. Unfortunately, it may take many interactions with the environment to learn from sparse rewards, and it can be challenging to specify reward functions that reflect complex reward-worthy behavior. We propose using reward machines (RMs), which are automata-based representations that expose reward function structure, as a normal form representation for reward functions. We show how specifications of reward in various formal languages, including LTL and other...
Reinforcement learning (RL) agents seek to maximize the cumulative reward obtained when interacting with their environment. Users define tasks or goals for RL agents by designing specialized reward functions such that their maximization aligns with task satisfaction. This work explores the use of high-level symbolic action models as a framework for defining final-state goal tasks and automatically producing their corresponding reward functions. We also show how automated planning can be used to synthesize high-level plans that guide hierarchical RL (HRL)...
Due to the widespread use of mobile and IoT devices, coupled with their continually expanding processing capabilities, dew computing environments have become a significant focus for researchers. These environments enable resource-constrained devices to contribute computing power to a local network. One major challenge within these environments revolves around task scheduling, specifically determining the optimal distribution of jobs across the devices available in the network. This challenge becomes particularly pronounced in dynamic environments where network conditions constantly change....
The knowledge representation community has built general-purpose ontologies which contain large amounts of commonsense knowledge over relevant aspects of the world, including useful visual information, e.g.: "a ball is used by a football player", "a tennis player is located at a tennis court". Current state-of-the-art approaches for visual recognition do not exploit these rule-based knowledge sources. Instead, they learn recognition models directly from training examples. In this paper, we study how general-purpose ontologies---specifically, MIT's ConceptNet...
Due to mobile and IoT devices' ubiquity and their ever-growing processing potential, Dew computing environments have been emerging topics for researchers. These environments allow resource-constrained devices to contribute computing power to others in a local network. One major challenge in these environments is task scheduling: that is, how to distribute jobs across the devices available in the network. In this paper, we propose addressing this challenge using artificial intelligence (AI). Specifically, we show how an AI agent, known as Proximal Policy Optimization (PPO), can learn in simulated...
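A PPO agent needs an environment interface to train against. The toy environment below is our illustrative sketch of such a scheduling problem in the familiar `reset`/`step` style (device speeds, job sizes, and the reward shape are all invented for the example, not taken from the paper): the action picks which device runs the next job, and the reward penalizes that device's resulting completion time.

```python
class DewSchedulingEnv:
    def __init__(self, device_speeds, jobs):
        self.device_speeds = device_speeds  # ops/sec for each device
        self.jobs = jobs                    # job sizes in ops, in arrival order
        self.reset()

    def reset(self):
        self.busy_until = [0.0] * len(self.device_speeds)
        self.t = 0
        return self._obs()

    def _obs(self):
        # next job's size plus how loaded each device already is
        return (self.jobs[self.t], tuple(self.busy_until))

    def step(self, device):
        run_time = self.jobs[self.t] / self.device_speeds[device]
        self.busy_until[device] += run_time
        reward = -self.busy_until[device]   # later finish => more negative
        self.t += 1
        done = self.t == len(self.jobs)
        return (None if done else self._obs()), reward, done

# Two devices (fast and slow), two identical jobs:
env = DewSchedulingEnv(device_speeds=[2.0, 1.0], jobs=[4.0, 4.0])
```

A policy trained on such an environment must learn to balance load against device speed, which is exactly the trade-off that makes greedy heuristics brittle when conditions change.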
Pluralistic alignment is concerned with ensuring that an AI system's objectives and behaviors are in harmony with the diversity of human values and perspectives. In this paper we study the notion of pluralistic alignment in the context of agentic AI, and in particular that of an agent trying to learn a policy in a manner that is mindful of the perspectives of others in the environment. To this end, we show how being considerate of the future wellbeing and agency of other (human) agents can promote a form of pluralistic alignment.
Sequence classification is the task of predicting a class label given a sequence of observations. In many applications, such as healthcare monitoring or intrusion detection, early classification is crucial to prompt intervention. In this work, we learn sequence classifiers that favour early classification from an evolving observation trace. While state-of-the-art sequence classifiers are neural networks, and in particular LSTMs, our classifiers take the form of finite state automata learned via discrete optimization. Our automata-based classifiers are interpretable---supporting explanation,...
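To make the automaton idea concrete, here is an illustrative sketch (a hand-built toy, not the learned automata from the paper): a deterministic finite automaton that consumes the trace one observation at a time and commits to a class label as soon as it reaches a decisive state, rather than waiting for the full sequence as a recurrent network typically would. The state names and the intrusion-detection flavour are our own.

```python
class EarlyDFAClassifier:
    def __init__(self, delta, start, label_of):
        self.delta = delta        # (state, symbol) -> next state
        self.start = start
        self.label_of = label_of  # state -> class label (decisive states only)

    def classify(self, trace):
        """Return (label, steps_consumed); commits at the first decisive state."""
        state = self.start
        for i, symbol in enumerate(trace, start=1):
            state = self.delta.get((state, symbol), state)
            label = self.label_of.get(state)
            if label is not None:
                return label, i   # early decision, before the trace ends
        return self.label_of.get(state, "unknown"), len(trace)

# Toy rule: two consecutive failed logins => classify the trace as "attack".
delta = {
    ("q0", "ok"): "q0",
    ("q0", "fail"): "q1",
    ("q1", "ok"): "q0",
    ("q1", "fail"): "q_attack",
}
clf = EarlyDFAClassifier(delta, "q0", {"q_attack": "attack"})
```

Because the decision path is a walk through named states, every classification comes with a human-readable trace of why it was made, which is the interpretability advantage the abstract refers to.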
Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure to the learner, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function. Such interpretations...
Many real-world reinforcement learning (RL) problems necessitate learning complex, temporally extended behavior that may only receive a reward signal when the behavior is completed. If the reward-worthy behavior is known, it can be specified in terms of a non-Markovian reward function - a function that depends on aspects of the state-action history, rather than just the current state and action. Such reward functions yield sparse rewards, necessitating an inordinate number of experiences to find a policy that captures the reward-worthy pattern of behavior. Recent work has leveraged Knowledge...
Human beings, even small children, quickly become adept at figuring out how to use applications on their mobile devices. Learning to use a new app is often achieved via trial-and-error, accelerated by transfer of knowledge from past experiences with like apps. The prospect of building a smarter smartphone -- one that can learn to achieve tasks using apps -- is tantalizing. In this paper we explore the use of Reinforcement Learning (RL) with the goal of advancing this aspiration. We introduce an RL-based framework for learning to accomplish tasks in apps. RL...
Deep reinforcement learning has shown promise in discrete domains requiring complex reasoning, including games such as Chess, Go, and Hanabi. However, this type of reasoning is less often observed in long-horizon, continuous domains with high-dimensional observations, where instead RL research has predominantly focused on problems with simple high-level structure (e.g. opening a drawer or moving a robot as fast as possible). Inspired by combinatorially hard optimization problems, we propose a set of robotics tasks which...
Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines, an increasingly popular automaton-inspired structure. We are interested in the case where the mapping of environment state to the symbolic (here, Reward Machine) vocabulary -- commonly known as the labelling function -- is uncertain from the perspective of the agent. We formulate the problem of policy learning with Reward Machines with...
We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language -- linear temporal logic (LTL) -- and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed approach exploits the compositional syntax and semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during...
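One standard way to exploit LTL's compositional syntax during learning is formula progression: after each environment step, the task formula is rewritten into the formula that remains to be satisfied, and that residual formula can condition the policy. The sketch below is our own illustrative implementation of progression over a small LTL fragment (formulas as nested tuples; the coffee/office task is invented), not the paper's code.

```python
def progress(f, sigma):
    """Progress formula f through one step whose true propositions are sigma."""
    if f in ("true", "false"):
        return f
    op = f[0]
    if op == "prop":
        return "true" if f[1] in sigma else "false"
    if op == "and":
        return _and(progress(f[1], sigma), progress(f[2], sigma))
    if op == "or":
        return _or(progress(f[1], sigma), progress(f[2], sigma))
    if op == "next":
        return f[1]
    if op == "eventually":            # F a  ==  a  or  X F a
        return _or(progress(f[1], sigma), f)
    if op == "until":                 # a U b  ==  b  or  (a and X(a U b))
        return _or(progress(f[2], sigma), _and(progress(f[1], sigma), f))
    raise ValueError(f"unknown operator: {op}")

def _and(a, b):
    if "false" in (a, b): return "false"
    if a == "true": return b
    if b == "true": return a
    return ("and", a, b)

def _or(a, b):
    if "true" in (a, b): return "true"
    if a == "false": return b
    if b == "false": return a
    return ("or", a, b)

# "Eventually reach the coffee machine, then eventually the office":
task = ("eventually", ("and", ("prop", "coffee"),
                       ("next", ("eventually", ("prop", "office")))))
```

Progressing `task` through a step where `coffee` holds, then one where `office` holds, reduces it to `"true"`; a step where neither holds leaves the task unchanged. The residual formula is exactly the "remaining instruction" a task-conditioned policy can be fed.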
In Real-Time Heuristic Search (RTHS) we are given a search graph G and a heuristic, and the objective is to find a path from a start node to a goal node in G. As such, one does not impose any trajectory constraints on the path, besides reaching the goal. In this paper we consider a version of RTHS in which temporally extended goals can be defined in the form of constraints over the path. Such constraints are specified in Linear Temporal Logic over Finite Traces (LTLf), an expressive language that has been considered in many other frameworks, such as Automated Planning, Synthesis,...
Reinforcement Learning (RL) agents typically learn memoryless policies---policies that only consider the last observation when selecting actions. Learning memoryless policies is efficient and optimal in fully observable environments. However, some form of memory is necessary when RL agents are faced with partial observability. In this paper, we study a lightweight approach to tackle partial observability in RL. We provide the agent with an external memory and additional actions to control what, if anything, is written to the memory. At every step, the current memory state is part...
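A toy version of this external-memory idea can be sketched as follows (our illustrative code under assumed details, not the paper's implementation): the agent gets a single memory bit and two extra actions, `"set"` and `"clear"`; the bit is appended to every observation, so a memoryless policy over the augmented observation space can act as if it remembered the past.

```python
class BinaryMemoryWrapper:
    def __init__(self, env_step, env_actions):
        self.env_step = env_step                 # action -> observation
        self.actions = list(env_actions) + ["set", "clear"]
        self.bit = 0                             # the external one-bit memory

    def step(self, action):
        if action == "set":
            self.bit = 1
            obs = None                           # memory actions don't move the env
        elif action == "clear":
            self.bit = 0
            obs = None
        else:
            obs = self.env_step(action)
        return (obs, self.bit)                   # memory-augmented observation

# Wrap a stub environment whose observations just echo the action taken:
wrapper = BinaryMemoryWrapper(lambda a: f"saw-{a}", ["left", "right"])
```

The design choice is that the agent itself learns *when* writing to memory is worthwhile, since memory actions cost a time step; scaling the bit to a small vector follows the same pattern.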