Rodrigo Toro Icarte

ORCID: 0000-0002-7734-099X
Research Areas
  • Reinforcement Learning in Robotics
  • Formal Methods in Verification
  • Topic Modeling
  • Data Stream Mining Techniques
  • Machine Learning and Algorithms
  • Receptor Mechanisms and Signaling
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Decision-Making and Behavioral Economics
  • Domain Adaptation and Few-Shot Learning
  • Smart Grid Energy Management
  • IoT and Edge/Fog Computing
  • Advanced Bandit Algorithms Research
  • Software Engineering Research
  • AI-based Problem Solving and Planning
  • Artificial Intelligence in Games
  • Machine Learning in Healthcare
  • Mobile Crowdsensing and Crowdsourcing
  • Adversarial Robustness in Machine Learning
  • Multi-Agent Systems and Negotiation
  • IoT Networks and Protocols
  • Smart Parking Systems Research
  • Evolutionary Algorithms and Applications
  • Advanced Software Engineering Methodologies
  • Simulation Techniques and Applications

Pontificia Universidad Católica de Chile
2022-2024

Vector Institute
2018-2022

University of Toronto
2017-2022

Samsung (United States)
2021

Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is the opportunity to make the reward function visible – to show the reward function’s code to the agent so that it can exploit the function’s internal structure to learn optimal policies in a more sample-efficient manner. In this paper, we show how to accomplish this idea in two steps. First, we propose reward machines, a type of...

10.1613/jair.1.12440 article EN cc-by Journal of Artificial Intelligence Research 2022-01-11
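
The core idea is easiest to see in miniature: a reward machine is a finite state machine whose transitions fire on high-level events and emit rewards, making reward structure visible to the agent. The following sketch is illustrative only; the class, event names, and example task are invented here, not taken from the paper's code.

```python
# Illustrative reward machine sketch: transitions map (state, event) pairs to
# (next_state, reward). All names below are hypothetical placeholders.
class RewardMachine:
    def __init__(self, transitions, initial_state=0):
        self.transitions = transitions          # (state, event) -> (next_state, reward)
        self.initial_state = initial_state
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state

    def step(self, event):
        """Advance on an observed event; unmatched events leave the state as-is."""
        self.state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0))
        return reward

# Task "get coffee, then deliver it to the office", rewarded only on completion:
rm = RewardMachine({(0, "coffee"): (1, 0.0), (1, "office"): (2, 1.0)})
print(rm.step("office"), rm.step("coffee"), rm.step("office"))  # 0.0 0.0 1.0
```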

In Reinforcement Learning (RL), an agent is guided by the rewards it receives from the reward function. Unfortunately, it may take many interactions with the environment to learn from sparse rewards, and it can be challenging to specify reward functions that reflect complex reward-worthy behavior. We propose using reward machines (RMs), which are automata-based representations that expose reward function structure, as a normal form representation for reward functions. We show how specifications of reward in various formal languages, including LTL and other...

10.24963/ijcai.2019/840 article EN 2019-07-28
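
One standard mechanism behind such translations is formula progression: a temporal formula is rewritten against the propositions observed at each step, and whatever remains to be satisfied plays the role of an automaton state. Below is a toy sketch for a small LTL fragment; the tuple encoding and the fragment choice are ours, for illustration, not the paper's construction.

```python
# Progress a tiny LTL fragment through one step of observed propositions.
def progress(formula, obs):
    op = formula[0]
    if op == "prop":                      # atomic proposition
        return ("true",) if formula[1] in obs else ("false",)
    if op == "eventually":                # F f  ==  f or X(F f)
        inner = progress(formula[1], obs)
        return ("true",) if inner == ("true",) else formula
    if op == "and":
        left, right = progress(formula[1], obs), progress(formula[2], obs)
        if ("false",) in (left, right):
            return ("false",)
        if left == ("true",):
            return right
        if right == ("true",):
            return left
        return ("and", left, right)
    return formula                        # "true"/"false" are fixpoints

# Task: eventually visit a, and eventually visit b.
task = ("and", ("eventually", ("prop", "a")), ("eventually", ("prop", "b")))
task = progress(task, {"a"})
print(task)  # ('eventually', ('prop', 'b')): only b remains to be achieved
```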

Reinforcement learning (RL) agents seek to maximize the cumulative reward obtained when interacting with their environment. Users define tasks or goals for RL agents by designing specialized reward functions such that their maximization aligns with task satisfaction. This work explores the use of high-level symbolic action models as a framework for defining final-state goal tasks and automatically producing their corresponding reward functions. We also show how automated planning can be used to synthesize plans that guide hierarchical RL (HRL)...

10.1609/icaps.v30i1.6750 article EN Proceedings of the International Conference on Automated Planning and Scheduling 2020-06-01
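
At its simplest, the link between a symbolic goal and a reward function fits in a few lines: a final-state goal, given as a set of facts, induces a reward function that pays off exactly when the goal holds. This is a hedged sketch of that basic idea only; the fact strings are invented for illustration.

```python
# A symbolic final-state goal (a set of facts) automatically induces a reward
# function. Facts and names here are illustrative placeholders.
def make_goal_reward(goal_facts):
    goal = frozenset(goal_facts)
    def reward_fn(state_facts):
        # Reward 1.0 exactly when every goal fact holds in the current state.
        return 1.0 if goal <= set(state_facts) else 0.0
    return reward_fn

reward = make_goal_reward({"holding(key)", "door(open)"})
print(reward({"holding(key)"}))                # 0.0: goal not yet satisfied
print(reward({"holding(key)", "door(open)"}))  # 1.0: final-state goal reached
```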

Due to the widespread use of mobile and IoT devices, coupled with their continually expanding processing capabilities, dew computing environments have become a significant focus for researchers. These environments enable resource-constrained devices to contribute computing power to a local network. One major challenge within these environments revolves around task scheduling: specifically, determining the optimal distribution of jobs across the devices available in the network. This becomes particularly pronounced in dynamic environments where network conditions constantly change....

10.3390/app14083206 article EN cc-by Applied Sciences 2024-04-11

The knowledge representation community has built general-purpose ontologies which contain large amounts of commonsense knowledge over relevant aspects of the world, including useful visual information, e.g.: "a ball is used by a football player", "a tennis player is located at a tennis court". Current state-of-the-art approaches for visual recognition do not exploit these rule-based knowledge sources. Instead, they learn recognition models directly from training examples. In this paper, we study how general-purpose ontologies—specifically, MIT's ConceptNet...

10.24963/ijcai.2017/178 article EN 2017-07-28
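
As a rough illustration of how such rule-based knowledge could interact with a learned recognizer (this is our own toy re-scoring scheme, not the method of the paper), commonsense rules of the form "label X is supported by context Y" can boost a classifier's confidence when that context is detected.

```python
# Toy re-scoring of classifier confidences with commonsense rules, in the
# spirit of "a ball is used by a football player". The rule format, the bonus
# value, and all names are hypothetical, not the paper's method.
def rescore(scores, detected_context, rules, bonus=0.3):
    # scores: label -> classifier confidence; rules: (label, context) pairs.
    adjusted = dict(scores)
    for label, context in rules:
        if context in detected_context and label in adjusted:
            adjusted[label] += bonus        # boost labels supported by context
    total = sum(adjusted.values())
    return {k: v / total for k, v in adjusted.items()}

scores = {"ball": 0.40, "orange": 0.60}
rules = [("ball", "football player")]
print(rescore(scores, {"football player"}, rules))  # "ball" edges ahead
```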

Due to mobile and IoT devices’ ubiquity and their ever-growing processing potential, dew computing environments have been an emerging topic for researchers. These environments allow resource-constrained devices to contribute computing power to others in a local network. One major challenge in these environments is task scheduling: that is, how to distribute jobs across the devices available in the network. In this paper, we propose distributing jobs using artificial intelligence (AI). Specifically, we show how an AI agent, known as Proximal Policy Optimization (PPO), can learn to distribute jobs in a simulated...

10.3390/app12147137 article EN cc-by Applied Sciences 2022-07-15
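
To make the setup concrete, here is a minimal sketch under stated assumptions: a toy gymnasium environment that assigns each arriving job to one of several devices, trained with the PPO implementation from stable-baselines3. The dynamics, reward, and all names are simplified stand-ins, not the paper's simulator.

```python
# Hypothetical dew-computing scheduling environment trained with PPO.
# Assumes gymnasium and stable-baselines3 are installed.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class DewSchedulingEnv(gym.Env):
    """Assign each incoming job to one of n_devices; reward favours low load."""
    def __init__(self, n_devices=4, episode_len=50):
        super().__init__()
        self.n_devices, self.episode_len = n_devices, episode_len
        self.action_space = spaces.Discrete(n_devices)   # which device gets the job
        self.observation_space = spaces.Box(0.0, np.inf, (n_devices,), np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.load = np.zeros(self.n_devices, dtype=np.float32)  # pending work
        self.t = 0
        return self.load.copy(), {}

    def step(self, action):
        job_cost = self.np_random.uniform(0.5, 2.0)      # simulated job size
        self.load[action] += job_cost
        reward = -float(self.load.max())                 # penalize the bottleneck
        self.load = np.maximum(self.load - 1.0, 0.0)     # one unit processed/step
        self.t += 1
        return self.load.copy(), reward, self.t >= self.episode_len, False, {}

model = PPO("MlpPolicy", DewSchedulingEnv(), verbose=0)
model.learn(total_timesteps=10_000)
```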

Pluralistic alignment is concerned with ensuring that an AI system's objectives and behaviors are in harmony with the diversity of human values and perspectives. In this paper we study the notion of pluralistic alignment in the context of agentic AI, and in particular the case of an agent trying to learn a policy in a manner that is mindful of the perspectives of others in the environment. To this end, we show how being considerate of the future wellbeing and agency of other (human) agents can promote a form of pluralistic alignment.

10.48550/arxiv.2411.10613 preprint EN arXiv (Cornell University) 2024-11-15

Sequence classification is the task of predicting a class label given a sequence of observations. In many applications, such as healthcare monitoring or intrusion detection, early classification is crucial to prompt intervention. In this work, we learn classifiers that favour early classification from an evolving observation trace. While state-of-the-art classifiers are neural networks, and in particular LSTMs, ours take the form of finite state automata learned via discrete optimization. Our automata-based classifiers are interpretable---supporting explanation,...

10.1609/aaai.v35i11.17161 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18
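
The automaton view of early classification admits a compact sketch: the machine consumes observations one at a time and commits to a label as soon as it reaches a decisive state, rather than waiting for the full trace. The class, the toy automaton, and its labels below are our own illustrations, not a learned model from the paper.

```python
# Automaton-based early classifier sketch: stop and emit a label as soon as
# a decisive state is reached. All names and the toy DFA are hypothetical.
class EarlyDFAClassifier:
    def __init__(self, transitions, state_labels, start=0):
        self.transitions = transitions      # (state, symbol) -> next state
        self.state_labels = state_labels    # state -> class label, or absent
        self.start = start

    def classify(self, trace):
        state = self.start
        for t, symbol in enumerate(trace):
            state = self.transitions.get((state, symbol), state)
            label = self.state_labels.get(state)
            if label is not None:           # decisive state: classify early
                return label, t + 1
        return None, len(trace)             # trace ended while undecided

# Toy automaton: emit "alert" once two consecutive "high" readings are seen.
dfa = EarlyDFAClassifier(
    transitions={(0, "high"): 1, (0, "low"): 0, (1, "high"): 2, (1, "low"): 0},
    state_labels={2: "alert"},
)
print(dfa.classify(["low", "high", "high", "low"]))  # ('alert', 3)
```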

Reward Machines provide an automata-inspired structure for specifying instructions, safety constraints, and other temporally extended reward-worthy behaviour. By exposing complex reward function structure, they enable counterfactual learning updates that have resulted in impressive sample efficiency gains. While Reward Machines have been employed in both tabular and deep RL settings, they have typically relied on a ground-truth interpretation of the domain-specific vocabulary that forms the building blocks of the reward function. Such interpretations...

10.48550/arxiv.2406.00120 preprint EN arXiv (Cornell University) 2024-05-31

Many real-world reinforcement learning (RL) problems necessitate learning complex, temporally extended behavior that may only receive a reward signal when the behavior is completed. If the reward-worthy behavior is known, it can be specified in terms of a non-Markovian reward function - a function that depends on aspects of the state-action history, rather than just the current state and action. Such reward functions yield sparse rewards, necessitating an inordinate number of experiences to find a policy that captures the reward-worthy pattern of behavior. Recent work has leveraged Knowledge...

10.48550/arxiv.2301.02952 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Human beings, even small children, quickly become adept at figuring out how to use applications on their mobile devices. Learning to use a new app is often achieved via trial-and-error, accelerated by the transfer of knowledge from past experiences with like apps. The prospect of building a smarter smartphone — one that can learn to achieve tasks using mobile apps — is tantalizing. In this paper we explore the use of Reinforcement Learning (RL) with the goal of advancing this aspiration. We introduce an RL-based framework for learning to accomplish tasks in mobile apps. RL...

10.21428/594757db.e57f0d1e article EN cc-by 2021-06-08

Deep reinforcement learning has shown promise in discrete domains requiring complex reasoning, including games such as Chess, Go, and Hanabi. However, this type of reasoning is less often observed in long-horizon, continuous domains with high-dimensional observations, where instead RL research has predominantly focused on problems with simple high-level structure (e.g., opening a drawer or moving a robot as fast as possible). Inspired by combinatorially hard optimization problems, we propose a set of robotics tasks which...

10.48550/arxiv.2206.01812 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Natural and formal languages provide an effective mechanism for humans to specify instructions and reward functions. We investigate how to generate policies via RL when reward functions are specified in a symbolic language captured by Reward Machines, an increasingly popular automaton-inspired structure. We are interested in the case where the mapping of environment state to a symbolic (here, Reward Machine) vocabulary -- commonly known as the labelling function -- is uncertain from the perspective of the agent. We formulate the problem of policy learning in Reward Machines with...

10.48550/arxiv.2211.10902 preprint EN cc-by arXiv (Cornell University) 2022-01-01
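
One natural way to handle an uncertain labelling function, sketched here under our own assumptions (the paper's exact formulation may differ), is to maintain a Bayesian belief over Reward Machine states and update it each step with the detector's posterior over abstract labels.

```python
import numpy as np

# Hedged sketch: a Bayesian filter over Reward Machine states when the
# labelling function is noisy. The transition table and numbers below are
# illustrative assumptions, not taken from the paper.
def belief_update(belief, delta, label_posterior):
    # belief[u]: probability that the RM is currently in state u
    # delta[u][l]: successor RM state of u under abstract label l
    # label_posterior[l]: P(true label is l | noisy detector output this step)
    new_belief = np.zeros_like(belief)
    for u, b in enumerate(belief):
        for label, p in label_posterior.items():
            new_belief[delta[u][label]] += b * p
    return new_belief

# Two-state RM: seeing "goal" moves state 0 to state 1; state 1 is absorbing.
delta = {0: {"goal": 1, "none": 0}, 1: {"goal": 1, "none": 1}}
belief = np.array([1.0, 0.0])
# The event detector is 80% sure it just saw "goal":
belief = belief_update(belief, delta, {"goal": 0.8, "none": 0.2})
print(belief)  # [0.2 0.8]
```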

We address the problem of teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments. Instructions are expressed in a well-known formal language -- linear temporal logic (LTL) -- and can specify a diversity of complex, temporally extended behaviours, including conditionals and alternative realizations. Our proposed approach exploits the compositional syntax and semantics of LTL, enabling our RL agent to learn task-conditioned policies that generalize to new instructions, not observed during...

10.48550/arxiv.2102.06858 preprint EN cc-by arXiv (Cornell University) 2021-01-01

The knowledge representation community has built general-purpose ontologies which contain large amounts of commonsense knowledge over relevant aspects of the world, including useful visual information, e.g.: "a ball is used by a football player", "a tennis player is located at a tennis court". Current state-of-the-art approaches for visual recognition do not exploit these rule-based knowledge sources. Instead, they learn recognition models directly from training examples. In this paper, we study how general-purpose ontologies---specifically, MIT's ConceptNet...

10.48550/arxiv.1705.08844 preprint EN other-oa arXiv (Cornell University) 2017-01-01

In Real-Time Heuristic Search (RTHS) we are given a search graph G and a heuristic, and the objective is to find a path from a start node to a goal node in G. As such, one does not impose any trajectory constraints on the path, besides reaching the goal. In this paper we consider a version of RTHS in which temporally extended goals can be defined over the form of the path. Such goals are specified in Linear Temporal Logic over Finite Traces (LTLf), an expressive language that has been considered in many other frameworks, such as Automated Planning, Synthesis,...

10.24963/ijcai.2022/663 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

Reinforcement Learning (RL) agents typically learn memoryless policies---policies that only consider the last observation when selecting actions. Learning memoryless policies is efficient and optimal in fully observable environments. However, some form of memory is necessary when RL agents are faced with partial observability. In this paper, we study a lightweight approach to tackle partial observability in RL. We provide the agent with an external memory and additional actions to control what, if anything, is written to memory. At every step, the current memory state is part...

10.48550/arxiv.2010.01753 preprint EN other-oa arXiv (Cornell University) 2020-01-01
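
The external-memory idea admits a very small sketch: a wrapper adds one writable memory bit to the observation and two extra actions that set it. The env interface (reset()/step() returning obs, reward, done) and all names below are our assumptions for illustration, not the paper's code.

```python
# Minimal memory-as-actions sketch: the agent observes (observation, memory bit)
# and gains two extra actions that write the bit. Interface names are assumed.
class ExternalMemoryWrapper:
    def __init__(self, env, n_base_actions):
        self.env = env
        self.n_base_actions = n_base_actions   # memory actions come after these
        self.memory = 0
        self._last_obs = None

    def reset(self):
        self.memory = 0
        self._last_obs = self.env.reset()
        return (self._last_obs, self.memory)   # memory is part of the observation

    def step(self, action):
        if action >= self.n_base_actions:      # actions n and n+1 write 0 or 1
            self.memory = action - self.n_base_actions
            return (self._last_obs, self.memory), 0.0, False
        obs, reward, done = self.env.step(action)   # ordinary environment action
        self._last_obs = obs
        return (obs, self.memory), reward, done
```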