Andrey Kolobov

ORCID: 0000-0003-4966-7466
Research Areas
  • Reinforcement Learning in Robotics
  • AI-based Problem Solving and Planning
  • Bayesian Modeling and Causal Inference
  • Optimization and Search Problems
  • Domain Adaptation and Few-Shot Learning
  • Aerospace and Aviation Technology
  • Advanced Bandit Algorithms Research
  • Caching and Content Delivery
  • Auction Theory and Applications
  • Machine Learning and Algorithms
  • Aerospace Engineering and Energy Systems
  • Robotic Path Planning Algorithms
  • Multi-Agent Systems and Negotiation
  • Constraint Satisfaction and Optimization
  • Mobile Crowdsensing and Crowdsourcing
  • Adversarial Robustness in Machine Learning
  • Guidance and Control Systems
  • Multimodal Machine Learning Applications
  • Advanced Database Systems and Queries
  • Robot Manipulation and Learning
  • Topic Modeling
  • Anomaly Detection Techniques and Applications
  • Artificial Intelligence in Games
  • Machine Learning and Data Classification
  • Fault Detection and Control Systems

Microsoft (United States)
2014-2024

Ivanovo State Power University
2023

Microsoft Research (United Kingdom)
2013-2021

Australian National University
2018

University of Washington
2009-2012

Seattle University
2010-2012

University of California, Berkeley
2005

Markov Decision Processes (MDPs) are widely popular in Artificial Intelligence for modeling sequential decision-making scenarios with probabilistic dynamics. They are the framework of choice when designing an intelligent agent that needs to act for long periods of time in an environment where its actions could have uncertain outcomes. MDPs are actively researched in two related subareas of AI, probabilistic planning and reinforcement learning. Probabilistic planning assumes known models for the agent's goals and domain dynamics, and focuses on determining...

10.2200/s00426ed1v01y201206aim017 article EN Synthesis lectures on artificial intelligence and machine learning 2012-06-30

Real-time high-resolution wind predictions are beneficial for various applications, including safe crewed and uncrewed aviation. Current weather models require too much compute and lack the necessary predictive capabilities, as they are valid only at the scale of multiple kilometers and hours, much lower spatial and temporal resolutions than these applications require. Our work demonstrates the ability to predict low-altitude time-averaged wind fields in real time on limited-compute devices, from sparse measurement data. We train...

10.1038/s41467-024-47778-4 article EN cc-by Nature Communications 2024-04-25

The traditional way of obtaining models from data, inductive learning, has proved itself both in theory and in many practical applications. However, in domains where data is difficult or expensive to obtain, e.g., medicine, deep transfer learning is a more promising technique. It circumvents the model acquisition difficulties caused by scarce data in a target domain by carrying over structural properties of a model learned in a source domain where training data is ample. Nonetheless, the lack of a principled view of transfer learning has so far limited its adoption. In this paper,...

10.1609/aaai.v29i1.9624 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2015-02-21

In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly. In such settings, the agent needs to behave safely not only after but also while learning. To achieve this, existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations during exploration with high probability, but both the probabilistic guarantees and the smoothness assumptions inherent in the priors are not viable in many scenarios of interest such as autonomous driving. This paper...

10.48550/arxiv.2006.12136 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Only a small percentage of blind and low-vision people use traditional mobility aids such as a cane or a guide dog. Various assistive technologies have been proposed to address the limitations of traditional mobility aids. These devices often give either the user or the device the majority of the control. In this work, we explore how varying levels of control affect the users' sense of agency, trust in the device, confidence, and successful navigation. We present Glide, a novel mobility aid with two modes for control: Glide-directed and User-directed. We employ Glide in a study...

10.1145/3568162.3578630 preprint EN 2023-03-09

In contrast to previous competitions, where the problems were goal-based, the 2011 International Probabilistic Planning Competition (IPPC-2011) emphasized finite-horizon reward maximization problems with large branching factors. These MDPs modeled more realistic planning scenarios and presented challenges to the state-of-the-art planners (e.g., those from IPPC-2008), which were primarily based on domain determinization, a technique more suited to goal-oriented MDPs with small branching factors. Moreover, large branching factors render the existing implementations of...

10.1609/icaps.v22i1.13523 article EN Proceedings of the International Conference on Automated Planning and Scheduling 2012-05-14

A Web crawler is an essential part of a search engine that procures information subsequently served by the search engine to its users. As the Web is becoming increasingly dynamic, in addition to discovering new web pages a crawler needs to keep revisiting those already in the search engine's index, in order to keep the index fresh by picking up the pages' changed content. Determining how often to recrawl pages requires making tradeoffs based on the pages' relative importance and change rates, subject to multiple resource constraints - the limited daily budget of crawl requests on the search engine's end and politeness...

10.1145/3331184.3331241 article EN 2019-07-18

The results of the latest International Probabilistic Planning Competition (IPPC-2008) indicate that the presence of dead ends, states with no trajectory to the goal, makes MDPs hard for modern probabilistic planners. Implicit dead ends, states with executable actions but no path to the goal, are particularly challenging; existing MDP solvers spend much time and memory identifying these states. As a first attempt to address this issue, we propose a machine learning algorithm called SIXTHSENSE. SIXTHSENSE helps existing MDP solvers by finding nogoods, conjunctions...

10.1609/aaai.v24i1.7752 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2010-07-04

10.1016/j.artint.2012.05.002 article EN Artificial Intelligence 2012-05-15

Autonomous soaring capability has the potential to significantly increase time aloft for fixed-wing UAVs. In this paper, we introduce ArduSoar, the first soaring controller integrated into a major autopilot software suite for small UAVs. We describe ArduSoar from an algorithmic standpoint, outline its integration with the ArduPlane autopilot, discuss parameter tuning for it, and conduct a series of flight tests on real sUAVs that show ArduSoar's robustness even in highly nonideal atmospheric conditions.

10.1109/iros.2018.8593510 article EN 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018-10-01

We provide a framework for accelerating reinforcement learning (RL) algorithms by heuristics constructed from domain knowledge or offline data. Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the sequential decision-making task. Using our framework, we show how heuristic-guided RL induces a much shorter-horizon subproblem that provably solves the original task. Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction...

10.48550/arxiv.2106.02757 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Small uninhabited aerial vehicles (sUAVs) commonly rely on active propulsion to stay airborne, which limits flight time and range. To address this, autonomous soaring seeks to utilize free atmospheric energy in the form of updrafts (thermals). However, their irregular nature at low altitudes makes them hard to exploit for existing methods. We model autonomous thermalling as a POMDP and present a receding-horizon controller based on it. We implement it as part of ArduPlane, a popular open-source autopilot, and compare it to an alternative...

10.15607/rss.2018.xiv.068 article EN 2018-06-26

Allocating tasks to workers so as to get the greatest amount of high-quality output for as few resources as possible is an overarching theme in crowdsourcing research. Among the factors that complicate this problem is the lack of information about the available workers' skill, along with the unknown difficulty of the tasks to be solved. Moreover, if a crowdsourcing platform customer is limited to a fixed-size worker pool to complete a large batch of jobs such as identifying a particular object in a collection of images or comparing the quality of many pairs of artifacts in crowdsourcing workflows, she...

10.1609/hcomp.v1i1.13115 article EN Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2013-11-03

In many probabilistic planning scenarios, a system's behavior needs to not only maximize the expected utility but also obey certain restrictions. This paper presents Saturated Path-Constrained Markov Decision Processes (SPC MDPs), a new MDP type for planning under uncertainty with deterministic model-checking constraints, e.g., "state s must be visited before s'", "the system must end up in s", or "the system must never enter s". We present a mathematical analysis of SPC MDPs, showing that although SPC MDPs generally have no...

10.1609/aaai.v28i1.9041 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2014-06-21

We study the potential of using large language models (LLMs) as an interactive optimizer for solving maximization problems in a text space using natural language and numerical feedback. Inspired by the classical optimization literature, we classify the natural language feedback into directional and non-directional, where the former is a generalization of the first-order feedback to the natural language space. We find that LLMs are especially capable of optimization when they are provided with directional feedback. Based on this insight, we design a new LLM-based optimizer that synthesizes directional feedback from the historical optimization trace...

10.48550/arxiv.2405.16434 preprint EN arXiv (Cornell University) 2024-05-26