- Explainable Artificial Intelligence (XAI)
- Reinforcement Learning in Robotics
- Remote-Sensing Image Classification
- Topic Modeling
- Neural Networks and Applications
- Auction Theory and Applications
- Human-Automation Interaction and Safety
- Experimental Behavioral Economics Studies
- Natural Language Processing Techniques
- Remote Sensing in Agriculture
- Robot Manipulation and Learning
- Domain Adaptation and Few-Shot Learning
- Ethics and Social Impacts of AI
- Multimodal Machine Learning Applications
- Mobile Crowdsensing and Crowdsourcing
- Decision-Making and Behavioral Economics
- Semantic Web and Ontologies
- Digital Transformation in Industry
- Machine Learning and Algorithms
- Free Will and Agency
- Speech and Dialogue Systems
- Constraint Satisfaction and Optimization
- Cognitive Science and Mapping
- Robotics and Automated Systems
- Species Distribution and Climate Change
IIT@MIT, 2024
University of California, Berkeley, 2023
Massachusetts Institute of Technology, 2022
Microsoft (United States), 2020
Microsoft Research (United Kingdom), 2019-2020
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and related methods; (2) overview techniques to understand, improve, and complement RLHF in practice; and (3) propose auditing and disclosure...
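As a rough illustration of the pipeline's first stage, here is a minimal sketch of reward modeling from pairwise preferences under a Bradley-Terry likelihood; the feature vectors, data, and dimensions are toy stand-ins for a learned text encoder, not the systems surveyed in the paper.

```python
# Minimal sketch of the RLHF reward-modeling step: fit a linear reward
# r(x) = w.x so preferred responses score higher, assuming the standard
# Bradley-Terry likelihood. All names and data here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def fit_reward_model(chosen, rejected, lr=0.1, steps=500):
    """Gradient descent on the negative log-likelihood -log sigmoid(r_c - r_r)."""
    w = np.zeros(chosen.shape[1])
    for _ in range(steps):
        margin = (chosen - rejected) @ w           # r(chosen) - r(rejected)
        p = 1.0 / (1.0 + np.exp(margin))           # prob. the preference is violated
        grad = -(p[:, None] * (chosen - rejected)).mean(axis=0)
        w -= lr * grad
    return w

# Toy preference data: "chosen" responses score higher on the first feature.
chosen = rng.normal(0, 1, (256, 4)) + np.array([1.0, 0, 0, 0])
rejected = rng.normal(0, 1, (256, 4))
w = fit_reward_model(chosen, rejected)
print("learned reward weights:", w.round(2))
# In full RLHF, the policy (the LLM) would then be optimized against this
# learned reward, typically with PPO plus a KL penalty to the pretrained model.
```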
To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior. However, if we want robots to act for and with people, their representations must not be just functional but also reflective of what humans care about, i.e., they must be aligned. We observe that current learning approaches suffer from representation misalignment, where the robot's learned representation does not capture the human's representation. We suggest that because humans are the ultimate...
Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of another system? These questions pertaining to the study of representational alignment are at the heart of some of the most active research areas in cognitive science, neuroscience, and machine learning. For...
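As one concrete example of a measurement tool used in this literature, the sketch below computes linear centered kernel alignment (CKA) between two representations of the same stimuli; the random matrices are illustrative stand-ins for, say, model activations and neural recordings.

```python
# Linear CKA: a common similarity measure between two representations of the
# same n stimuli. A value of 1.0 means identical geometry up to rotation/scale.
import numpy as np

def linear_cka(X, Y):
    """CKA between X (n x d1) and Y (n x d2), both centered per feature."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))                    # system A's representation
R, _ = np.linalg.qr(rng.normal(size=(32, 32)))    # random rotation
print(linear_cka(X, X @ R))                       # ~1.0: same geometry, rotated
print(linear_cka(X, rng.normal(size=(100, 32))))  # near 0: unrelated systems
```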
In AI-assisted decision-making, effective hybrid (human-AI) teamwork depends not solely on AI performance, but also on the AI's impact on human decision-making. While prior work studies the effects of model accuracy on humans, we endeavour here to investigate the complex dynamics of how both a model's predictive performance and bias may transfer to humans in a recommendation-aided decision task. We consider the domain of ML-assisted hiring, where humans, operating in a constrained selection setting, can choose whether they wish to utilize...
Learning from demonstrations is a common way for users to teach robots, but it is prone to spurious feature correlations. Recent work constructs state abstractions, i.e. visual representations containing task-relevant features, from language as a way to perform more generalizable learning. However, these abstractions also depend on a user's preference for what matters in a task, which may be hard to describe or infeasible to exhaustively specify using language alone. How do we construct abstractions that capture these latent preferences? We observe that how...
Although systematic biases in decision-making are widely documented, the ways in which they emerge from different sources are less understood. We present a controlled experimental platform to study gender bias in hiring by decoupling the effect of the world distribution (the gender breakdown of candidates in a specific profession) from the effect of human decision-making. We explore the effectiveness of representation criteria, a fixed proportional display of candidates, as an intervention strategy for bias mitigation by conducting experiments measuring...
We propose incorporating human labelers in a model fine-tuning system that provides immediate user feedback. In our framework, humans can interactively query model predictions on unlabeled data, choose which data to label, and see the resulting effect on the model's predictions. This bi-directional feedback loop allows humans to learn how the model responds to new data. Our hypothesis is that this rich feedback helps users build mental models that enable them to better account for the biases they introduce into the model. We implement this framework for fine-tuning high-resolution land cover segmentation models and compare human-selected points to points selected using standard active learning methods...
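A minimal sketch of the loop described above, with a toy dataset and an uncertainty-sampling selection rule standing in for both the human's choices and the active learning baseline; nothing here is the paper's actual system.

```python
# Query-label-retrain loop: predictions are surfaced, a point is chosen for
# labeling, the model updates immediately, and the effect is shown back.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
labeled = list(range(10))
pool = [i for i in range(500) if i not in labeled]

model = LogisticRegression().fit(X[labeled], y[labeled])
for round_ in range(5):
    probs = model.predict_proba(X[pool])[:, 1]
    # Uncertainty-sampling stand-in: query the point the model is least sure about.
    pick = pool[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(pick)
    pool.remove(pick)
    model.fit(X[labeled], y[labeled])       # immediate retraining...
    acc = model.score(X, y)                 # ...whose effect the user would see
    print(f"round {round_}: accuracy after update = {acc:.3f}")
```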
We describe a framework for using natural language to design state abstractions for imitation learning. Generalizable policy learning in high-dimensional observation spaces is facilitated by well-designed representations, which can surface important features of an environment and hide irrelevant ones. These representations are typically manually specified, or derived from other labor-intensive labeling procedures. Our method, LGA (language-guided abstraction), uses a combination of natural language supervision...
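A hedged sketch of the underlying idea: given a task description, keep the features judged task-relevant and hide the rest. A real implementation would query a language model; the toy relevance lookup and feature names below are assumptions for illustration.

```python
# Language-guided abstraction, toy version: build an observation in which
# features irrelevant to the described task are masked out.
import numpy as np

def relevant(feature: str, task: str) -> bool:
    """Stand-in for an LM query like 'Is <feature> relevant to <task>?'"""
    keywords = {"carry the mug upright": {"mug_position", "mug_orientation",
                                          "arm_joint_angles"}}
    return feature in keywords.get(task, set())

def abstract_state(state: dict, task: str) -> np.ndarray:
    """Abstracted observation: irrelevant features are zeroed (hidden)."""
    return np.concatenate([np.asarray(v) if relevant(k, task) else np.zeros_like(v)
                           for k, v in state.items()])

state = {"mug_position": [0.2, 0.1, 0.3], "mug_orientation": [0.0, 0.0, 1.0],
         "table_color": [0.8, 0.2, 0.2], "arm_joint_angles": [0.1] * 7}
print(abstract_state(state, "carry the mug upright"))  # table_color masked
```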
Humans use social context to specify preferences over behaviors, i.e. their reward functions. Yet, algorithms for inferring reward models from preference data do not take this social learning view into account. Inspired by pragmatic human communication, we study how to extract fine-grained data regarding why an example is preferred that is useful for learning more accurate reward models. We propose to enrich binary preference queries to ask both (1) which features of a given example are preferable, in addition to (2) comparisons between examples themselves. We derive...
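To make the two query types concrete, the sketch below fits one linear reward model from both example-level and feature-level preference signals using Bradley-Terry-style gradient updates; the synthetic data and simulated annotator are illustrative, not the paper's derivation.

```python
# Enriched preference learning, toy version: both whole-example comparisons
# and per-feature judgments update the same linear reward w.
import numpy as np

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0, 0.5])      # hidden human reward (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def update(w, diff, lr=0.05):
    """One ascent step on log sigmoid(w . diff), i.e. 'diff is preferred'."""
    return w + lr * (1 - sigmoid(w @ diff)) * diff

w = np.zeros(3)
for _ in range(2000):
    a, b = rng.normal(size=3), rng.normal(size=3)
    # (1) example-level query: which example is preferred overall?
    diff = (a - b) if true_w @ a > true_w @ b else (b - a)
    w = update(w, diff)
    # (2) feature-level query: for one feature, which value is preferable?
    i = rng.integers(3)
    e = np.zeros(3)
    e[i] = np.sign(true_w[i]) * abs(a[i] - b[i])
    w = update(w, e)

print("recovered direction:", (w / np.linalg.norm(w)).round(2))
print("true direction:     ", (true_w / np.linalg.norm(true_w)).round(2))
```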
Computationally intensive decoding procedures, including search, reranking, and self-critique, can improve the quality of language model (LM) outputs in problems spanning code generation, numerical reasoning, and dialog. Existing work typically applies the same decoding procedure to every input to an LM. But not all inputs require the same amount of computation to process. Can we allocate computation adaptively, using more resources to answer questions whose answers will be harder to compute? We present an approach that predicts the distribution...
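The sketch below illustrates the budget-allocation intuition with a toy best-of-n solver: given the same total sample budget, routing more samples to inputs predicted to be hard can raise aggregate accuracy. The difficulty labels, success probabilities, and allocation rule are all assumptions.

```python
# Adaptive computation allocation, toy version: hard questions get more
# best-of-n decoding attempts than easy ones, at a fixed total budget.
import numpy as np

rng = np.random.default_rng(0)
P_SUCCESS = {"easy": 0.95, "hard": 0.25}     # per-attempt success probability

def best_of_n(kind, n):
    """Succeed if any of n independent decoding attempts succeeds."""
    return bool((rng.random(n) < P_SUCCESS[kind]).any())

questions = ["easy"] * 100 + ["hard"] * 100

# Uniform policy: every question gets 3 samples (600 total).
uniform = np.mean([best_of_n(q, 3) for q in questions])

# Adaptive policy: a difficulty predictor (assumed accurate here) routes the
# same 600-sample budget as 1 sample per easy question, 5 per hard question.
alloc = {"easy": 1, "hard": 5}
adaptive = np.mean([best_of_n(q, alloc[q]) for q in questions])

print(f"uniform accuracy:  {uniform:.2f}")
print(f"adaptive accuracy: {adaptive:.2f}")
```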
Many approaches to robot learning begin by inferring a reward function from a set of human demonstrations. To learn a good reward, it is necessary to determine which features of the environment are relevant before determining how these features should be used to compute reward. End-to-end methods for joint feature and reward learning (e.g., using deep networks or program synthesis techniques) often yield brittle reward functions that are sensitive to spurious state features. By contrast, humans can often generalizably learn from a small number of demonstrations...
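A minimal sketch of the staged alternative this abstract motivates: fix the feature map first, then fit only the weights that combine those features into a reward, here via a max-entropy (Boltzmann) demonstration likelihood. The trajectories and features are toy.

```python
# Staged reward learning: hand-chosen features, then a fitted linear reward.
import numpy as np

rng = np.random.default_rng(0)

def features(traj):
    """Task-relevant features chosen BEFORE reward fitting: path length and
    proximity to an obstacle at (0.5, 0.5)."""
    length = np.linalg.norm(np.diff(traj, axis=0), axis=1).sum()
    proximity = np.exp(-np.linalg.norm(traj - np.array([0.5, 0.5]), axis=1)).sum()
    return np.array([length, proximity])

true_w = np.array([-1.0, -2.0])          # humans dislike long, close paths

# Candidate trajectories; the "demonstration" is the best one under true_w.
trajs = [np.cumsum(rng.normal(0, 0.1, (20, 2)), axis=0) for _ in range(50)]
phis = np.array([features(t) for t in trajs])
demo = int(np.argmax(phis @ true_w))

# Gradient ascent on the Boltzmann likelihood p(demo) = softmax(phis @ w)[demo].
w = np.zeros(2)
for _ in range(500):
    p = np.exp(phis @ w - (phis @ w).max())
    p /= p.sum()
    w += 0.05 * (phis[demo] - p @ phis)  # grad of log-likelihood
print("recovered weight direction:", (w / np.linalg.norm(w)).round(2))
```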
We introduce Constrained Human-AI Cooperation (CHAIC), an inclusive embodied social intelligence challenge designed to test social perception and cooperation in embodied agents. In CHAIC, the goal is for an agent equipped with egocentric observations to assist a human who may be operating under physical constraints, e.g., unable to reach high places or confined to a wheelchair, in performing common household or outdoor tasks as efficiently as possible. To achieve this, a successful helper agent must: (1) infer the human's intents by...
Policies often fail due to distribution shift, i.e., changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences about how the task should be performed. We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts. Our key...
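A hedged sketch of the mechanism: once a user flags a concept as task-irrelevant (here, two "background" columns of a toy observation), training data is augmented by re-randomizing that concept, making the learned model invariant to it under shift. The dataset and the flagged concept are invented for illustration.

```python
# Personalized augmentation: randomize user-flagged irrelevant features so the
# model stops relying on them, then evaluate under a background shift.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400
task_signal = rng.normal(size=(n, 3))                     # features that matter
y = (task_signal[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)
background = rng.normal(size=(n, 2)) + 2.0 * y[:, None]   # spuriously tracks label
X = np.hstack([task_signal, background])

def augment_out(X, cols, k=4):
    """Replicate each row k times with the user-flagged columns re-randomized."""
    reps = np.repeat(X, k, axis=0)
    reps[:, cols] = rng.normal(size=(len(reps), len(cols)))
    return reps

plain = LogisticRegression().fit(X, y)
robust = LogisticRegression().fit(augment_out(X, [3, 4]), np.repeat(y, 4))

# Deployment: the background concept shifts and no longer tracks the label.
X_test = X.copy()
X_test[:, 3:] = rng.normal(size=(n, 2))
print("plain model under shift:    ", plain.score(X_test, y))
print("augmented model under shift:", robust.score(X_test, y))
```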
Fair decision-making in criminal justice relies on the recognition and incorporation of infinite shades of grey. In this paper, we detail how algorithmic risk assessment tools are counteractive to fair legal proceedings in social institutions where the desired states of the world are contested both ethically and practically. We provide a normative framework for assessing judicial decision-making, one that does not seek the elimination of human bias from decision-making, as fairness efforts currently focus on, but instead centers on sophisticating...
AI's rapid growth has been felt acutely by scholarly venues, leading to growing pains within the peer review process. These challenges largely center on the inability of specific subareas to identify and evaluate work that is appropriate according to criteria relevant to each subcommunity, as determined by the stakeholders of that subarea. We set forth a proposal that re-focuses efforts within these subcommunities through a decentralization of reviewing and publication. Through this re-centering effort, we hope to encourage each subarea to confront...
As robots are increasingly deployed in real-world scenarios, a key question is how to best transfer knowledge learned in one environment to another, where shifting constraints and human preferences render adaptation challenging. A central challenge remains that, often, it is difficult (perhaps even impossible) to capture the full complexity of the deployment environment, and therefore of the desired tasks, at training time. Consequently, the representation, or abstraction, of the tasks the human hopes for the robot to perform may be misaligned...
Neural networks often learn task-specific latent representations that fail to generalize to novel settings or tasks. Conversely, humans learn discrete representations (i.e., concepts or words) at a variety of abstraction levels (e.g., "bird" vs. "sparrow") and deploy the appropriate abstraction based on the task. Inspired by this, we train neural models to generate a spectrum of discrete representations, and control the complexity of the representations (roughly, how many bits are allocated for encoding inputs) by tuning the entropy of the distribution over representations. In finetuning...
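The sketch below illustrates the entropy knob with soft assignments of toy data to a discrete codebook: lowering the assignment temperature lowers the entropy of the code distribution, i.e. fewer bits are spent encoding inputs. The codebook and data are stand-ins for the trained models in the paper.

```python
# Entropy-controlled discrete encoding, toy version: soft assignments over K
# codebook vectors; temperature tunes how many bits the code distribution uses.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) + rng.choice([-3, 3], size=(200, 1))  # two clusters
codebook = rng.normal(size=(8, 2)) * 3                              # K = 8 codes

def assign(X, codebook, temperature):
    """Softmax over codes; low temperature -> low-entropy, coarse encoding."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    logits = -d2 / temperature
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

for temp in (0.1, 10.0):
    marginal = assign(X, codebook, temp).mean(axis=0)
    bits = -(marginal * np.log2(marginal + 1e-12)).sum()
    print(f"temperature {temp}: ~{bits:.2f} bits allocated across codes")
```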