- Reinforcement Learning in Robotics
- Explainable Artificial Intelligence (XAI)
- Adversarial Robustness in Machine Learning
- Machine Learning and Data Classification
- Robot Manipulation and Learning
- Data Stream Mining Techniques
- Topic Modeling
- Ethics and Social Impacts of AI
- AI-based Problem Solving and Planning
- Statistical and Computational Modeling
- Software Engineering Research
- Machine Learning in Healthcare
- Social Robot Interaction and HRI
- Information and Cyber Security
- Speech and Dialogue Systems
- Machine Learning and Algorithms
- Fuzzy Logic and Control Systems
- Bayesian Modeling and Causal Inference
- Evolutionary Algorithms and Applications
- Evolutionary Game Theory and Cooperation
- Neural Networks and Applications
- Anomaly Detection Techniques and Applications
- Botulinum Toxin and Related Neurological Disorders
- Artificial Intelligence in Healthcare and Education
- Temporomandibular Joint Disorders
Carnegie Mellon University
2019-2024
University of Maryland, Baltimore County
2019
University of Naples Federico II
2007
University of Milan
2007
Though deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples. As state-of-the-art reinforcement learning (RL) systems require exponentially increasing numbers of samples, their development is restricted to a continually shrinking segment of the AI community. Likewise, these systems cannot be applied to real-world problems, where environment samples are expensive. Resolution of these limitations requires new, sample-efficient methods. To facilitate research in this...
Explainable reinforcement learning (XRL) is an emerging subfield of explainable machine learning that has attracted considerable attention in recent years. The goal of XRL is to elucidate the decision-making process of learning agents in sequential settings. In this survey, we propose a novel taxonomy for organizing the XRL literature that prioritizes the RL setting. We overview techniques according to this taxonomy, point out gaps in the literature, and use these gaps to motivate and outline a roadmap for future work.
Unilateral posterior crossbite has been considered a risk factor for temporomandibular joint clicking, with conflicting findings. The aim of this study was to investigate the possible association between unilateral posterior crossbite and disk displacement with reduction, by means of a survey carried out in young adolescents recruited from three schools. The sample included 1291 participants (708 males and 583 females) with a mean age of 12.3 yrs (range, 10.1–16.1 yrs), who underwent an orthodontic and functional examination performed by two...
As domestic service robots become more common and widespread, they must be programmed to efficiently accomplish tasks while aligning their actions with relevant norms. The first step toward equipping robots with normative reasoning competence is understanding the norms that people apply to the behavior of robots in specific social contexts. To this end, we conducted an online survey with Chinese and United States participants in which we asked them to select the preferred action a robot should take in a number of scenarios. The paper makes multiple contributions...
We aim to understand how people assess human likeness in navigation behavior produced by human and artificially intelligent (AI) agents in a video game. To this end, we propose a novel AI agent with the goal of generating more human-like behavior. We collect hundreds of crowd-sourced assessments comparing the human-likeness of behavior generated by our agent and baseline agents with human-generated behavior. Our proposed agent passes a Turing Test, while the baselines do not. By passing, we mean that judges could not quantitatively distinguish between videos of a person and an AI agent navigating...
Current work in explainable reinforcement learning generally produces policies in the form of a decision tree over the state space. Such policies can be used for formal safety verification, agent behavior prediction, and manual inspection of important features. However, existing approaches fit a decision tree after training or use a custom learning procedure which is not compatible with new techniques, such as those based on neural networks. To address this limitation, we propose a novel Markov Decision Process (MDP) type for learning decision tree policies: Iterative...
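For concreteness, below is a minimal sketch of the post-hoc baseline this abstract contrasts with: fitting a decision tree to a trained policy after the fact by behavioral cloning. It is not the paper's method; `trained_policy`, the Gymnasium-style environment API, and all hyperparameters are assumptions for illustration.

```python
# Sketch: distill a trained (black-box) policy into a decision tree over the
# state space by cloning its actions on rollout data. Illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def distill_to_tree(trained_policy, env, n_steps=10_000, max_depth=6):
    """Collect (state, action) pairs from the trained policy, then fit a tree."""
    states, actions = [], []
    obs, _ = env.reset()
    for _ in range(n_steps):
        action = trained_policy(obs)              # query the black-box policy
        states.append(np.asarray(obs, dtype=float))
        actions.append(action)
        obs, _, terminated, truncated, _ = env.step(action)
        if terminated or truncated:
            obs, _ = env.reset()
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(np.stack(states), np.array(actions))
    return tree  # an interpretable surrogate policy over the state space
```

The resulting tree can then be inspected or verified, but, as the abstract notes, nothing constrains the original training procedure to produce behavior the tree can represent faithfully.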
To facilitate research in the direction of sample-efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019). The primary goal of this competition was to promote the development of algorithms that use human demonstrations alongside reinforcement learning to reduce the number of samples needed to solve complex, hierarchical, and sparse environments. We describe the competition, outlining...
We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction. We call this framework Planning with Abstract Learned Models (PALM). By representing subtasks symbolically using a new formal structure, the lifted abstract Markov decision process (L-AMDP), PALM learns models that are independent and modular. Through our experiments, we show how PALM integrates planning and execution,...
To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022. The challenge asked teams to compete to develop algorithms that solve tasks with hard-to-specify reward functions in Minecraft. Through this competition, we aimed to promote the development of algorithms that use human feedback as a channel to learn the desired behavior. We describe the competition and provide an overview of the top solutions. We conclude by discussing the impact and future...
Multi-agent reinforcement learning (MARL) methods often suffer from high sample complexity, limiting their use in real-world problems where data is sparse or expensive to collect. Although latent-variable world models have been employed to address this issue by generating abundant synthetic data for MARL training, most of these models cannot encode vital global information available during training into their latent states, which hampers learning efficiency. The few exceptions that incorporate such information assume centralized...
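As a rough illustration of the idea (not the paper's architecture; module names, sizes, and the conditioning scheme are assumptions), a latent-variable world model for centralized training can let its encoder see privileged global state so that the learned latent carries that information when generating synthetic rollouts:

```python
# Sketch of a latent-variable world model whose encoder conditions on global
# state available only during centralized training. Illustrative only.
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    def __init__(self, obs_dim, global_dim, act_dim, n_agents, latent_dim=32):
        super().__init__()
        joint_obs, joint_act = obs_dim * n_agents, act_dim * n_agents
        # Encoder sees joint observations plus privileged global state.
        self.encoder = nn.Sequential(
            nn.Linear(joint_obs + global_dim, 128), nn.ReLU(),
            nn.Linear(128, 2 * latent_dim))          # mean and log-variance
        # Transition and reward heads operate purely in latent space.
        self.transition = nn.Sequential(
            nn.Linear(latent_dim + joint_act, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))
        self.reward = nn.Linear(latent_dim + joint_act, 1)

    def forward(self, joint_obs, global_state, joint_act):
        mean, log_var = self.encoder(
            torch.cat([joint_obs, global_state], dim=-1)).chunk(2, dim=-1)
        z = mean + torch.randn_like(mean) * (0.5 * log_var).exp()  # reparameterize
        za = torch.cat([z, joint_act], dim=-1)
        return self.transition(za), self.reward(za), mean, log_var
```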
The last decade has seen a significant increase of interest in deep learning research, with many public successes that have demonstrated its potential. As such, these systems are now being incorporated into commercial products. With this comes an additional challenge: how can we build AI systems that solve tasks where there is not a crisp, well-defined specification? While multiple solutions have been proposed, this competition focuses on one in particular: learning from human feedback. Rather than training agents using a predefined...
Reinforcement learning competitions advance the field by providing appropriate scope and support to develop solutions toward a specific problem. To promote the development of more broadly applicable methods, organizers need to enforce the use of general techniques, the use of sample-efficient methods, and the reproducibility of results. While beneficial for the research community, these restrictions come at a cost -- increased difficulty. If the barrier to entry is too high, many potential participants are demoralized. With this in mind, we hosted...
The goal of this paper is to understand how people assess human-likeness in human- and AI-generated behavior. To this end, we present a qualitative study of hundreds of crowd-sourced assessments of behavior in a 3D video game navigation task. In particular, we focus on an AI agent that has passed a Turing Test, in the sense that human judges were not able, at a quantitative level, to reliably distinguish between videos of humans and of the agent navigating. Our insights shine a light on the characteristics people consider to be human-like. Understanding these is a key first step for...
Reinforcement learning (RL) has recently shown promise in predicting Alzheimer's disease (AD) progression due to its unique ability to model domain knowledge. However, it is not clear which RL algorithms are well-suited for this task. Furthermore, these methods are not inherently explainable, limiting their applicability in real-world clinical scenarios. Our work addresses these two important questions. Using a causal, interpretable model of AD, we first compare the performance of four contemporary RL algorithms in modeling brain cognition over...
Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis on the Video PreTraining (VPT) Minecraft playing agent, one of the largest open-source vision-based agents. We aim to illuminate its reasoning by applying various interpretability techniques. First, we analyze the attention mechanism while the agent solves its training task - crafting a...
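As a generic illustration of this first step (not the actual VPT analysis code, and with no assumption about VPT's internal module layout), attention weights from a transformer-based policy can be captured with forward hooks and inspected while the agent acts:

```python
# Sketch: record attention weights from any nn.MultiheadAttention submodule
# of a policy network for later visualization. Illustrative only.
import torch
import torch.nn as nn

attention_maps = []

def save_attention(module, inputs, output):
    # nn.MultiheadAttention returns (attn_output, attn_weights) when weights
    # are requested; guard against calls that disable them.
    if isinstance(output, tuple) and len(output) > 1 and output[1] is not None:
        attention_maps.append(output[1].detach().cpu())

def register_attention_hooks(policy: nn.Module):
    handles = []
    for _, module in policy.named_modules():
        if isinstance(module, nn.MultiheadAttention):
            handles.append(module.register_forward_hook(save_attention))
    return handles  # call h.remove() on each handle when done
```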
Recent advances in reinforcement learning (RL) have predominantly leveraged neural network-based policies for decision-making, yet these models often lack interpretability, posing challenges for stakeholder comprehension and trust. Concept bottleneck models offer an interpretable alternative by integrating human-understandable concepts into neural networks. However, a significant limitation of prior work is the assumption that human annotations for these concepts are readily available during training, necessitating continuous...
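For readers unfamiliar with concept bottlenecks, here is a minimal sketch of the general idea (not this paper's model; dimensions and names are assumptions): observations are first mapped to human-interpretable concept predictions, and the action is predicted from the concepts alone, so the concepts form the only path from input to decision.

```python
# Sketch of a concept bottleneck policy head. Illustrative only.
import torch
import torch.nn as nn

class ConceptBottleneckPolicy(nn.Module):
    def __init__(self, obs_dim, n_concepts, n_actions):
        super().__init__()
        self.obs_to_concepts = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_concepts))           # one logit per named concept
        self.concepts_to_action = nn.Linear(n_concepts, n_actions)

    def forward(self, obs):
        concept_logits = self.obs_to_concepts(obs)
        concepts = torch.sigmoid(concept_logits)  # interpretable bottleneck
        return self.concepts_to_action(concepts), concept_logits
```

Training such a model typically combines an action loss with a concept-supervision loss (e.g. cross-entropy on actions plus binary cross-entropy on concept labels), which is exactly where the annotation-availability assumption discussed above enters.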
To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm where competitors are evaluated directly on the performance of their learning procedures rather than on pre-trained agents. Since competition organizers re-train the proposed methods in a controlled setting, they can guarantee reproducibility and -- by retraining submissions using a held-out test set -- help ensure generalization past the environments on which the methods were trained.
The MineRL BASALT competition has served to catalyze advances in learning from human feedback through four hard-to-specify tasks in Minecraft, such as create and photograph a waterfall. Given the completion of two years of competitions, we offer to the community a formalized benchmark, the BASALT Evaluation and Demonstrations Dataset (BEDD), which serves as a resource for algorithm development and performance assessment. BEDD consists of a collection of 26 million image-action pairs from nearly 14,000 videos of players completing tasks in Minecraft. It...
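A typical use of such image-action demonstration data is behavioral cloning. The loader below is a hypothetical sketch only: the directory layout, file names, and action format are assumptions for illustration, not BEDD's actual schema.

```python
# Hypothetical sketch: stream (frame, action) pairs from per-video folders of
# PNG frames plus a JSON action log, e.g. for a behavioral-cloning update.
import json
from pathlib import Path
from PIL import Image

def iter_pairs(root: str):
    """Yield (frame, action) pairs, one video directory at a time."""
    for video_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        actions = json.loads((video_dir / "actions.json").read_text())
        for i, action in enumerate(actions):
            frame = Image.open(video_dir / f"frame_{i:06d}.png")
            yield frame, action

if __name__ == "__main__":
    for frame, action in iter_pairs("demonstrations"):
        pass  # feed each pair into an imitation-learning update
```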