- Reinforcement Learning in Robotics
- Topic Modeling
- Natural Language Processing Techniques
- Machine Learning and Algorithms
- Speech and Dialogue Systems
- Adversarial Robustness in Machine Learning
- Advanced Bandit Algorithms Research
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Evolutionary Algorithms and Applications
- Machine Learning and Data Classification
- Model Reduction and Neural Networks
- Data Stream Mining Techniques
- Gaze Tracking and Assistive Technology
- Social Robot Interaction and HRI
- Bayesian Modeling and Causal Inference
- Speech Recognition and Synthesis
- Anomaly Detection Techniques and Applications
- Optimization and Search Problems
- EEG and Brain-Computer Interfaces
- Multi-Agent Systems and Negotiation
- Assistive Technology in Communication and Mobility
- Context-Aware Activity Recognition Systems
- Advanced Multi-Objective Optimization Algorithms
- Explainable Artificial Intelligence (XAI)
McGill University
2015-2024
Mila - Quebec Artificial Intelligence Institute
2019-2024
Alpha Omega Alpha Medical Honor Society
2022-2023
Menlo School
2022-2023
Meta (Israel)
2019-2022
Canadian Institute for Advanced Research
2017-2022
Polytechnique Montréal
2021
Centre Universitaire de Mila
2018-2020
Kord Technologies (United States)
2020
University of Liège
2019
This volume contains the papers accepted to the 24th International Conference on Machine Learning (ICML 2007), which was held at Oregon State University in Corvallis, Oregon, from June 20th to 24th, 2007. ICML is the annual conference of the International Machine Learning Society (IMLS), and provides a venue for the presentation and discussion of current research in the field of machine learning. These proceedings can also be found online at: http://www.machinelearning.org. This year there were 522 submissions to ICML. There was a very thorough review process, in which each paper...
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar...
This is an index to the papers that appear in the Proceedings of the 29th International Conference on Machine Learning (ICML-12). The conference was held in Edinburgh, Scotland, June 27th - July 3rd, 2012.
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL). Reproducing existing work and accurately judging the improvements offered by novel methods is vital to sustaining this progress. Unfortunately, reproducing results for state-of-the-art deep RL methods is seldom straightforward. In particular, non-determinism in standard benchmark environments, combined with variance intrinsic to the methods, can make reported results tough to interpret....
We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available. Recent works in response generation have adopted metrics from machine translation to compare a model's generated response to a single target response. We show that these metrics correlate very weakly with human judgements in the non-technical Twitter domain, and not at all in the technical Ubuntu domain. We provide quantitative and qualitative results highlighting specific weaknesses in existing metrics, and provide recommendations...
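To illustrate the kind of word-overlap metric borrowed from machine translation, the following is a minimal sketch of clipped unigram precision between a generated response and a single target response. It is a simplified stand-in for BLEU-style scoring, not the evaluation code used in the paper; the function name and the example sentences are hypothetical.

```python
from collections import Counter


def unigram_precision(generated: str, target: str) -> float:
    """Clipped unigram precision of `generated` against one `target` response."""
    gen_tokens = generated.lower().split()
    if not gen_tokens:
        return 0.0
    gen_counts = Counter(gen_tokens)
    tgt_counts = Counter(target.lower().split())
    # Each generated word is credited at most as often as it occurs in the target.
    matched = sum(min(count, tgt_counts[word]) for word, count in gen_counts.items())
    return matched / len(gen_tokens)


# Hypothetical example: a plausible reply that shares no words with the target.
target = "try reinstalling the package"
reply = "have you rebooted yet"
score = unigram_precision(reply, target)
```

Because the reply shares no words with the single target, its score is zero even though a human might consider it a sensible response, which is one way overlap metrics can diverge from human judgement.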
This degree work aims to explore the use of synthetic financial time series generated by a Generative Adversarial Neural Network (GAN) model to train a Deep Reinforcement Learning algorithm that executes buy and sell actions for a stock in the Standard & Poor's 500 index. For the implementation of the study, we used the CRISP methodology proposed by IBM, first understanding the business theory necessary to develop the models, then continuing with the exploration and knowledge of the available data that matched the objectives of the project. In this paper, the procedure...
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter. We also describe two neural learning...
Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate...
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during the execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an...
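As a rough illustration of the lookahead idea, the sketch below performs a depth-limited expectimax search over beliefs for a tiny discrete POMDP. It assumes a leaf value of 0 and does no pruning, whereas practical online algorithms typically rely on heuristic bounds; the two-state model, its numbers, and all function names are hypothetical and not taken from the paper.

```python
# Hypothetical 2-state, 2-action, 2-observation model (illustrative numbers only).
# T[a][s][s2] = P(s2 | s, a); Z[a][s2][o] = P(o | s2, a); R[a][s] = reward.
T = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
Z = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[0.0, 0.0], [1.0, -1.0]]  # action 0 "listens", action 1 "commits"


def belief_update(b, a, o, T, Z):
    """Bayes filter step; returns (posterior belief, probability of observing o)."""
    n = len(b)
    unnorm = [Z[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
              for s2 in range(n)]
    p_o = sum(unnorm)
    if p_o == 0.0:
        return list(b), 0.0
    return [x / p_o for x in unnorm], p_o


def q_value(b, a, depth, T, Z, R, gamma):
    """Expected immediate reward plus discounted value of the reachable beliefs."""
    immediate = sum(b[s] * R[a][s] for s in range(len(b)))
    future = 0.0
    for o in range(len(Z[a][0])):
        nb, p_o = belief_update(b, a, o, T, Z)
        if p_o > 0.0:
            future += p_o * lookahead_value(nb, depth - 1, T, Z, R, gamma)
    return immediate + gamma * future


def lookahead_value(b, depth, T, Z, R, gamma):
    """Depth-limited value of belief b; leaves are valued at 0 (an assumption)."""
    if depth == 0:
        return 0.0
    return max(q_value(b, a, depth, T, Z, R, gamma) for a in range(len(T)))


def best_action(b, depth, T, Z, R, gamma=0.95):
    """Action with the highest depth-limited Q-value at belief b."""
    return max(range(len(T)), key=lambda a: q_value(b, a, depth, T, Z, R, gamma))
```

In this toy model, a two-step lookahead from a uniform belief prefers the information-gathering action, because the observation it yields makes the later commitment more rewarding in expectation.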
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper...
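A value backup at a single belief point can be sketched as follows, using the standard alpha-vector representation of the value function. This is a minimal illustration under assumed conventions, not the paper's implementation; the tiny model, its numbers, and the function names are hypothetical.

```python
# Hypothetical 2-state, 2-action, 2-observation model (illustrative numbers only).
# T[a][s][s2] = P(s2 | s, a); Z[a][s2][o] = P(o | s2, a); R[a][s] = reward.
T = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
Z = [[[0.8, 0.2], [0.2, 0.8]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[0.0, 0.0], [1.0, -1.0]]


def dot(u, v):
    """Inner product of a belief and an alpha-vector."""
    return sum(x * y for x, y in zip(u, v))


def point_based_backup(b, alphas, T, Z, R, gamma):
    """Return the single new alpha-vector that is best at belief point b."""
    n = len(b)
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])
        for o in range(len(Z[a][0])):
            # Project every current alpha-vector back through (a, o).
            projections = [
                [sum(T[a][s][s2] * Z[a][s2][o] * alpha[s2] for s2 in range(n))
                 for s in range(n)]
                for alpha in alphas
            ]
            # Keep only the projection that is best at this belief point.
            best_proj = max(projections, key=lambda g: dot(b, g))
            vec = [v + gamma * g for v, g in zip(vec, best_proj)]
        val = dot(b, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


# One backup at a uniform belief, starting from two hypothetical alpha-vectors.
new_alpha = point_based_backup([0.5, 0.5], [[1.0, -1.0], [0.0, 0.0]], T, Z, R, 0.5)
```

Each backup produces one vector per belief point instead of enumerating every combination of successor vectors, which is the source of the speed-up and the reason the choice of belief points matters.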
Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017.
We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test...
Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research. We introduce a framework that makes this easier by providing a simple interface for tracking realtime energy consumption and carbon emissions, as well as generating standardized online appendices. Utilizing this framework, we create a leaderboard for energy efficient reinforcement learning algorithms to incentivize responsible research in this area as an example for other areas of machine learning. Finally, based on case studies using...