- Machine Learning and Algorithms
- Robotics and Sensor-Based Localization
- Reinforcement Learning in Robotics
- Machine Learning and Data Classification
- Gaussian Processes and Bayesian Inference
- Robotic Path Planning Algorithms
- Adversarial Robustness in Machine Learning
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Anomaly Detection Techniques and Applications
- Multimodal Machine Learning Applications
- Quantum Computing Algorithms and Architecture
- Data Stream Mining Techniques
- Human Pose and Action Recognition
- Advanced Vision and Imaging
- Advanced Bandit Algorithms Research
- Neural Networks and Applications
- Quantum Information and Cryptography
- Face and Expression Recognition
- Emotion and Mood Recognition
- Formal Methods in Verification
- Advanced Neural Network Applications
- Explainable Artificial Intelligence (XAI)
- Bayesian Modeling and Causal Inference
- Topic Modeling
Microsoft (United States), 2014-2024
Harcourt Butler Technical University, 2024
Central Building Research Institute, 2024
Microsoft Research (United Kingdom), 2009-2023
Synopsys (United States), 2023
North Carolina State University, 2022
Seattle University, 2020
University of California, Berkeley, 2016
Mahindra Group (India), 2016
The University of Texas at Austin, 2010
We propose a multi-sensor affect recognition system and evaluate it on the challenging task of classifying interest (or disinterest) in children trying to solve an educational puzzle on the computer. The multimodal sensory information from facial expressions and postural shifts of the learner is combined with information about the learner's activity on the computer. We propose a unified approach, based on a mixture of Gaussian Processes, for achieving sensor fusion under the problematic conditions of missing channels and noisy labels. This approach generates separate class...
Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) are powerful regression techniques with explicit uncertainty models; we show here how covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic recognition. The model provided by GPs offers confidence estimates at test points, and naturally allows active learning...
We present AffectAura, an emotional prosthetic that allows users to reflect on their states over long periods of time. We designed a multimodal sensor set-up for continuous logging of audio, visual, physiological and contextual data, a classification scheme for predicting user affective state and an interface for user reflection. The system continuously predicts a user's valence, arousal and engagement, and correlates this with information on events, communications and data interactions. We evaluate the interface through a user study consisting of six users and over 240...
We present a challenging dataset, the TartanAir, for robot navigation tasks and more. The data is collected in photo-realistic simulation environments with the presence of moving objects, changing light and various weather conditions. By collecting data in simulations, we are able to obtain multi-modal sensor data and precise ground truth labels such as the stereo RGB image, depth image, segmentation, optical flow, camera poses, and LiDAR point cloud. We set up large numbers of environments with various styles and scenes, covering challenging viewpoints and diverse motion patterns...
This paper presents an experimental study regarding the use of OpenAI's ChatGPT [1] for robotics applications. We outline a strategy that combines design principles for prompt engineering and the creation of a high-level function library which allows ChatGPT to adapt to different robotics tasks, simulators, and form factors. We focus our evaluations on the effectiveness of different prompt engineering techniques and dialog strategies towards the execution of various types of robotics tasks. We explore ChatGPT's ability to use free-form dialog, parse XML tags, and to synthesize code, in addition...
Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling the atmospheric phenomenon at a fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by...
We survey applications of pretrained foundation models in robotics. Traditional deep learning models in robotics are trained on small datasets tailored for specific tasks, which limits their adaptability across diverse applications. In contrast, foundation models pretrained on internet-scale data appear to have superior generalization capabilities, and in some instances display an emergent ability to find zero-shot solutions to problems that are not present in the training data. Foundation models may hold the potential to enhance various components of the robot...
Machine learning is an increasingly used computational tool within human-computer interaction research. While most researchers currently utilize an iterative approach to refining classifier models and performance, we propose that ensemble classification techniques may be a viable and even preferable alternative. In ensemble learning, algorithms combine multiple classifiers to build one that is superior to its components. In this paper, we present EnsembleMatrix, an interactive visualization system that presents a graphical view of...
Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic recognition. Our formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal...
Web image search is difficult in part because a handful of keywords are generally insufficient for characterizing the visual properties of an image. Popular engines have begun to provide tags based on simple characteristics of images (such as tags for black and white images or images that contain a face), but such approaches are limited by the fact that it is unclear what tags end users want to be able to use in examining search results. This paper presents CueFlik, an application that allows end users to quickly create their own rules for re-ranking images based on their visual characteristics. End users can then...
Scarcity and infeasibility of human supervision for large scale multi-class classification problems necessitates active learning. Unfortunately, existing active learning methods are inherently binary and do not scale up to a large number of classes. In this paper, we introduce a probabilistic variant of the K-nearest neighbor method that can be seamlessly used for active learning in multi-class scenarios. Given some labeled training data, our method learns an accurate metric/kernel function over the input space that can be used for classification and similarity search. Unlike existing methods, our scheme is highly...
We present quantum algorithms for performing nearest-neighbor learning and $k$-means clustering. At the core of our algorithms are fast and coherent quantum methods for computing the Euclidean distance both directly and via the inner product, which we couple with methods for performing amplitude estimation that do not require measurement. We prove upper bounds on the number of queries to the input data required to compute such distances and find the nearest vector to a given test example. In the worst case, our quantum algorithms lead to polynomial reductions in query complexity relative to Monte Carlo...
Interest has been growing within HCI on the use of machine learning and reasoning in applications to classify such hidden states as user intentions, based on observations. HCI researchers with these interests typically have little expertise in machine learning and often employ toolkits as relatively fixed "black boxes" for generating statistical classifiers. However, attempts to tailor the performance of classifiers to specific application requirements may require a more sophisticated understanding and custom-tailoring of methods. We present...
Drones equipped with cameras are emerging as a powerful tool for large-scale aerial 3D scanning, but existing automatic flight planners do not exploit all available information about the scene, and can therefore produce inaccurate and incomplete models. We present an automatic method to generate drone trajectories, such that the imagery acquired during the flight will later produce a high-fidelity model. Our method uses a coarse estimate of the scene geometry to plan camera trajectories that: (1) cover the scene as thoroughly as possible; (2) encourage...
Safe control of dynamical systems that satisfy temporal invariants expressing various safety properties is a challenging problem that has drawn the attention of many researchers. However, making the assumption that such systems are deterministic is far from reality. For example, a robotic system might employ a camera sensor and a machine learned system to identify obstacles. Consequently, the safety properties the controller has to satisfy will be a function of the sensor data and the associated classifier. We propose a framework for achieving safe control. At the heart of our approach is a new...
We present quantum algorithms for performing nearest-neighbor learning and k-means clustering. At the core of our algorithms are fast and coherent quantum methods for computing the Euclidean distance both directly and via the inner product, which we couple with methods for performing amplitude estimation that do not require measurement. We prove upper bounds on the number of queries to the input data required to compute such distances and find the nearest vector to a given test example. In the worst case, our quantum algorithms lead to polynomial reductions in query complexity relative to Monte Carlo algorithms....
The ability to detect and classify rare occurrences in images has important applications - for example, counting rare and endangered species when studying biodiversity, or detecting infrequent traffic scenarios that pose a danger to self-driving cars. Few-shot learning is an open problem: current computer vision systems struggle to categorize objects they have seen only rarely during training, and collecting a sufficient number of training examples of rare events is often challenging and expensive, and sometimes outright...
Head nods and head shakes are non-verbal gestures used often to communicate intent, emotion and to perform conversational functions. We describe a vision-based system that detects head nods and head shakes in real time and can act as a useful and basic interface to a machine. We use an infrared sensitive camera equipped with infrared LEDs to track pupils. The directions of head movements, determined using the position of pupils, are used as observations by a discrete Hidden Markov Model (HMM) based pattern analyzer to detect when a head nod/shake occurs. The system is trained and tested on natural...
We provide a new fully automatic framework to analyze facial action units, the fundamental building blocks of facial expression enumerated in Paul Ekman's facial action coding system (FACS). The action units examined here include upper facial muscle movements such as inner eyebrow raise, eye widening, and so forth, which combine to form facial expressions. Although prior methods have obtained high recognition rates for recognizing facial action units, these methods either use manually preprocessed image sequences or require human specification of facial features; thus, they...
It is often desirable to evaluate image quality with a perceptually relevant measure that does not require a reference image. Recent approaches to this problem use human provided quality scores with machine learning to learn a measure. The biggest hurdles to these efforts are: 1) the difficulty of generalizing across diverse types of distortions and 2) collecting the enormity of human scored training data that is needed to learn the measure. We present a new blind image quality measure that addresses these difficulties by learning a robust, nonlinear kernel regression function using a rectifier neural...
Machine learning requires an effective combination of data, features, and algorithms. While many tools exist for working with machine learning data and algorithms, support for thinking of new features, or feature ideation, remains poor. In this paper, we investigate two general approaches to support feature ideation: visual summaries and sets of errors. We present FeatureInsight, an interactive visual analytics tool for building new dictionary features (semantically related groups of words) for text classification problems. FeatureInsight supports an error-driven...