- Advanced Neural Network Applications
- Autonomous Vehicle Technology and Safety
- Domain Adaptation and Few-Shot Learning
- Machine Learning and Algorithms
- Machine Learning and Data Classification
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Multimodal Machine Learning Applications
- Adversarial Robustness in Machine Learning
- Generative Adversarial Networks and Image Synthesis
- Advanced Image and Video Retrieval Techniques
- Simulation Techniques and Applications
- Advanced Image Processing Techniques
- AI in cancer detection
- Traffic Prediction and Management Techniques
- Transportation and Mobility Innovations
- Data Stream Mining Techniques
- Image Retrieval and Classification Techniques
- Robotics and Sensor-Based Localization
- Emotion and Mood Recognition
- Advanced Graph Neural Networks
- Real-time simulation and control systems
- Visual Attention and Saliency Detection
- Robotic Path Planning Algorithms
- Gaussian Processes and Bayesian Inference
TH Bingen University of Applied Sciences
2022-2024
University of Tübingen
2020-2022
Max Planck Institute for Intelligent Systems
2020-2022
Max Planck Society
2019-2021
Weatherford College
2021
Istituto Tecnico Industriale Alessandro Volta
2021
Nvidia (United States)
2021
Carnegie Mellon University
2018-2019
R.V. College of Engineering
2016
How should representations from complementary sensors be integrated for autonomous driving? Geometry-based sensor fusion has shown great promise for perception tasks such as object detection and motion forecasting. However, for the actual driving task, the global context of the 3D scene is key, e.g., a change in traffic light state can affect the behavior of a vehicle geometrically distant from that light. Geometry alone may therefore be insufficient for effectively fusing representations in end-to-end driving models. In this work, we demonstrate...
How should we integrate representations from complementary sensors for autonomous driving? Geometry-based fusion has shown promise for perception (e.g., object detection, motion forecasting). However, in the context of end-to-end driving, we find that imitation learning based on existing sensor fusion methods underperforms in complex driving scenarios with a high density of dynamic agents. Therefore, we propose TransFuser, a mechanism to integrate image and LiDAR representations using self-attention. Our approach uses transformer modules at...
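The abstract above describes fusing image and LiDAR representations with self-attention. A minimal sketch of token-level fusion across two modalities follows; the shapes, single attention head, and lack of residuals or normalization are simplifying assumptions for illustration, not the TransFuser architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fusion(img_tokens, lidar_tokens, Wq, Wk, Wv):
    """Fuse two modalities by running self-attention over their
    concatenated token sequences (single head, toy version)."""
    tokens = np.concatenate([img_tokens, lidar_tokens], axis=0)  # (N, d)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (N, N) pairwise affinities
    attn = softmax(scores, axis=-1)           # each row sums to 1
    return attn @ v                           # every token attends to both modalities

rng = np.random.default_rng(0)
d = 8
img = rng.normal(size=(4, d))   # 4 image-grid tokens (illustrative)
lid = rng.normal(size=(6, d))   # 6 LiDAR-grid tokens (illustrative)
W = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]
fused = self_attention_fusion(img, lid, *W)
print(fused.shape)  # (10, 8): one fused vector per input token
```

The key property is that the attention matrix spans the concatenated sequence, so image tokens can attend to LiDAR tokens and vice versa.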
Efficient reasoning about the semantic, spatial, and temporal structure of a scene is a crucial prerequisite for autonomous driving. We present NEural ATtention fields (NEAT), a novel representation that enables such reasoning for end-to-end imitation learning models. NEAT is a continuous function which maps locations in Bird's Eye View (BEV) scene coordinates to waypoints and semantics, using intermediate attention maps to iteratively compress high-dimensional 2D image features into a compact representation. This allows our model...
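The core idea above is a continuous function queried at BEV coordinates rather than a fixed grid. A toy sketch of such a coordinate-conditioned field follows; the small MLP, its sizes, and the 5-dimensional output (e.g., class logits plus a waypoint offset) are illustrative assumptions, and the attention-based iterative compression of image features is omitted:

```python
import numpy as np

def neural_field(xy, W1, b1, W2, b2):
    """Toy continuous function over BEV coordinates: maps each (x, y)
    location to an output vector via a small MLP, so it can be queried
    at arbitrary locations instead of fixed grid cells."""
    h = np.maximum(xy @ W1 + b1, 0.0)   # ReLU hidden layer
    return h @ W2 + b2

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 16)) * 0.5, np.zeros(16)
W2, b2 = rng.normal(size=(16, 5)) * 0.5, np.zeros(5)

# Query the field at arbitrary continuous BEV locations.
queries = np.array([[0.0, 1.5], [-3.2, 7.7], [10.0, 0.1]])
out = neural_field(queries, W1, b1, W2, b2)
print(out.shape)  # (3, 5): one output vector per queried location
```

Because the input is a coordinate, the representation's memory cost is decoupled from output resolution: denser predictions just mean more queries.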
The autonomous driving community has witnessed rapid growth in approaches that embrace an end-to-end algorithm framework, utilizing raw sensor input to generate vehicle motion plans instead of concentrating on individual tasks such as detection and motion prediction. End-to-end systems, in comparison to modular pipelines, benefit from joint feature optimization for perception and planning. This field has flourished due to the availability of large-scale datasets, closed-loop evaluation, and the increasing need for algorithms...
Deep Neural Networks trained in a fully supervised fashion are the dominant technology in perception-based autonomous driving systems. While collecting large amounts of unlabeled data is already a major undertaking, only a subset of it can be labeled by humans due to the effort needed for high-quality annotation. Therefore, finding the right data to label has become a key challenge. Active learning is a powerful technique to improve the data efficiency of supervised methods, as it aims at selecting the smallest possible training set to reach a required...
Generative Adversarial Networks (GANs) produce high-quality images but are challenging to train. They need careful regularization, vast amounts of compute, and expensive hyper-parameter sweeps. We make significant headway on these issues by projecting generated and real samples into a fixed, pretrained feature space. Motivated by the finding that the discriminator cannot fully exploit features from deeper layers of the pretrained model, we propose a more effective strategy that mixes features across channels and resolutions. Our Projected...
Data aggregation techniques can significantly improve vision-based policy learning within a training environment, e.g., learning to drive in a specific simulation condition. However, as on-policy data is sequentially sampled and added in an iterative manner, the policies can specialize and overfit to the training conditions. For real-world applications, it is useful for the learned policies to generalize to novel scenarios that differ from the training conditions. To improve generalization while maintaining robustness when training end-to-end driving policies, we perform an extensive analysis of data aggregation techniques in the CARLA environment. We...
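The iterative on-policy sampling described above follows the general shape of a DAgger-style loop: roll out the current policy, query an expert for labels on the visited states, aggregate, and refit. A self-contained toy version on a 1-D "stay near lane center" task follows; the environment, expert, and majority-vote "policy" are stand-ins invented for illustration, not the paper's setup:

```python
import random

def expert(state):
    """Toy expert: steer toward the lane center at 0."""
    return -1 if state > 0 else 1

def fit(dataset):
    """Toy 'policy fitting': majority action per sign of the state."""
    pos = [a for s, a in dataset if s > 0]
    neg = [a for s, a in dataset if s <= 0]
    def vote(acts, default):
        return max(set(acts), key=acts.count) if acts else default
    return {"pos": vote(pos, -1), "neg": vote(neg, 1)}

def rollout(policy, steps=20, seed=0):
    """Run the current policy and record the states it actually visits."""
    rng = random.Random(seed)
    state, visited = rng.uniform(-5, 5), []
    for _ in range(steps):
        visited.append(state)
        action = policy["pos"] if state > 0 else policy["neg"]
        state += action + rng.uniform(-0.2, 0.2)
    return visited

# Aggregation loop: label on-policy states with the expert, refit each round.
dataset = [(s, expert(s)) for s in (-4.0, -1.0, 2.0, 5.0)]  # initial demos
policy = fit(dataset)
for it in range(3):
    for s in rollout(policy, seed=it):
        dataset.append((s, expert(s)))   # aggregate expert labels on-policy data
    policy = fit(dataset)
print(len(dataset))  # 4 demos + 3 rounds x 20 states = 64
```

The overfitting concern in the abstract arises precisely because the aggregated states come from the policy's own visitation distribution in one training condition.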
Human drivers have a remarkable ability to drive in diverse visual conditions and situations, e.g., from maneuvering in rainy, limited-visibility conditions with no lane markings to turning in a busy intersection while yielding to pedestrians. In contrast, we find that state-of-the-art sensorimotor driving models struggle when encountering diverse settings with varying relationships between observation and action. To generalize when making decisions across diverse conditions, humans leverage multiple types of situation-specific reasoning...
End-to-end driving systems have recently made rapid progress, in particular on CARLA. Independent of their major contribution, these approaches often introduce changes to minor system components. Consequently, the source of the improvements is unclear. We identify two biases that recur in nearly all state-of-the-art methods and are critical for the observed progress on CARLA: (1) lateral recovery via a strong inductive bias towards target point following, and (2) longitudinal averaging of multimodal waypoint predictions for slowing...
The release of nuPlan marks a new era in vehicle motion planning research, offering the first large-scale real-world dataset and evaluation schemes requiring both precise short-term planning and long-horizon ego-forecasting. Existing systems struggle to simultaneously meet both requirements. Indeed, we find that these tasks are fundamentally misaligned and should be addressed independently. We further assess the current state of closed-loop planning in the field, revealing the limitations of learning-based methods in complex scenarios and the value...
Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene. While human drivers prioritize important objects and ignore details not relevant to the decision, learning-based planners typically extract features from dense, high-dimensional grid representations containing all vehicle and road context information. In this paper, we propose PlanT, a novel approach for planning in the context of self-driving that uses a standard transformer architecture. PlanT is based on...
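The contrast drawn above is between dense grid inputs and a sparse, object-level scene description. A minimal sketch of object-level tokenization follows; the specific feature layout (type flag, pose, extent) and the example values are illustrative assumptions, not PlanT's actual token format:

```python
def to_token(obj_type, x, y, yaw, extent):
    """Toy object-level tokenization: each vehicle or route segment
    becomes one fixed-size vector (type flag + pose + size) instead of
    contributing pixels to a dense rasterized grid."""
    return [float(obj_type), x, y, yaw, extent]

# Hypothetical scene: two nearby vehicles and two route segments,
# each given as (x, y, yaw, extent).
vehicles = [(1.2, 8.0, 0.1, 4.5), (-2.0, 15.0, 0.0, 4.2)]
route = [(0.0, 5.0, 0.0, 2.0), (0.0, 10.0, 0.05, 2.0)]

tokens = [to_token(0, *v) for v in vehicles] + [to_token(1, *r) for r in route]
print(len(tokens), len(tokens[0]))  # 4 tokens of dimension 5
```

A standard transformer can then operate on this short token sequence, which is far smaller than a rasterized BEV grid of the same scene.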
Annotating the right data for training deep neural networks is an important challenge. Active learning using uncertainty estimates from Bayesian Neural Networks (BNNs) could provide an effective solution to this. Despite being theoretically principled, BNNs require approximations to be applied to large-scale problems, where both performance and uncertainty estimation are crucial. In this paper, we introduce Deep Probabilistic Ensembles (DPEs), a scalable technique that uses a regularized ensemble to approximate a deep BNN.
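The reason an ensemble can stand in for a BNN is that averaging member predictions yields a predictive distribution whose entropy reflects disagreement. A minimal sketch follows; the logit values and the plain predictive entropy score are illustrative choices, not the DPE regularization scheme:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_uncertainty(logits):
    """Predictive entropy of the averaged ensemble prediction.
    logits: (n_members, n_samples, n_classes)."""
    probs = softmax(logits)       # per-member class probabilities
    mean = probs.mean(axis=0)     # ensemble predictive distribution
    return -(mean * np.log(mean + 1e-12)).sum(axis=-1)

# Five members, one sample, three classes.
agree = np.tile(np.array([[4.0, 0.0, 0.0]]), (5, 1, 1))   # all confident in class 0
disagree = np.array([[[4.0, 0, 0]], [[0, 4.0, 0]], [[0, 0, 4.0]],
                     [[4.0, 0, 0]], [[0, 4.0, 0]]])       # confident but split

h_agree = ensemble_uncertainty(agree)[0]
h_disagree = ensemble_uncertainty(disagree)[0]
print(h_agree < h_disagree)  # True: disagreement raises predictive entropy
```

For active learning, samples with the highest such entropy are the ones prioritized for annotation.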
It is well known that semantic segmentation can be used as an effective intermediate representation for learning driving policies. However, the task of street scene semantic segmentation requires expensive annotations. Furthermore, segmentation algorithms are often trained irrespective of the actual driving task, using auxiliary image-space loss functions which are not guaranteed to maximize driving metrics such as safety or distance traveled per intervention. In this work, we seek to quantify the impact of reducing segmentation annotation costs on learned behavior cloning...
Deep Neural Networks (DNNs) often rely on vast datasets for training. Given the large size of such datasets, it is conceivable that they contain specific samples that either do not contribute to or negatively impact the DNN's optimization. Modifying the training distribution to exclude such samples could provide an effective solution to improve performance and reduce training time. This paper proposes to scale up ensemble Active Learning (AL) methods to perform acquisition at a large scale (10k to 500k samples at a time). We do this with ensembles of hundreds of models,...
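Large-batch acquisition of the kind described above reduces to scoring every pool sample by ensemble disagreement and taking the top-k in one shot. A minimal sketch using vote entropy over hard predictions follows; the 4-member ensemble and tiny pool are illustrative assumptions:

```python
from collections import Counter
import math

def vote_entropy(votes):
    """Disagreement among ensemble members' hard predictions for one sample."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())

def acquire(pool_votes, k):
    """Batch acquisition: rank the unlabeled pool by ensemble
    disagreement and take the k most contested indices at once."""
    ranked = sorted(range(len(pool_votes)),
                    key=lambda i: vote_entropy(pool_votes[i]),
                    reverse=True)
    return ranked[:k]

# Each row: hard labels predicted by 4 ensemble members for one pool sample.
pool = [
    [0, 0, 0, 0],   # full agreement  -> lowest priority
    [0, 1, 0, 1],   # even split      -> highest priority
    [2, 2, 2, 1],   # mild disagreement
]
print(acquire(pool, 2))  # [1, 2]
```

At the scale quoted in the abstract, the same ranking is simply applied to hundreds of thousands of pool samples per acquisition round.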
Training deep networks for semantic segmentation requires annotation of large amounts of data, which can be time-consuming and expensive. Unfortunately, these trained networks still generalize poorly when tested in domains not consistent with the training data. In this paper, we show that by carefully presenting a mixture of labeled source domain data and proxy-labeled target domain data to a network, we can achieve state-of-the-art unsupervised domain adaptation results. With our design, the network progressively learns features specific...
Semantic segmentation with Convolutional Neural Networks is a memory-intensive task due to the high spatial resolution of feature maps and output predictions. In this paper, we present Quadtree Generating Networks (QGNs), a novel approach able to drastically reduce the memory footprint of modern semantic segmentation networks. The key idea is to use quadtrees to represent the predictions and target segmentation masks instead of dense pixel grids. Our quadtree representation enables hierarchical processing of an input image, with the most computationally demanding...
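The memory saving comes from the quadtree collapsing uniform regions into single leaves. A self-contained sketch of encoding a label mask as a quadtree follows (the 8x8 toy mask is an invented example; QGNs generate such trees with a network rather than by recursion over a given mask):

```python
def quadtree(mask, x0, y0, size):
    """Represent a square label mask as a quadtree: a region collapses
    to a single leaf when it contains one class, otherwise it splits
    into four quadrants."""
    vals = {mask[y][x] for y in range(y0, y0 + size) for x in range(x0, x0 + size)}
    if len(vals) == 1 or size == 1:
        return vals.pop()                       # leaf: uniform region
    h = size // 2
    return [quadtree(mask, x0, y0, h),          # top-left
            quadtree(mask, x0 + h, y0, h),      # top-right
            quadtree(mask, x0, y0 + h, h),      # bottom-left
            quadtree(mask, x0 + h, y0 + h, h)]  # bottom-right

def count_leaves(node):
    return sum(map(count_leaves, node)) if isinstance(node, list) else 1

# 8x8 mask: class 0 everywhere except a 2x2 patch of class 1.
mask = [[0] * 8 for _ in range(8)]
mask[0][0] = mask[0][1] = mask[1][0] = mask[1][1] = 1
tree = quadtree(mask, 0, 0, 8)
print(count_leaves(tree), "leaves vs", 8 * 8, "pixels")  # 7 leaves vs 64 pixels
```

On real street scenes, large uniform regions (road, sky) collapse the same way, which is what makes the representation far cheaper than a dense per-pixel grid.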
SLEDGE is the first generative simulator for vehicle motion planning trained on real-world driving logs. Its core component is a learned model that is able to generate agent bounding boxes and lane graphs. The model's outputs serve as an initial state for traffic simulation. The unique properties of the entities to be generated by SLEDGE, such as their connectivity and variable count per scene, render the naive application of most modern generative models to this task non-trivial. Therefore, together with a systematic study of existing graph...
We address the problem of semi-supervised domain adaptation of classification algorithms through deep Q-learning. The core idea is to consider the predictions of a source network on target data as noisy labels, and to learn a policy to sample from this data so as to maximize the accuracy of a small annotated reward partition of the target domain. Our experiments show that the learned sampling policies construct labeled sets that improve the accuracies of visual classifiers over baselines.
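At its simplest, the sampling-policy idea is a reinforcement learning update: actions choose which pool of noisy-labeled data to draw from, and the reward is the resulting accuracy gain on the held-out reward partition. A heavily simplified tabular sketch follows; the one-step bandit setting, the two pools, and the fixed reward values are all invented for illustration and bear no relation to the paper's deep Q-network:

```python
import random

def q_learning(rewards, episodes=200, alpha=0.1, eps=0.2, seed=0):
    """Tabular Q-learning for a one-step 'which pool to sample from'
    decision: each action is a data pool, the (noisy) reward is a toy
    proxy for the accuracy gain of labeling from that pool."""
    rng = random.Random(seed)
    q = [0.0] * len(rewards)
    for _ in range(episodes):
        if rng.random() < eps:                              # explore
            a = rng.randrange(len(q))
        else:                                               # exploit
            a = max(range(len(q)), key=q.__getitem__)
        r = rewards[a] + rng.uniform(-0.05, 0.05)           # noisy reward signal
        q[a] += alpha * (r - q[a])                          # one-step Q update
    return q

# Toy setup: pool 1's noisy labels help the classifier more than pool 0's.
q = q_learning(rewards=[0.2, 1.0])
best = max(range(2), key=q.__getitem__)   # greedy action under learned values
```

The paper's setting replaces the two fixed pools with per-sample actions and the toy reward with measured accuracy on the annotated reward partition.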
The general approach to facial expression recognition involves three stages: face acquisition, feature extraction, and recognition. A series of steps are used during feature extraction, and the robustness of a model depends on its ability to handle exceptions over all these steps. This paper details experiments conducted to classify images by using reduced regions of interest and discriminative salient patches of the face, while minimizing the number of landmarks required for their localization. The performance of various descriptors is analyzed, which...