- Human Pose and Action Recognition
- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Domain Adaptation and Few-Shot Learning
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Anomaly Detection Techniques and Applications
- Autonomous Vehicle Technology and Safety
- Robotics and Sensor-Based Localization
- Computer Graphics and Visualization Techniques
- Generative Adversarial Networks and Image Synthesis
- 3D Shape Modeling and Analysis
- Visual Attention and Saliency Detection
- Remote Sensing and LiDAR Applications
- Face Recognition and Analysis
- Video Analysis and Summarization
- Machine Learning and Data Classification
- Adversarial Robustness in Machine Learning
- Hand Gesture Recognition Systems
- Face and Expression Recognition
- 3D Surveying and Cultural Heritage
- COVID-19 Diagnosis Using AI
- Advanced Image Processing Techniques
- Image Retrieval and Classification Techniques
Carnegie Mellon University
2015-2024
Perrigo (United States)
2019-2020
University of California, Irvine
2009-2017
UC Irvine Health
2008-2014
University of California System
2013
Toyota Technological Institute at Chicago
2005-2007
University of California, Berkeley
2003-2005
University of Delaware
2002
We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL data sets. Our system relies on new methods for discriminative training with partially labeled data. We combine a margin-sensitive approach for data-mining hard negative examples with a formalism we call...
This paper describes a discriminatively trained, multiscale, deformable part model for object detection. Our system achieves a two-fold improvement in average precision over the best performance in the 2006 PASCAL person detection challenge. It also outperforms the best results in the 2007 challenge in ten out of twenty categories. The system relies heavily on deformable parts. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL challenge. Our system also relies on new methods for discriminative training. We combine...
We present a unified model for face detection, pose estimation, and landmark estimation in real-world, cluttered images. Our model is based on a mixture of trees with a shared pool of parts; we model every facial landmark as a part and use global mixtures to capture topological changes due to viewpoint. We show that tree-structured models are surprisingly effective at capturing global elastic deformation, while being easy to optimize unlike dense graph structures. We present extensive results on standard face benchmarks, as well as a new "in the wild" annotated dataset,...
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4 year old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd...
We present Argoverse, a dataset designed to support autonomous vehicle perception tasks including 3D tracking and motion forecasting. Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami, as well as 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps. The sensor data consists of 360 degree images from 7 cameras with overlapping fields of view, forward-facing stereo imagery, 3D point clouds from long range LiDAR, and 6-DOF pose. Our 290km of mapped lanes contain geometric...
We describe a method for human pose estimation in static images based on a novel representation of part models. Notably, we do not use articulated limb parts, but rather capture orientation with a mixture of templates for each part. We describe a general, flexible mixture model that captures contextual co-occurrence relations between parts, augmenting standard spring models that encode spatial relations. We show that such relations can capture notions of local rigidity. When co-occurrence and spatial relations are tree-structured, our model can be efficiently optimized with dynamic programming. We present...
We describe a method for articulated human detection and pose estimation in static images based on a new representation of deformable part models. Rather than modeling articulation using a family of warped (rotated and foreshortened) templates, we use a mixture of small, nonoriented parts. We describe a general, flexible mixture model that jointly captures spatial relations between part locations and co-occurrence relations between part mixtures, augmenting standard pictorial structure models that encode just spatial relations. Our models have several notable properties: 1) They...
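The tree-structured inference behind these part models can be illustrated with a minimal NumPy sketch, here over a chain of parts rather than a full tree, with a shared candidate grid and a quadratic spring penalty. The function name, scores, and spring weight are illustrative assumptions, not the published implementation:

```python
import numpy as np

def chain_pose_dp(unary, positions, spring=1.0):
    """Max-sum dynamic programming over a chain of parts.

    unary:     (K, L) local appearance scores for K parts at L candidate locations.
    positions: (L, 2) candidate (x, y) locations shared by all parts.
    spring:    weight of the quadratic deformation penalty between adjacent parts.
    Returns the best total score and the chosen location index per part.
    """
    K, L = unary.shape
    diff = positions[:, None, :] - positions[None, :, :]
    pair_cost = spring * (diff ** 2).sum(-1)       # (L, L) spring penalty

    score = unary[0].copy()                        # best score ending at part 0
    back = np.zeros((K, L), dtype=int)
    for k in range(1, K):
        # cand[j, i]: best score so far if part k-1 sits at i and part k at j.
        cand = score[None, :] - pair_cost
        back[k] = cand.argmax(axis=1)
        score = cand.max(axis=1) + unary[k]

    # Backtrack the argmax configuration.
    idx = [int(score.argmax())]
    for k in range(K - 1, 0, -1):
        idx.append(int(back[k][idx[-1]]))
    return float(score.max()), idx[::-1]
```

With the spring weight at zero each part independently picks its best location; increasing it trades appearance score for spatial coherence. A real tree just replaces the chain with a child-to-parent message-passing order.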
We analyze the computational problem of multi-object tracking in video sequences. We formulate the problem using a cost function that requires estimating the number of tracks, as well as their birth and death states. We show that the global solution can be obtained with a greedy algorithm that sequentially instantiates tracks using shortest path computations on a flow network. Greedy algorithms allow one to embed pre-processing steps, such as nonmax suppression, within the tracking algorithm. Furthermore, we give a near-optimal algorithm based on dynamic programming which...
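A toy version of the greedy instantiation idea, assuming 1-D detections and an invented motion penalty. The paper works on a flow network and reuses computation between rounds; this sketch simply re-runs a best-track dynamic program and suppresses used detections:

```python
import numpy as np

def best_track(frames, lam=1.0):
    """DP for the single best track: one detection per frame,
    score = sum of detection scores minus lam * |x motion|.
    frames: list of [(x, score), ...] per frame."""
    prev = np.array([s for x, s in frames[0]])
    back = []
    for t in range(1, len(frames)):
        bk, cur = [], []
        for x, s in frames[t]:
            cand = prev - lam * np.array([abs(x - px) for px, _ in frames[t - 1]])
            j = int(cand.argmax())
            bk.append(j)
            cur.append(cand[j] + s)
        back.append(bk)
        prev = np.array(cur)
    path = [int(prev.argmax())]
    for bk in reversed(back):
        path.append(bk[path[-1]])
    return float(prev.max()), path[::-1]

def greedy_tracks(frames, lam=1.0):
    """Sequentially instantiate tracks while they improve the objective,
    marking used detections after each round."""
    frames = [list(f) for f in frames]
    tracks = []
    while True:
        score, path = best_track(frames, lam)
        if score <= 0:
            break
        tracks.append(path)
        for t, j in enumerate(path):
            x, _ = frames[t][j]
            frames[t][j] = (x, -1e9)   # detection consumed by this track
    return tracks
```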
Though tremendous strides have been made in object recognition, one of the remaining open challenges is detecting small objects. We explore three aspects of the problem in the context of finding small faces: the role of scale invariance, image resolution, and contextual reasoning. While most recognition approaches aim to be scale-invariant, the cues for recognizing a 3px tall face are fundamentally different than those for recognizing a 300px tall face. We take a different approach and train separate detectors for different scales. To maintain efficiency, detectors are trained in a multi-task...
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15, 8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not apply effectively to surveillance videos. Our dataset consists of many outdoor scenes with actions occurring...
We present a novel dataset and novel algorithms for the problem of detecting activities of daily living (ADL) in first-person camera views. We have collected a dataset of 1 million frames of dozens of people performing unscripted, everyday activities. The dataset is annotated with activities, object tracks, hand positions, and interaction events. ADLs differ from typical actions in that they can involve long-scale temporal structure (making tea can take a few minutes) and complex object interactions (a fridge looks different when its door is open). We develop...
We explore 3D human pose estimation from a single RGB image. While many approaches try to directly predict 3D pose from image measurements, we explore a simple architecture that reasons through intermediate 2D pose predictions. Our approach is based on two key observations: (1) deep neural nets have revolutionized 2D pose estimation, producing accurate 2D predictions even for poses with self-occlusions; (2) big datasets of 3D mocap data are now readily available, making it tempting to "lift" predicted 2D poses to 3D through simple memorization (e.g., nearest...
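The nearest-neighbor "lifting" the abstract alludes to can be sketched as follows, under a deliberately crude orthographic camera and translation-only alignment. The paper additionally searches over camera parameters; names and shapes here are illustrative:

```python
import numpy as np

def lift_2d_to_3d(pred_2d, mocap_3d):
    """Nearest-neighbor lifting: return the library 3D pose whose
    orthographic projection best matches the predicted 2D keypoints.

    pred_2d:  (J, 2) predicted 2D joint positions.
    mocap_3d: (N, J, 3) library of exemplar 3D poses.
    Poses are centered before matching, so the match is translation-invariant.
    """
    p = pred_2d - pred_2d.mean(0)
    proj = mocap_3d[..., :2]                          # orthographic projection: drop z
    proj = proj - proj.mean(1, keepdims=True)         # center each exemplar
    err = np.linalg.norm(proj - p, axis=-1).mean(1)   # mean 2D joint error per exemplar
    return mocap_3d[int(err.argmin())]
```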
A commonly observed failure mode of Neural Radiance Fields (NeRF) is fitting incorrect geometries when given an insufficient number of input views. One potential reason is that standard volumetric rendering does not enforce the constraint that most of a scene's geometry consists of empty space and opaque surfaces. We formalize the above assumption through DS-NeRF (Depth-supervised Neural Radiance Fields), a loss for learning radiance fields that takes advantage of readily-available depth supervision. We leverage the fact that current NeRF pipelines...
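A simplified stand-in for depth supervision on a single ray, assuming the standard volumetric rendering weights and penalizing the squared error between the expected termination depth and a sensed depth. DS-NeRF's actual loss is a KL term on the ray termination distribution; this is only a hedged approximation with illustrative names:

```python
import numpy as np

def render_weights(sigmas, deltas):
    """Standard volumetric rendering weights w_i = T_i * (1 - exp(-sigma_i * delta_i)),
    where T_i is the accumulated transmittance up to sample i."""
    alpha = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha

def depth_loss(sigmas, deltas, z_vals, z_gt):
    """Squared error between the expected ray termination depth and a
    sensed depth z_gt (simplified stand-in for DS-NeRF's ray loss)."""
    w = render_weights(sigmas, deltas)
    z_hat = (w * z_vals).sum() / max(w.sum(), 1e-8)   # expected depth along the ray
    return (z_hat - z_gt) ** 2
```

Minimizing this pushes the density to concentrate near the supervised depth, which is the constraint the abstract describes: mostly empty space terminated by an opaque surface.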
In this work, we introduce a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of the video. We do so by integrating state-of-the-art two-stream networks [42] with learnable spatio-temporal feature aggregation [6]. The resulting architecture is end-to-end trainable for whole-video classification. We investigate different strategies for pooling across space and time and for combining signals from the different streams. We find that: (i) it is important to pool jointly...
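The aggregation step can be sketched as soft-assignment VLAD pooling over a bag of local features, a plain NumPy version without the end-to-end learned parameters; `alpha` and the anchor points are illustrative assumptions:

```python
import numpy as np

def vlad_pool(feats, centers, alpha=10.0):
    """Soft-assignment VLAD pooling: softly assign each local feature to
    anchor points, accumulate residuals per anchor, and L2-normalize.

    feats:   (N, D) local features pooled from all frames/locations.
    centers: (K, D) anchor points.
    Returns a (K*D,) video-level descriptor.
    """
    d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # (N, K) sq. distances
    a = np.exp(-alpha * d2)
    a = a / a.sum(1, keepdims=True)                   # soft assignments
    resid = feats[:, None, :] - centers[None, :, :]   # (N, K, D) residuals
    v = (a[..., None] * resid).sum(0)                 # (K, D) aggregated residuals
    v = v / (np.linalg.norm(v) + 1e-12)               # global L2 normalization
    return v.ravel()
```

In the actual architecture the assignments and anchors are learned layers, so gradients flow through the pooling into the two-stream backbone.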
In this paper, we propose the first higher frame rate video dataset (called Need for Speed - NfS) and benchmark for visual object tracking. The dataset consists of 100 videos (380K frames) captured with now commonly available higher frame rate (240 FPS) cameras from real world scenarios. All frames are annotated with axis aligned bounding boxes, and all sequences are manually labelled with nine visual attributes, such as occlusion, fast motion, background clutter, etc. Our benchmark provides an extensive evaluation of many recent state-of-the-art trackers on...
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to note that the human visual cortex generally contains more feedback than feedforward connections. In this paper, we will briefly introduce the background of feedbacks in the human visual cortex, which motivates us to develop a computational feedback mechanism in deep neural networks. In addition to the feedforward inference in traditional neural networks, a feedback loop is introduced to infer the activation status of hidden layer neurons according to the "goal" of the network, e.g., high-level...
We address the problem of long-term object tracking, where the object may become occluded or leave the field of view. In this setting, we show that an accurate appearance model is considerably more effective than a strong motion model. We develop simple but effective algorithms that alternate between tracking and learning a good appearance model given a track. We show that it is crucial to learn from the "right" frames, and use the formalism of self-paced curriculum learning to automatically select such frames. We leverage techniques from object detection for learning accurate appearance-based templates, demonstrating...
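A toy rendition of the self-paced selection loop, with a mean template standing in for the discriminative detector the paper actually learns. All names and the fixed keep fraction are illustrative assumptions:

```python
import numpy as np

def self_paced_template(patches, init_idx=0, rounds=3, keep_frac=0.5):
    """Self-paced template learning sketch: start from one trusted frame,
    then alternately (i) score all frames against the current template and
    (ii) retrain on only the easiest (highest-scoring) fraction of frames.

    patches: (T, D) one vectorized image patch per frame.
    Returns the learned template and the sorted indices of selected frames.
    """
    template = patches[init_idx].astype(float)
    chosen = [init_idx]
    for _ in range(rounds):
        scores = -np.linalg.norm(patches - template, axis=1)  # easy = close to template
        k = max(1, int(keep_frac * len(patches)))
        chosen = list(np.argsort(scores)[-k:])                # keep the easiest frames
        template = patches[chosen].mean(0)                    # re-learn from them
    return template, sorted(int(i) for i in chosen)
```

The point of the curriculum is visible even in this toy: outlier frames (occlusions, drift) never enter the training set, so they cannot corrupt the appearance model.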
Few-shot learning, i.e., learning novel concepts from very few examples, is fundamental to practical visual recognition systems. While most existing work has focused on few-shot classification, we make a step towards few-shot object detection, a more challenging yet under-explored task. We develop a conceptually simple but powerful meta-learning based framework that simultaneously tackles few-shot classification and few-shot localization in a unified, coherent way. This framework leverages meta-level knowledge about "model parameter...
Many state-of-the-art approaches for object recognition reduce the problem to a 0-1 classification task. Such reductions allow one to leverage sophisticated classifiers for learning. These models are typically trained independently for each class using positive and negative examples cropped from images. At test-time, various post-processing heuristics such as non-maxima suppression (NMS) are required to reconcile multiple detections within and between different classes for each image. Though crucial to good performance on...
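The NMS heuristic mentioned above, in its standard greedy form (a generic sketch, not this paper's pipeline; the paper argues for replacing such heuristics with a learned layout model):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maxima suppression: repeatedly keep the highest-scoring
    box and drop remaining boxes whose IoU with it exceeds iou_thresh.

    boxes: (N, 4) array of (x1, y1, x2, y2). Returns kept indices.
    """
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thresh]       # suppress heavy overlaps
    return keep
```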