NFDI4DS | UHH-SEMS - Publication Details

Raviteja Vemulapalli

ORCID: 0000-0003-0425-7797

About

Contact & Profiles

OpenAlex author ID: A5071825172

Research Areas

Advanced Neural Network Applications
Domain Adaptation and Few-Shot Learning
Advanced Image and Video Retrieval Techniques
Advanced Vision and Imaging
Multimodal Machine Learning Applications
Human Pose and Action Recognition
Face and Expression Recognition
Face recognition and analysis
Advanced Image Processing Techniques
Natural Language Processing Techniques
Hand Gesture Recognition Systems
Topic Modeling
Emotion and Mood Recognition
Image Retrieval and Classification Techniques
Gaze Tracking and Assistive Technology
Image and Signal Denoising Methods
Gait Recognition and Analysis
Image Processing Techniques and Applications
Advanced Computing and Algorithms
Advanced Image Fusion Techniques
Facial Nerve Paralysis Treatment and Research
Medical Image Segmentation Techniques
Speech and Audio Processing
Visual Attention and Saliency Detection
Digital Rights Management and Security

Apple (United Kingdom)
2024

Google (United States)
2018-2021

University of Maryland, College Park
2013-2017

Indian Institute of Technology Madras
2009

Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group

OPENALEX - Publications

Raviteja Vemulapalli, Felipe Arrate, Rama Chellappa

Recently introduced cost-effective depth sensors coupled with the real-time skeleton estimation algorithm of Shotton et al. [16] have generated a renewed interest in skeleton-based human action recognition. Most existing approaches use either joint locations or joint angles to represent a human skeleton. In this paper, we propose a new skeletal representation that explicitly models the 3D geometric relationships between various body parts using rotations and translations in 3D space. Since 3D rigid body motions are members...

10.1109/cvpr.2014.82 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

Frame-Recurrent Video Super-Resolution

Mehdi S. M. Sajjadi, Raviteja Vemulapalli, Matthew A. Brown

Recent advances in video super-resolution have shown that convolutional neural networks combined with motion compensation are able to merge information from multiple low-resolution (LR) frames to generate high-quality images. Current state-of-the-art methods process a batch of LR frames to generate a single high-resolution (HR) frame and run this scheme in a sliding window fashion over the entire video, effectively treating the problem as a large number of separate multi-frame super-resolution tasks. This approach has two main weaknesses: 1)...

10.1109/cvpr.2018.00693 article EN 2018-06-01

Rolling Rotations for Recognizing Human Actions from 3D Skeletal Data

Raviteja Vemulapalli, Rama Chellappa

Recently, skeleton-based human action recognition has been receiving significant attention from various research communities due to the availability of depth sensors and real-time depth-based 3D skeleton estimation algorithms. In this work, we use rolling maps for recognizing human actions from 3D skeletal data. The rolling map is a well-defined mathematical concept that has not been explored much by the vision community. First, we represent each skeleton using the relative 3D rotations between various body parts. Since 3D rotations are members of the special orthogonal...

10.1109/cvpr.2016.484 article EN 2016-06-01

Contrastive Learning for Label Efficient Semantic Segmentation

Xiangyun Zhao, Raviteja Vemulapalli, P. Mansfield, Boqing Gong, Bradley Green, and 2 more

Collecting labeled data for the task of semantic segmentation is expensive and time-consuming, as it requires dense pixel-level annotations. While recent Convolutional Neural Network (CNN) based semantic segmentation approaches have achieved impressive results by using large amounts of labeled training data, their performance drops significantly as the amount of labeled data decreases. This happens because deep CNNs trained with the de facto cross-entropy loss can easily overfit to small amounts of labeled data. To address this issue, we propose a simple and effective...

10.1109/iccv48922.2021.01045 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, and 4 more

10.1109/cvprw63382.2024.00367 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2024-06-17

Gaussian Conditional Random Field Network for Semantic Segmentation

Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu, Rama Chellappa

In contrast to the existing approaches that use discrete Conditional Random Field (CRF) models, we propose a Gaussian CRF model for the task of semantic segmentation. We propose a novel deep network, which we refer to as the Gaussian Mean Field (GMF) network, whose layers perform mean field inference over a Gaussian CRF. The proposed GMF network has the desired property that each of its layers produces an output that is closer to the maximum a posteriori solution of the Gaussian CRF compared to its input. By combining the proposed GMF network with deep Convolutional Neural Networks (CNNs), we propose a new end-to-end trainable Gaussian conditional random...

10.1109/cvpr.2016.351 article EN 2016-06-01

A Compact Embedding for Facial Expression Similarity

Raviteja Vemulapalli, Aseem Agarwala

Most of the existing work on automatic facial expression analysis focuses on discrete emotion recognition or facial action unit detection. However, facial expressions do not always fall neatly into pre-defined semantic categories. Also, the similarity between expressions measured in the action unit space need not correspond to how humans perceive expression similarity. Different from previous work, our goal is to describe facial expressions in a continuous fashion using a compact embedding space that mimics human visual preferences. To achieve this goal, we collect a large-scale...

10.1109/cvpr.2019.00583 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Kernel Learning for Extrinsic Classification of Manifold Features

Raviteja Vemulapalli, Jaishanker K. Pillai, Rama Chellappa

In computer vision applications, features often lie on Riemannian manifolds with known geometry. Popular learning algorithms such as discriminant analysis, partial least squares, support vector machines, etc., are not directly applicable to such features due to the non-Euclidean nature of the underlying spaces. Hence, classification is often performed in an extrinsic manner by mapping the manifolds to Euclidean spaces using kernels. However, for kernel based approaches, a poor choice of kernel often results in reduced performance. In this paper, we address...

10.1109/cvpr.2013.233 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

Designing Deep Convolutional Neural Networks for Continuous Object Orientation Estimation

Kota Hara, Raviteja Vemulapalli, Rama Chellappa

Deep Convolutional Neural Networks (DCNNs) have been proven to be effective for various computer vision problems. In this work, we demonstrate their effectiveness on a continuous object orientation estimation task, which requires prediction of the 0 to 360 degrees orientation of the objects. We do so by proposing and comparing three continuous orientation prediction approaches designed for DCNNs. The first two approaches work by representing an orientation as a point on a unit circle and minimizing either the L2 loss or the angular difference loss. The third method works by first converting the continuous orientation estimation task into a set...

10.48550/arxiv.1702.01499 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Deep Gaussian Conditional Random Field Network: A Model-Based Deep Network for Discriminative Denoising

Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu

We propose a novel end-to-end trainable deep network architecture for image denoising based on a Gaussian Conditional Random Field (GCRF) model. In contrast to the existing discriminative denoising methods that train a separate model for each individual noise level, the proposed deep network explicitly models the input noise variance and hence is capable of handling a range of noise levels. Our deep network, which we refer to as the deep GCRF network, consists of two sub-networks: (i) a parameter generation network that generates the pairwise potential parameters based on the noisy input image, and (ii) an inference...

10.1109/cvpr.2016.519 preprint EN 2016-06-01

Search to Distill: Pearls Are Everywhere but Not the Eyes

Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, and 2 more

Standard Knowledge Distillation (KD) approaches distill the knowledge of a cumbersome teacher model into the parameters of a student model with a pre-defined architecture. However, the knowledge of a neural network, which is represented by the network's output distribution conditioned on its input, depends not only on its parameters but also on its architecture. Hence, a more generalized approach for KD is to distill the teacher's knowledge into both the parameters and architecture of the student. To achieve this, we present a new Architecture-aware Knowledge Distillation (AKD) approach that finds student models (pearls for the teacher) that are best for distilling the given...

10.1109/cvpr42600.2020.00756 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Unsupervised Cross-Modal Synthesis of Subject-Specific Scans

Raviteja Vemulapalli, Hien Van Nguyen, S. Kevin Zhou

Recently, cross-modal synthesis of subject-specific scans has been receiving significant attention from the medical imaging community. Though various synthesis approaches have been introduced in the recent past, most of them are either tailored to a specific application or proposed for the supervised setting, i.e., they assume the availability of training data from the same set of subjects in both source and target modalities. But collecting multiple scans from each subject is undesirable. Hence, to address this issue, we propose a general unsupervised...

10.1109/iccv.2015.79 article EN 2015-12-01

R3DG features: Relative 3D geometry-based skeletal representations for human action recognition

Raviteja Vemulapalli, Felipe Arrate, Rama Chellappa

10.1016/j.cviu.2016.04.005 article EN Computer Vision and Image Understanding 2016-04-12

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel

10.1109/cvpr52733.2024.01511 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Riemannian Metric Learning for Symmetric Positive Definite Matrices

Raviteja Vemulapalli, David W. Jacobs

Over the past few years, symmetric positive definite (SPD) matrices have been receiving considerable attention from the computer vision community. Though various distance measures have been proposed in the past for comparing SPD matrices, the two most widely-used measures are the affine-invariant distance and the log-Euclidean distance. This is because these two measures are true geodesic distances induced by Riemannian geometry. In this work, we focus on the log-Euclidean Riemannian geometry and propose a data-driven approach for learning Riemannian metrics/geodesic distances for SPD matrices. We show that the geodesic distance learned using...

10.48550/arxiv.1501.02393 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Global Self-Attention Networks for Image Recognition

Zhuoran Shen, Irwan Bello, Raviteja Vemulapalli, Xuhui Jia, Ching‐Hui Chen

Recently, a series of works in computer vision have shown promising results on various image and video understanding tasks using self-attention. However, due to the quadratic computational and memory complexities of self-attention, these works either apply attention only to low-resolution feature maps in the later stages of a deep network or restrict the receptive field of attention in each layer to a small local region. To overcome these limitations, this work introduces a new global self-attention module, referred to as the GSA module, which is efficient enough...

10.48550/arxiv.2010.03019 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Karren Yang, Anurag Ranjan, Jen-Hao Rick Chang, Raviteja Vemulapalli, Oncel Tuzel

10.1109/cvpr52733.2024.02577 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, and 4 more

The landscape of publicly available vision foundation models (VFMs), such as CLIP and Segment Anything Model (SAM), is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their pre-training objectives. For instance, CLIP excels in semantic understanding, while SAM specializes in spatial understanding for segmentation. In this work, we introduce a simple recipe to efficiently merge VFMs into a unified model that absorbs their expertise. Our method integrates techniques of multi-task...

10.48550/arxiv.2310.15308 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Spatio-temporal nonparametric background modeling and subtraction

Raviteja Vemulapalli, R. Aravind

Background modeling and subtraction is a core component of many vision based systems. By far the most popular background models are the per-pixel models, in which each pixel is considered independently. Such models fail to handle dynamic backgrounds and noise. In this paper, we present a solution to this problem by proposing a novel and computationally simple spatio-temporal background model. We extend the nonparametric background model, one of the most widely used models, from the temporal domain to the spatio-temporal domain. Instead of individual pixels, we consider 3 × 3 blocks centered on each pixel and use kernel...

10.1109/iccvw.2009.5457574 article EN 2009-09-01

Boosting Image-based Mutual Gaze Detection using Pseudo 3D Gaze

Bardia Doosti, Ching‐Hui Chen, Raviteja Vemulapalli, Xuhui Jia, Yukun Zhu, and 1 more

Mutual gaze detection, i.e., predicting whether or not two people are looking at each other, plays an important role in understanding human interactions. In this work, we focus on the task of image-based mutual gaze detection and propose a simple and effective approach to boost the performance by using an auxiliary 3D gaze estimation task during the training phase. We achieve the performance boost without additional labeling cost by training the 3D gaze estimation branch using pseudo 3D gaze labels deduced from mutual gaze labels. By sharing the head image encoder between the 3D gaze estimation and the mutual gaze detection branches, we achieve better head features than...

10.1609/aaai.v35i2.16215 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising

Raviteja Vemulapalli, Oncel Tuzel, Ming-Yu Liu

We propose a novel deep network architecture for image denoising based on a Gaussian Conditional Random Field (GCRF) model. In contrast to the existing discriminative denoising methods that train a separate model for each noise level, the proposed deep network explicitly models the input noise variance and hence is capable of handling a range of noise levels. Our deep network, which we refer to as the deep GCRF network, consists of two sub-networks: (i) a parameter generation network that generates the pairwise potential parameters based on the noisy input image, and (ii) an inference network whose layers perform...

10.48550/arxiv.1511.04067 preprint EN other-oa arXiv (Cornell University) 2015-01-01