- Advanced Vision and Imaging
- Image Enhancement Techniques
- Advanced Image and Video Retrieval Techniques
- Advanced Image Processing Techniques
- Visual Attention and Saliency Detection
- Video Surveillance and Tracking Methods
- Image Processing Techniques and Applications
- Human Pose and Action Recognition
- Advanced Neural Network Applications
- Computer Graphics and Visualization Techniques
- 3D Shape Modeling and Analysis
- Natural Language Processing Techniques
- Robotics and Sensor-Based Localization
- Video Analysis and Summarization
- Generative Adversarial Networks and Image Synthesis
- Advanced Image Fusion Techniques
- Image and Signal Denoising Methods
- Image Retrieval and Classification Techniques
- Hand Gesture Recognition Systems
- Medical Image Segmentation Techniques
- Multimodal Machine Learning Applications
- Topic Modeling
- Image and Video Quality Assessment
- Image Processing and 3D Reconstruction
- Anomaly Detection Techniques and Applications
- Indian Institute of Technology Gandhinagar (2016-2025)
- GITAM University (2015)
- Dr. Hari Singh Gour University (2012)
- Indian Institute of Technology Bombay (1976-2011)
- Indian Institute of Technology Madras (1995-2009)
- Lawrence Berkeley National Laboratory (2005-2007)
- Indian Institute of Technology Kanpur (2005)
- Motorola (United States) (1995-2005)
- Indian Institute of Technology Guwahati (2004)
- University of Illinois Urbana-Champaign (1991-2002)
Compositing a scene from multiple images is of considerable interest to graphics professionals. Typical compositing techniques involve estimation or explicit preparation of a matte by an artist. In this article, we address the problem of automatic compositing of a scene from images obtained through variable exposure photography. We consider High Dynamic Range Imaging (HDRI) and review some existing approaches for directly generating a Low Dynamic Range (LDR) image from multi-exposure images. We propose a computationally efficient method using edge-preserving...
Removing blur caused by camera shake in images has always been a challenging problem in the computer vision literature due to its ill-posed nature. Motion blur caused by the relative motion between the camera and the object in 3D space induces a spatially varying blurring effect over the entire image. In this paper, we propose a novel deep filter based on a Generative Adversarial Network (GAN) architecture integrated with a global skip connection and a dense architecture in order to tackle this problem. Our model, while bypassing the process of blur kernel estimation,...
Human pose estimation is a well-known problem in computer vision to locate joint positions. Existing datasets for learning of poses are observed to be not challenging enough in terms of pose diversity, object occlusion and view points. This makes the pose annotation process relatively simple and restricts the application of the models that have been trained on them. To handle more variety in human poses, we propose the concept of fine-grained hierarchical pose classification, in which we formulate the pose estimation as a classification task, and propose a dataset, Yoga-82,...
Traditional 3D Convolutional Neural Networks (CNNs) are computationally expensive, memory intensive, prone to overfit, and most importantly, there is a need to improve their feature learning capabilities. To address these issues, we propose the Rectified Local Phase Volume (ReLPV) block, an efficient alternative to the standard 3D convolutional layer. The ReLPV block extracts the phase in a 3D local neighborhood (e.g., 3 × 3 × 3) of each position of the input map to obtain the feature maps. The phase is extracted by computing the 3D Short Term Fourier...
We have developed a convolutional neural network for the purpose of recognizing facial expressions in human beings. We fine-tuned an existing model trained on the visual recognition dataset used in ILSVRC2012 to two widely used facial expression datasets, CFEE and RaFD, which when tested independently yielded test accuracies of 74.79% and 95.71%, respectively. Generalization of the results was evident by training on one dataset and testing on the other. Further, the image product of the cropped faces and their saliency maps were computed using Deep Multi-Layer...
Conventional 3D convolutional neural networks (CNNs) are computationally expensive, memory intensive, prone to overfitting, and most importantly, there is a need to improve their feature learning capabilities. To address these issues, we propose spatio-temporal short term Fourier transform (STFT) blocks, a new class of convolutional blocks that can serve as an alternative to the 3D convolutional layer and its variants in 3D CNNs. An STFT block consists of non-trainable convolution layers that capture spatially and/or temporally local...
Reconstructing images using brain signals of imagined visuals may provide an augmented vision to the disabled, leading to the advancement of Brain-Computer Interface (BCI) technology. The recent progress in deep learning has boosted the study area of synthesizing images from brain signals using Generative Adversarial Networks (GAN). In this work, we have proposed a framework for synthesizing images from the brain activity recorded by an electroencephalogram (EEG) using small-size EEG datasets. This activity is recorded from the subject's head scalp when they are asked to visualize certain classes of objects and...
High dynamic range (HDR) image generation from a single exposure low dynamic range (LDR) image has been made possible due to the recent advances in Deep Learning. Various feed-forward Convolutional Neural Networks (CNNs) have been proposed for learning LDR to HDR representations. To better utilize the power of CNNs, we exploit the idea of feedback, where the initial low level features are guided by the high level features using a hidden state of a Recurrent Neural Network. Unlike a single forward pass in a conventional feed-forward network, the reconstruction from LDR to HDR in a feedback network is learned over multiple...
Recognizing facial expressions is one of the central problems in computer vision. Temporal image sequences have useful spatio-temporal features for recognizing expressions. In this paper, we propose a new 3D Convolution Neural Network (CNN) that can be trained end-to-end for facial expression recognition on temporal image sequences without using facial landmarks. More specifically, a novel 3D convolutional layer that we call the Local Binary Volume (LBV) layer is proposed. The LBV layer, when used with our newly proposed LBVCNN network, achieve...
Decoding the human brain has been a hallmark of neuroscientists and Artificial Intelligence researchers alike. Reconstruction of visual images from Electroencephalography (EEG) signals has garnered a lot of interest due to its applications in brain-computer interfacing. This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations and subsequently utilize the learned representation for image generation and classification. We demonstrate the generalizability...
Competitive diving is a well recognized aquatic sport in which a person dives from a platform or a springboard into the water. Based on the acrobatics performed during the dive, diving is classified into a finite set of action classes which are standardized by FINA. In this work, we propose an attention guided LSTM-based neural network architecture for the task of diving classification. The network takes the frames of a diving video as input and determines its class. We evaluate the performance of the proposed model on a recently introduced competitive diving dataset, Diving48. It...
We present a novel coarser-to-finer approach for deep graphical image inpainting that utilizes GraphFill, a graph neural network-based deep learning framework, and a lightweight generative baseline network. We construct a pyramidal graph for the input-masked image by reducing it into superpixels, each representing a node in the graph. The proposed pyramidal graph facilitates the transfer of global context from coarser to finer pyramid levels, enabling GraphFill to estimate plausible information for the unknown node values in the graph. The estimated information is used to fill the masked region,...
Image matting is an important problem in computational photography. Although it has been studied for more than two decades, there is yet a challenge of developing an automatic matting algorithm which does not require any human intervention. Most of the state-of-the-art matting algorithms require human intervention in the form of a trimap or scribbles to generate the alpha matte for the input image. In this paper, we present a simple and efficient approach to automatically generate the trimap from the input image and make the whole matting process free from human-in-the-loop. We use a learning based matting method...
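As background for the matting abstract above, an alpha matte is typically consumed through the standard compositing equation I = αF + (1 − α)B, applied per pixel. The sketch below illustrates only that equation on small grayscale images stored as nested lists. The function name `composite` and the sample values are hypothetical, and nothing here reproduces the trimap generation or the learning based matting method of the paper.

```python
def composite(alpha, foreground, background):
    """Blend two grayscale images with an alpha matte.

    All three arguments are equally sized nested lists (rows of pixels).
    alpha values are assumed to lie in [0, 1]: 1 keeps the foreground,
    0 keeps the background, and values in between mix the two.
    """
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, foreground, background)
    ]


# Illustrative one-row image: opaque, transparent, and half-covered pixels.
alpha = [[1.0, 0.0, 0.5]]
foreground = [[200, 200, 200]]
background = [[100, 100, 100]]
blended = composite(alpha, foreground, background)
```

Matting is the inverse problem: given only the blended image, recover α (and often F), which is why extra guidance such as a trimap is normally needed.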
We propose an algorithm to detect approximate reflection symmetry present in a set of volumetrically distributed points belonging to ℝ^d containing a distorted reflection symmetry pattern. We pose the problem of detecting approximate reflection symmetry as the problem of establishing correspondences between the points which are reflections of each other and we determine the reflection symmetry transformation. We formulate an optimization framework in which the problem of establishing the correspondences amounts to solving a linear assignment problem and the problem of determining the reflection symmetry transformation amounts to solving an optimization problem on...
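The correspondence step described in the abstract above can be illustrated with a minimal sketch, under strong simplifying assumptions. Here the mirror hyperplane (unit `normal` and `offset`) is taken as known, whereas the paper estimates the transformation. The linear assignment is solved by brute force over permutations, which is only feasible for a handful of points; a dedicated solver such as the Hungarian algorithm would replace it in practice. The function names and sample points are hypothetical.

```python
from itertools import permutations


def reflect(point, normal, offset):
    """Reflect a point across the hyperplane {x : normal . x = offset}.

    `normal` is assumed to have unit length.
    """
    signed_dist = sum(n * p for n, p in zip(normal, point)) - offset
    return tuple(p - 2.0 * signed_dist * n for p, n in zip(point, normal))


def match_reflections(points, normal, offset):
    """Assign each point to the point closest to its mirror image.

    Minimises the total squared distance between reflect(points[i]) and
    points[perm[i]] over all permutations (brute-force linear assignment).
    Returns the best permutation and its cost.
    """
    mirrored = [reflect(p, normal, offset) for p in points]
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(points))):
        cost = sum(
            sum((m - q) ** 2 for m, q in zip(mirrored[i], points[j]))
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Illustrative 2D points, roughly symmetric about the line x = 0.
points = [(1.0, 0.0), (-1.0, 0.05), (2.0, 1.0), (-2.1, 1.0), (0.0, 3.0)]
pairs, cost = match_reflections(points, (1.0, 0.0), 0.0)
```

In this toy data the points at indices 0 and 1 mirror each other, as do 2 and 3, while the point at index 4 lies on the mirror line and maps to itself. The small residual cost reflects the distortion of the pattern.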
While 2D pose estimation has advanced our ability to interpret body movements in animals and primates, it is limited by the lack of depth information, constraining its application range. 3D pose estimation provides a more comprehensive solution by incorporating spatial depth, yet creating extensive 3D pose datasets for animals is challenging due to their dynamic and unpredictable behaviours in natural settings. To address this, we propose a hybrid approach that utilizes rigged avatars and a pipeline to generate synthetic datasets to acquire the necessary...
Pollen grains of plant species have unique morphological characteristics. The variability in shape, size, and microscopic pollen surface features can be efficiently used to determine the plant species to which they belong. This approach can be instrumental in regions with rich biodiversity of plant species, specifically medicinal plants. The creation of a dataset for these species using SEM images and a computer vision application can be beneficial for their identification. We developed a robust dataset utilizing scanning electron microscopy (SEM) to generate...
Recent advancements in learning-based methods have opened new avenues for exploring and interpreting art forms, such as shadow art, origami, and sketch art, through computational models. One notable visual art form is 3D Anamorphic Art, in which an ensemble of arbitrarily shaped 3D objects creates a realistic and meaningful expression when observed from a particular viewpoint and loses its coherence over the other viewpoints. In this work, we build on insights from 3D Anamorphic Art to perform 3D object arrangement. We introduce RASP,...
Cell‐based fluorescence imaging assays are heterogeneous and require the collection of a large number of images for detailed quantitative analysis. Complexities arise as a result of variation in spatial nonuniformity, shape, overlapping compartments and scale (size). A new technique and methodology has been developed and tested for delineating subcellular morphology and partitioning overlapping compartments at multiple scales. This system is packaged as an integrated software platform for quantifying images that are obtained through fluorescence microscopy. Proposed...