- Anomaly Detection Techniques and Applications
- Human Pose and Action Recognition
- Image Processing Techniques and Applications
- Human Motion and Animation
- Adversarial Robustness in Machine Learning
- Domain Adaptation and Few-Shot Learning
- Generative Adversarial Networks and Image Synthesis
- Advanced Vision and Imaging
- Advanced Image Processing Techniques
- Spectroscopy Techniques in Biomedical and Chemical Research
- Robotics and Sensor-Based Localization
- Handwritten Text Recognition Techniques
- Cancer-related molecular mechanisms research
- Multimodal Machine Learning Applications
- Hand Gesture Recognition Systems
- Advanced Image and Video Retrieval Techniques
- Advanced Malware Detection Techniques
- Optical Imaging and Spectroscopy Techniques
- Spectroscopy and Chemometric Analyses
Indian Institute of Science Bangalore
2018-2019
Indian Institute of Technology Guwahati
2018
Supervised deep learning methods have shown promising results for the task of monocular depth estimation, but acquiring ground truth is costly and prone to noise as well as inaccuracies. While synthetic datasets have been used to circumvent the above problems, the resultant models do not generalize well to natural scenes due to the inherent domain shift. Recent adversarial approaches for domain adaptation have performed well in mitigating the differences between the source and target domains. But these are mostly limited to a classification setup and do not scale...
An unsupervised human action modeling framework can provide a useful pose-sequence representation, which can be utilized in a variety of pose analysis applications. In this work we propose a novel temporal pose-sequence modeling framework, which can embed the dynamics of 3D human-skeleton joints to a latent space in an efficient manner. In contrast to the end-to-end frameworks explored by previous works, we disentangle the task of individual pose representation learning from the task of learning actions as pose-sequence embeddings. In order to realize a continuous pose embedding manifold along with better...
Hyperspectral imaging (HSI) of tissue samples in the mid-infrared (mid-IR) range provides spectro-chemical and tissue structure information at sub-cellular spatial resolution. Disease states can be directly assessed by analyzing the mid-IR spectra of different cell types (e.g., epithelial cells) and sub-cellular components (e.g., nuclei), provided that we can accurately classify the pixels belonging to these components. The challenge is to extract information from hundreds of noisy mid-IR bands for each pixel, where each band is not very informative in itself, making...
While OCR has been used in various applications, its output is not always accurate, leading to misfit words. This research work focuses on improving optical character recognition (OCR) with ML techniques, through the integration of long short-term memory (LSTM) based sequence-to-sequence deep learning models to perform document translation. The ANKI dataset is used for English to Spanish translation. In this work, I have shown a comparative study of pre-trained OCR while using a deep learning model, an LSTM-based seq2seq architecture with attention, for machine translation. End-to-end...
Deep learning models are susceptible to input-specific noise, called adversarial perturbations. Moreover, there exist input-agnostic perturbations, called Universal Adversarial Perturbations (UAPs), that can affect the inference of the models over most input samples. Given a model, there are broadly two approaches to craft UAPs: (i) data-driven approaches, which require data, and (ii) data-free approaches, which do not require data samples. Data-driven approaches require actual samples from the underlying data distribution and craft UAPs with a high success (fooling) rate. However, data-free approaches craft UAPs without utilizing any data samples and therefore result in...
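The "fooling rate" mentioned in the abstract above is commonly measured as the fraction of samples whose predicted label changes when one fixed perturbation is added to every input. The sketch below illustrates only that metric; the toy linear classifier, the sample values, and the names `predict`, `fooling_rate`, and `delta` are hypothetical and are not taken from the work described.

```python
def predict(weights, x):
    # Toy linear classifier (an assumption for illustration):
    # return the index of the class with the highest score.
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return scores.index(max(scores))


def fooling_rate(weights, samples, perturbation):
    # Fraction of samples whose predicted label changes when the SAME
    # perturbation is added to every input (input-agnostic by construction).
    fooled = 0
    for x in samples:
        x_adv = [x_i + d_i for x_i, d_i in zip(x, perturbation)]
        if predict(weights, x_adv) != predict(weights, x):
            fooled += 1
    return fooled / len(samples)


# Hypothetical two-class model and data, chosen only to exercise the metric.
weights = [[1.0, 0.0], [0.0, 1.0]]
samples = [[2.0, 1.0], [1.5, 1.0], [3.0, 0.0], [0.0, 2.0]]
delta = [-1.0, 1.0]  # one perturbation shared by all samples

rate = fooling_rate(weights, samples, delta)
print(f"fooling rate: {rate:.2f}")
```

Here the perturbation is fixed by hand. Crafting it, with or without data, is the part the abstract is about and is not shown.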
This research paper focuses on the problem of dynamic objects and their impact on effective motion planning and localization. The paper proposes a two-step process to address this challenge, which involves finding the dynamic objects in the scene using a flow-based method and then using a deep video inpainting algorithm to remove them. The study aims to test the validity of this approach by comparing it with baseline results using two state-of-the-art SLAM algorithms, ORB-SLAM2 and LSD, and understanding the corresponding trade-offs. The proposed approach does not require any significant...
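The first step described above, locating dynamic objects from optical flow, can be illustrated with a deliberately naive stand-in: flag pixels whose flow magnitude is much larger than the typical (median) magnitude in the frame. This is an assumption made for illustration, not the flow-based method used in the paper. The function name `dynamic_mask`, the `ratio` and `floor` parameters, and the tiny flow field are all hypothetical, and the inpainting step is not shown.

```python
import math
from statistics import median


def dynamic_mask(flow, ratio=2.0, floor=0.5):
    # flow: 2D grid of (u, v) displacement vectors in pixels, one per pixel.
    # A pixel is flagged as dynamic when its flow magnitude exceeds
    # max(ratio * median magnitude, floor). The median serves as a crude
    # proxy for camera-induced background motion.
    mags = [[math.hypot(u, v) for (u, v) in row] for row in flow]
    med = median(m for row in mags for m in row)
    threshold = max(ratio * med, floor)
    return [[m > threshold for m in row] for row in mags]


# Hypothetical 2x3 flow field: uniform background motion plus one outlier.
flow = [
    [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)],
    [(1.0, 0.0), (5.0, 3.0), (1.0, 0.0)],
]
mask = dynamic_mask(flow)
print(mask)
```

A mask of this kind would mark the regions that a video inpainting model is then asked to fill in. A real system would need to account for depth-dependent parallax, which a single global threshold ignores.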
The recent success of the CLIP model has shown its potential to be applied to a wide range of vision and language tasks. However, this only establishes an embedding space relationship of language to images, not to the video domain. In this paper, we propose a novel approach to map the video domain to natural language. We propose a two-stage approach that first extracts visual features from each frame of a video using a pre-trained CNN, and then uses the CLIP model to encode the visual features for the video domain, along with the corresponding text descriptions. We evaluate our method on two benchmark datasets, UCF101 and HMDB51, and achieve...
An unsupervised human action modeling framework can provide a useful pose-sequence representation, which can be utilized in a variety of pose analysis applications. In this work we propose a novel temporal pose-sequence modeling framework, which can embed the dynamics of 3D human-skeleton joints to a continuous latent space in an efficient manner. In contrast to the end-to-end frameworks explored by previous works, we disentangle the task of individual pose representation learning from the task of learning actions as a trajectory in the pose embedding space. In order to realize a continuous pose embedding manifold with improved...