Sridha Sridharan

ORCID: 0000-0003-4316-9001
Research Areas
  • Speech and Audio Processing
  • Speech Recognition and Synthesis
  • Video Surveillance and Tracking Methods
  • Music and Audio Processing
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Face Recognition and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Face and Expression Recognition
  • Advanced Vision and Imaging
  • Robotics and Sensor-Based Localization
  • Advanced Adaptive Filtering Techniques
  • Advanced Data Compression Techniques
  • Natural Language Processing Techniques
  • Advanced Neural Network Applications
  • Image Retrieval and Classification Techniques
  • Domain Adaptation and Few-Shot Learning
  • Emotion and Mood Recognition
  • Video Analysis and Summarization
  • Biometric Identification and Security
  • Gait Recognition and Analysis
  • Multimodal Machine Learning Applications
  • Indoor and Outdoor Localization Technologies
  • Blind Source Separation Techniques
  • EEG and Brain-Computer Interfaces

Sri Eshwar College of Engineering
2025

Queensland University of Technology
2015-2024

Amrita Vishwa Vidyapeetham
2024

Vision Australia
2020-2024

PSG Institute of Technology and Applied Research
2024

SASTRA University
2024

Bharathidasan University
2010-2021

Signal Processing (United States)
2002-2020

Institute of Electrical and Electronics Engineers
2005-2020

SRI International
2018

In public venues, crowd size is a key indicator of crowd safety and stability. Crowding levels can be detected using holistic image features; however, this requires a large amount of training data to capture the wide variations in crowd distribution. If a crowd counting algorithm is to be deployed across a large number of cameras, such a burdensome training requirement is far from ideal. In this paper we propose an approach that uses local features to count the number of people in each foreground blob segment, so that the total crowd estimate is the sum of the group sizes. This results in an approach that is scalable to crowd volumes not...

10.1109/dicta.2009.22 article EN Digital Image Computing: Techniques and Applications 2009-01-01

Iris recognition refers to the automated process of recognizing individuals based on their iris patterns. The seemingly stochastic nature of the iris stroma makes it a distinctive cue for biometric recognition. The textural nuances of an individual's iris pattern can be effectively extracted and encoded by projecting them onto Gabor wavelets and transforming the ensuing phasor response into a binary code, a technique pioneered by Daugman. This descriptor has been observed to be a robust feature with very low false match rates...

10.1109/access.2017.2784352 article EN cc-by-nc-nd IEEE Access 2017-12-18

In a clinical setting, pain is reported either through patient self-report or via an observer. Such measures are problematic as they are: 1) subjective, and 2) give no specific timing information. Coding pain as a series of facial action units (AUs) can avoid these issues, as it can be used to gain an objective measure of pain on a frame-by-frame basis. Using video data from patients with shoulder injuries, in this paper, we describe an active appearance model (AAM)-based system that can automatically detect the frames in which...

10.1109/tsmcb.2010.2082525 article EN IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) 2010-11-30

Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real-world applications often have access to only limited-duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provide a comparison of Joint Factor Analysis (JFA) and i-vector based systems including various compensation techniques;...

10.21437/interspeech.2011-58 article EN Interspeech 2011 2011-08-27

In this paper we address the problem of human action recognition from video sequences. Inspired by the exemplary results obtained via automatic feature learning and deep learning approaches in computer vision, we focus our attention towards learning salient spatial features via a convolutional neural network (CNN) and then map their temporal relationship with the aid of Long-Short-Term-Memory (LSTM) networks. Our contribution is a deep fusion framework that more effectively exploits spatial features from CNNs with temporal features from LSTM models. We also extensively evaluate...

10.1109/wacv.2017.27 article EN 2017-03-01

Epilepsy is one of the most prevalent neurological diseases among humans and can lead to severe brain injuries, strokes, and tumors. Early detection of seizures can help to mitigate injuries, and can be used to aid the treatment of patients with epilepsy. The purpose of a seizure prediction system is to successfully identify the pre-ictal stage, which occurs before a seizure event. Patient-independent models are designed to offer accurate performance across multiple subjects within a dataset, and have been identified as a real-world solution to the problem. However, little...

10.1109/jsen.2021.3057076 article EN IEEE Sensors Journal 2021-02-06

The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task....

10.1109/tpami.2005.227 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2005-09-26

The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, a collection of over 10 hours of background noise was conducted across 10 unique locations covering 5 common noise scenarios, to create the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. The five...

10.21437/interspeech.2010-774 article EN Interspeech 2010 2010-09-26

Gait energy images (GEIs) and their variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy...

10.1109/ijcb.2011.6117504 article EN 2011-10-01

Accurate and efficient thermal-infrared (IR) camera calibration is important for advancing computer vision research within the thermal modality. This paper presents an approach for geometrically calibrating individual and multiple cameras in both the thermal and visible modalities. The proposed technique can be used to correct for lens distortion and to simultaneously reference both visible and thermal-IR cameras to a single coordinate frame. The most popular existing approach for the geometric calibration of thermal cameras uses a printed chessboard heated by a flood lamp and is comparatively inaccurate...

10.1109/tim.2012.2182851 article EN IEEE Transactions on Instrumentation and Measurement 2012-02-03

Although the collection of player and ball tracking data is fast becoming the norm in professional sports, large-scale mining of such spatiotemporal data has yet to surface. In this paper, given an entire season's worth of player and ball tracking data from a professional soccer league (≈400,000,000 data points), we present a method which can conduct both individual player and team analysis. Due to the dynamic, continuous and multi-player nature of team sports like soccer, a major issue is aligning player positions over time. We present a "role-based" representation that dynamically updates each player's...

10.1109/icdm.2014.133 article EN 2014-12-01

Person re-identification involves recognising individuals in different locations across a network of cameras and is a challenging task due to a large number of varying factors such as pose (both subject and camera) and ambient lighting conditions. Existing databases do not adequately capture these variations, making evaluations of proposed techniques difficult. In this paper, we present a new multi-camera surveillance database designed for the task of person re-identification. This database consists of 150 unscripted sequences...

10.1109/dicta.2012.6411689 article EN 2012-12-01

The concept of continuous-time trajectory representation has brought increased accuracy and efficiency to multi-modal sensor fusion in modern SLAM. However, regardless of these advantages, its offline property caused by the requirement of global batch optimization is critically hindering its relevance for real-time and life-long applications. In this paper, we present a dense map-centric SLAM method based on continuous-time trajectory to cope with this problem. The proposed system locally functions in a similar fashion to conventional Continuous-Time...

10.1109/icra.2018.8462915 article EN 2018-05-01

Objective: This paper proposes a novel framework for the segmentation of phonocardiogram (PCG) signals into heart states, exploiting the temporal evolution of the PCG as well as considering the salient information that it provides for the detection of the heart state. Methods: We propose the use of recurrent neural networks and exploit recent advancements in attention based learning to segment the PCG signal. This allows the network to identify the most salient aspects of the signal and disregard uninformative information. Results: The proposed method attains state-of-the-art...

10.1109/jbhi.2019.2949516 article EN IEEE Journal of Biomedical and Health Informatics 2019-10-25

Traditionally, abnormal heart sound classification is framed as a three-stage process. The first stage involves segmenting the phonocardiogram to detect fundamental heart sounds, after which features are extracted and classification is performed. Some researchers in the field argue the segmentation step is an unwanted computational burden, whereas others embrace it as a prior step to feature extraction. When comparing accuracies achieved by studies that have segmented heart sounds before analysis with those who have overlooked that step, the question of...

10.1109/jbhi.2020.3027910 article EN IEEE Journal of Biomedical and Health Informatics 2020-09-30