- Speech and Audio Processing
- COVID-19 diagnosis using AI
- Speech Recognition and Synthesis
- Phonocardiography and Auscultation Techniques
- Music and Audio Processing
- Phonetics and Phonology Research
- Radio Wave Propagation Studies
- Advanced Adaptive Filtering Techniques
- Infant Health and Development
- Precipitation Measurement and Analysis
- Image and Signal Denoising Methods
- Blind Source Separation Techniques
- Linguistics and Cultural Studies
- Natural Language Processing Techniques
- Stuttering Research and Treatment
- Respiratory and Cough-Related Research
- Employee Welfare and Language Studies
- Sparse and Compressive Sensing Techniques
- Hearing Loss and Rehabilitation
- Anomaly Detection Techniques and Applications
- Advanced Text Analysis Techniques
- Topic Modeling
- Text and Document Classification Technologies
- Forecasting Techniques and Applications
- Terrorism, Counterterrorism, and Political Violence
Indian Institute of Technology Guwahati (2022-2025)
Vivekananda Global University (2023)
Indian Institute of Science Bangalore (2012-2023)
International Audio Laboratories Erlangen (2023)
Amity University (2021)
Punjab Technical University (2021)
SRM University, Andhra Pradesh (2021)
Carnegie Mellon University (2019-2020)
Washington University in St. Louis (2020)
Indian Institute of Technology Kanpur (1989)
The COVID-19 pandemic presents global challenges transcending the boundaries of country, race, religion, and economy. The current gold standard method for detection is reverse transcription polymerase chain reaction (RT-PCR) testing. However, this test is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for an alternate diagnosis tool which overcomes these limitations and is deployable at large scale. The prominent symptoms include cough and breathing...
The DiCOVA Challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This is an open call for researchers to analyze a dataset of sound recordings, collected from COVID-19 infected and non-COVID-19 individuals, for a two-class classification. These recordings were collected via crowdsourcing from multiple countries, through a website application. The challenge features two tracks, one focusing on cough sounds,...
Background: The COVID-19 pandemic has highlighted the need to invent alternative respiratory health diagnosis methodologies which provide improvement with respect to time, cost, physical distancing, and detection performance. In this context, identifying acoustic bio-markers of respiratory diseases has received renewed interest. Objective: In this paper, we aim to design diagnostics based on analyzing the acoustics and symptoms data. Towards this, the data is composed of cough, breathing, and speech signals, and a symptoms record, collected using a...
Machine learning is considered the study of computer algorithms that enable a machine to learn and adapt to new data without any human intervention. Reinforcement learning is a paradigm by which a self-governing agent utilizes its experience of interacting with an environment to improve its behavior. The paper summarizes different reinforcement learning algorithms, the merits and demerits of the existing reviewed methods along with their applications and challenges, and gives future research directions.
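As a concrete illustration of the reinforcement-learning paradigm summarized above, the following minimal sketch trains a tabular Q-learning agent on a toy five-state chain. The environment, reward shaping, and hyper-parameters are illustrative assumptions, not taken from the reviewed paper.

```python
import numpy as np

n_states, n_actions = 5, 2           # chain states 0..4; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))  # tabular action-value estimates
alpha, gamma, eps = 0.1, 0.95, 0.1   # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

def step(state, action):
    """Move along the chain; reaching state 4 ends the episode with reward 1."""
    nxt = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    done = nxt == n_states - 1
    reward = 1.0 if done else -0.01   # small step cost encourages reaching the goal
    return nxt, reward, done

for _ in range(500):                  # episodes of interaction with the environment
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit current estimates, occasionally explore
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r, done = step(s, a)
        # Q-learning update: nudge Q(s, a) toward the bootstrapped target
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) * (not done) - Q[s, a])
        s = s2

print("greedy policy (0 = left, 1 = right):", np.argmax(Q[:-1], axis=1))
```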
There is a growing need for diverse, high-quality stuttered speech data, particularly in the context of Indian languages. This paper introduces Project Boli, a multi-lingual stuttered speech dataset designed to advance scientific understanding and technology development for individuals who stutter in India. The dataset constitutes (a) anonymized metadata (gender, age, country, mother tongue) and responses to a questionnaire about how stuttering affects their daily lives, (b) speech that captures both read (using the Rainbow Passage) and spontaneous...
The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating research in acoustics-based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough, and speech signals. The data was collected from individuals with and without COVID-19 infection, and the task in the challenge was a two-class classification. The development...
The detection of overlapping speech segments is of key importance in applications involving the analysis of multi-party conversations. The problem is challenging because overlapping segments are typically captured as short utterances in far-field microphone recordings. In this paper, we propose overlap detection using a neural network architecture consisting of long short-term memory (LSTM) models. The model learns the presence of overlap by identifying the spectrotemporal structure of overlapping speech segments. In order to evaluate the model performance, we perform experiments on simulated...
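A minimal sketch of an LSTM-based frame-level overlap detector of the kind described above, written in PyTorch. The log-mel input features, layer sizes, and the bidirectional two-layer configuration are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class OverlapDetector(nn.Module):
    def __init__(self, n_mels: int = 64, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)   # per-frame overlap logit

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, frames, n_mels) log-mel feature sequences
        seq, _ = self.lstm(feats)
        return self.head(seq).squeeze(-1)      # (batch, frames) logits

model = OverlapDetector()
criterion = nn.BCEWithLogitsLoss()
x = torch.randn(8, 200, 64)                    # dummy batch of feature sequences
y = torch.randint(0, 2, (8, 200)).float()      # dummy frame-level overlap labels
loss = criterion(model(x), y)
loss.backward()
print(float(loss))
```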
The automatic analysis of conversational audio remains difficult, in part, due to the presence of multiple talkers speaking in turns, often with significant intonation variations and overlapping speech. The majority of prior work on psychoacoustics and speech system design has focused on single-talker or multi-talker settings (for example, the cocktail party effect). There has been much less focus on how listeners detect a change of talker, or on probing the acoustic features characterizing a talker's voice. This study examines human detection...
The classical approach to A/D conversion has been uniform sampling, where we get perfect reconstruction for bandlimited signals by satisfying the Nyquist Sampling Theorem. We propose a non-uniform sampling scheme based on level crossing (LC) time information. We show stable reconstruction of bandpass signals, with the correct scale factor and hence a unique reconstruction, from the LC time information only. For reconstruction from level crossings, we make use of sparse optimization, constraining the signal to be sparse in its frequency content. While an overdetermined system of equations is resorted to in the literature, an undetermined system along...
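The sketch below illustrates the acquisition-and-reconstruction idea under simplifying assumptions: several reference levels are used so that every crossing yields a (time, level) pair, the signal is a real trigonometric polynomial, and the Fourier coefficients are recovered by a least-squares fit on the non-uniform crossing times. It is an illustration of level-crossing sampling, not the paper's exact algorithm (which works from LC timing information only, with sparse optimization).

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4                                   # highest harmonic of the test signal
a, b = rng.normal(size=K), rng.normal(size=K)

def x(t):
    return sum(a[k] * np.cos(2*np.pi*(k+1)*t) + b[k] * np.sin(2*np.pi*(k+1)*t)
               for k in range(K))

# --- level-crossing acquisition on a dense time grid -----------------------
t = np.linspace(0, 1, 20000)
xt = x(t)
levels = np.linspace(-2, 2, 9)
times, values = [], []
for L in levels:
    s = np.sign(xt - L)
    idx = np.where(np.diff(s) != 0)[0]           # grid cells where the level is crossed
    # linear interpolation of the crossing instant within each cell
    tc = t[idx] + (L - xt[idx]) * (t[idx+1] - t[idx]) / (xt[idx+1] - xt[idx])
    times.append(tc)
    values.append(np.full_like(tc, L))
times, values = np.concatenate(times), np.concatenate(values)

# --- least-squares recovery of the Fourier coefficients --------------------
A = np.hstack([np.cos(2*np.pi*(k+1)*times[:, None]) for k in range(K)] +
              [np.sin(2*np.pi*(k+1)*times[:, None]) for k in range(K)])
coef, *_ = np.linalg.lstsq(A, values, rcond=None)
print(np.max(np.abs(coef - np.concatenate([a, b]))))   # should be close to zero
```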
Credit scoring plays a vital role for financial institutions in estimating the risk associated with a credit applicant and the applied product. It is estimated based on applicants' credentials and directly affects the viability of issuing institutions. However, there may be a large number of irrelevant features in the dataset. Due to such features, models lead to poorer classification performance and higher complexity. So, by removing redundant and irrelevant features, we can overcome this problem. In this work, we emphasized feature selection to enhance...
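A minimal sketch of the idea: remove irrelevant and redundant features before training a credit-scoring classifier and compare the result against a model that uses all features. The synthetic data, the mutual-information selector, and the logistic-regression model are illustrative assumptions, not the paper's exact pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# 1000 "applicants", 50 raw features of which only 8 are informative
X, y = make_classification(n_samples=1000, n_features=50, n_informative=8,
                           n_redundant=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

baseline = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
selected = make_pipeline(SelectKBest(mutual_info_classif, k=10),
                         LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

for name, model in [("all 50 features", baseline), ("10 selected features", selected)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```

The printed AUCs let one compare the reduced model against the full-feature baseline directly.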
The relative accuracy of the left and right arms in active positioning was studied in a group of 24 male right-handed undergraduates. The task required positioning at each of four angular positions (30°, 45°, 60°, 75°). One arm was more accurate than the other. There was a progressive increase in error for both arms as the arm was flexed, reducing the angle at the joint. The results are discussed in the light of suggestions concerning hemispheric superiority in the processing of kinesthetic and proprioceptive information.
Human beings know each other and communicate among themselves through thoughts and ideas. The best way to present our ideas is through speech. Some people do not have the power of speech; they can only communicate with others through sign language. Nowadays, technology has reduced this gap with systems that can be used to translate sign language. Sign language recognition (SLR) and gesture-based control are two major applications of hand gesture technologies. On one side, the controller converts the gesture into text, and the text gets converted to speech with the help of analog-to-digital conversion...
Good-quality time-scale modification (TSM) of speech and audio is a long-standing challenge. The crux of the challenge is to maintain the perceptual subtleties, i.e., the temporal variations in pitch and timbre, even after time-scaling the signal. Widely used approaches, such as the phase vocoder and waveform overlap-add (OLA), are based on a quasi-stationarity assumption, and the time-scaled signals have perceivable artifacts. In contrast to these, we propose the application of time-varying sinusoidal modeling for TSM, without any such assumption....
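A minimal, single-component illustration of time-scale modification with a time-varying sinusoidal model. The paper addresses full speech and audio; this sketch only stretches one amplitude-modulated chirp and all signal parameters are assumptions. The amplitude and instantaneous-frequency tracks are estimated from the analytic signal, resampled to the new duration, and the phase is re-integrated, so the pitch and envelope trajectories are preserved while the duration changes.

```python
import numpy as np
from scipy.signal import hilbert

fs, dur, alpha = 16000, 1.0, 1.5              # sample rate, duration (s), stretch factor
t = np.arange(int(fs * dur)) / fs
x = (0.5 + 0.4*np.sin(2*np.pi*3*t)) * np.sin(2*np.pi*(200*t + 100*t**2))  # AM chirp

analytic = hilbert(x)
amp = np.abs(analytic)                         # time-varying amplitude envelope
inst_freq = np.gradient(np.unwrap(np.angle(analytic))) * fs / (2*np.pi)   # IF in Hz

# resample the amplitude and IF tracks onto the stretched time axis
n_out = int(len(x) * alpha)
src = np.linspace(0, len(x) - 1, n_out)
amp_s = np.interp(src, np.arange(len(x)), amp)
if_s = np.interp(src, np.arange(len(x)), inst_freq)

# re-integrate phase at the original sample rate: the frequency values are
# preserved, only their time evolution is slowed down by the factor alpha
phase = 2*np.pi * np.cumsum(if_s) / fs
y = amp_s * np.sin(phase)
print(len(x) / fs, "s  ->", len(y) / fs, "s")
```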
A listening test is proposed in which human participants detect talker changes in two natural, multi-talker speech stimulus sets: a familiar language (English) and an unfamiliar language (Chinese). The miss rate, false-alarm rate, and response times (RT) showed a significant dependence on language familiarity. Linear regression modeling of the RTs was carried out using diverse acoustic features derived from the stimulus pool for the change detection task. Further, benchmarking the same task against a state-of-the-art machine diarization system shows that...
Estimating the motion parameters of moving sound sources using only the received source signal is of interest in low-power and contact-less monitoring applications, such as industrial robotics and bio-acoustics. The received signal embeds the motion attributes via the Doppler effect. In this paper, we analyze the Doppler effect on a mixture of time-varying sinusoids. Focusing on the instantaneous frequency (IF) of the signal, we show that the IF profile together with its first two derivatives can be used to obtain the motion parameters. This requires a smooth estimate of the IF profile....
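The sketch below illustrates the core idea: the Doppler effect imprints the source motion on the instantaneous frequency of the received signal, so motion parameters can be read off a smooth IF estimate. The scenario (a single tone, a straight-line constant-velocity pass, first-order Doppler) and all numerical values are illustrative assumptions, not the paper's general mixture model.

```python
import numpy as np
from scipy.signal import hilbert

c, fs = 343.0, 48000                      # speed of sound (m/s), sample rate
f0, v, d = 1000.0, 20.0, 5.0              # source tone (Hz), speed (m/s), closest distance (m)
t = np.arange(-2.0, 2.0, 1/fs)
r = np.sqrt(d**2 + (v * t)**2)            # source-to-microphone range over time
x = np.sin(2*np.pi*f0*(t - r/c)) / r      # first-order Doppler plus 1/r decay

# instantaneous frequency from the analytic signal
inst_freq = np.gradient(np.unwrap(np.angle(hilbert(x)))) * fs / (2*np.pi)

# far from the closest approach, the IF settles near f0*(1 +/- v/c)
f_app = np.median(inst_freq[(t > -1.9) & (t < -1.5)])   # approaching segment
f_rec = np.median(inst_freq[(t > 1.5) & (t < 1.9)])     # receding segment
v_hat = c * (f_app - f_rec) / (f_app + f_rec)
f0_hat = 0.5 * (f_app + f_rec)
print(f"estimated speed ~ {v_hat:.1f} m/s, source frequency ~ {f0_hat:.1f} Hz")
```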
We propose data acquisition from continuous-time signals belonging to the class of real-valued trigonometric polynomials using an event-triggered sampling paradigm. The proposed schemes are: level crossing (LC), close extrema LC, and extrema sampling. An analysis of the robustness of these schemes to jitter and bandpass additive Gaussian noise is presented. In general, event-triggered sampling will result in non-uniformly spaced sample instants. We address the issue of signal reconstruction from the acquired data-set by imposing a sparsity structure on the signal model to circumvent...
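A minimal sketch, under simplifying assumptions, of reconstructing a sparse real trigonometric polynomial from event-triggered (here: extrema) samples. The non-uniform sample set is underdetermined for the full harmonic dictionary, so sparsity is imposed via orthogonal matching pursuit. The harmonic grid, the sparsity level, and the solver are illustrative choices, not the paper's exact schemes.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(2)
N = 50                                    # harmonic dictionary: k = 1..N (cos and sin)
active = np.array([7, 23, 41])            # the few harmonics actually present
c_true = np.zeros(2 * N)
c_true[active - 1] = rng.normal(size=3)          # cosine coefficients
c_true[N + active - 1] = rng.normal(size=3)      # sine coefficients

def dictionary(times):
    cols = [np.cos(2*np.pi*k*times) for k in range(1, N+1)] + \
           [np.sin(2*np.pi*k*times) for k in range(1, N+1)]
    return np.stack(cols, axis=1)

t = np.linspace(0, 1, 50000, endpoint=False)
xt = dictionary(t) @ c_true

# event-triggered acquisition: keep only the local extrema of the waveform
ext = np.where(np.diff(np.sign(np.diff(xt))) != 0)[0] + 1
t_s, x_s = t[ext], xt[ext]
print(f"{len(t_s)} extrema samples vs {2*N} unknown coefficients")

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=6, fit_intercept=False)
omp.fit(dictionary(t_s), x_s)
print("max coefficient error:", np.max(np.abs(omp.coef_ - c_true)))
```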
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants. Studies have reported a differential impact of the variants on the respiratory health of infected patients. We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable patterns, suggesting a possibility to predict the underlying virus variant. We analyze the Coswara dataset, which is composed of three subject pools, namely, i) healthy, ii) COVID-19 subjects recorded during the delta variant...
Traditional approaches for understanding phonological learning have predominantly relied on curated text data. Although insightful, such approaches limit the knowledge captured to textual representations of spoken language. To overcome this limitation, we investigate the potential of the Featural InfoWaveGAN model to learn iterative long-distance vowel harmony using raw speech data. We focus on Assamese, a language known for its phonologically regressive and word-bound vowel harmony. We demonstrate that the model is adept at grasping...