Chitralekha Gupta

ORCID: 0000-0003-1350-9095
Research Areas
  • Music and Audio Processing
  • Speech and Audio Processing
  • Music Technology and Sound Studies
  • Speech Recognition and Synthesis
  • Neuroscience and Music Perception
  • Diverse Musicological Studies
  • Tactile and Sensory Interactions
  • Generative Adversarial Networks and Image Synthesis
  • Phonetics and Phonology Research
  • Stroke Rehabilitation and Recovery
  • Cancer-related molecular mechanisms research
  • Virtual Reality Applications and Impacts
  • Multisensory perception and integration
  • Image Processing and 3D Reconstruction
  • Assistive Technology in Communication and Mobility
  • Gaze Tracking and Assistive Technology
  • Human Motion and Animation
  • Topic Modeling
  • Sentiment Analysis and Opinion Mining
  • Child Development and Digital Technology
  • Reading and Literacy Development
  • Epigenetics and DNA Methylation
  • Natural Language Processing Techniques
  • ICT in Developing Communities
  • Image and Video Quality Assessment

National University of Singapore
2017-2025

Indian Institute of Technology Bombay
2010-2013

Human augmentation is a field of research that aims to enhance human abilities through medicine or technology. This has historically been achieved by consuming chemical substances that improve a selected ability, or by installing ...

10.1145/3702656 article EN interactions 2025-01-01

10.1109/icassp49660.2025.10890164 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Researchers have used machine learning approaches to identify motion sickness in VR experiences. These would certainly benefit from an accurately labeled, real-world, diverse dataset that enables the development of generalizable ML models. We introduce 'VR.net', a dataset comprising 165 hours of gameplay videos from 100 real-world games spanning ten genres, evaluated by 500 participants. VR.net assigns 24 sickness-related labels to each video frame, such as camera/object movement, depth of field, and flow....

10.1109/tvcg.2024.3372044 article EN IEEE Transactions on Visualization and Computer Graphics 2024-03-04

Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes in music. Both are generally essential information in music, i.e., unaccompanied singing vocals mixed with instrumental accompaniment are important components of polyphonic music. In the traditional lyrics transcription task, we first extract the singing vocals from the music and then transcribe the resulting vocals, where the two steps are optimized independently. In this paper, we propose novel end-to-end network architectures designed to disentangle the two for effective single...

10.1109/taslp.2022.3190742 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

Automatic lyrics alignment and transcription in polyphonic music are challenging tasks because the singing vocals are corrupted by the background music. In this work, we propose to learn genre-specific characteristics to train acoustic models. We first compare several automatic speech recognition pipelines for the application of lyrics transcription. We then present the performance of music-informed acoustic models in the best-performing pipeline, and systematically study the impact of genre and language model on performance. With such genre-based...

10.1109/icassp40776.2020.9054567 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

A perceptually valid automatic singing evaluation score could serve as a complement to singing lessons, and make singing training more reachable to the masses. In this study, we adopt the idea behind the PESQ (Perceptual Evaluation of Speech Quality) scoring metric, and propose various perceptually relevant features to evaluate singing quality. We correlate the obtained quality score, which we term Perceptual Evaluation of Singing Quality (PESnQ), with that given by music-expert human judges, and compare the results with known baseline systems. It is shown that the proposed PESnQ has...

10.1109/apsipa.2017.8282110 article EN 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2017-12-01
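A PESnQ-style metric is validated by correlating the machine score with expert ratings. A minimal sketch of that validation step, using Pearson correlation and hypothetical scores (the paper's feature set and data are not reproduced here):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between machine scores and human ratings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: PESnQ-style scores vs. mean expert ratings for 6 singers.
machine = [7.2, 5.1, 8.4, 4.0, 6.3, 9.0]
human = [7.0, 5.5, 8.0, 3.8, 6.9, 9.2]
print(round(pearson_r(machine, human), 3))
```

A correlation near 1 indicates the automatic score tracks expert judgment; the paper reports this against baseline systems.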

Lyrics-to-audio alignment aims to automatically align the lyrical words with the mixed singing audio (singing voice + musical accompaniment). Such alignment can be achieved with an automatic speech recognition (ASR) system. We propose to adapt the acoustic model of a speech recognizer towards solo singing voice. This avoids the hurdles of annotating a large polyphonic music training dataset. Moreover, lexicon-modification based duration modelling has been incorporated to account for long vowels in singing. As a practical application demand on music,...

10.1109/icassp.2019.8682582 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

10.5281/zenodo.1492487 article EN International Symposium/Conference on Music Information Retrieval 2018-09-23

Singing, the vocal production of musical tones, is one of the most important elements of music. Addressing the needs of real-world applications, the study of technologies related to singing voices has become an increasingly active area of research. In this paper, we provide a comprehensive overview of recent developments in the field of singing information processing, specifically the topics of singing skill evaluation, singing voice synthesis, singing voice separation, and lyrics synchronization and transcription. We will especially focus on deep learning approaches...

10.1109/taslp.2022.3190732 article EN cc-by IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

In this paper, we propose a data-driven approach to train a Generative Adversarial Network (GAN) conditioned on "soft-labels" distilled from the penultimate layer of an audio classifier trained on a target set of texture classes. We demonstrate that interpolation between such conditions or control vectors provides smooth morphing between the generated textures, and show similar or better morphing capability compared to state-of-the-art methods. The proposed approach results in a well-organized latent space that generates novel outputs while...

10.1109/icassp49357.2023.10096328 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05
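The morphing described above comes from interpolating condition vectors. A minimal sketch of that interpolation, using hypothetical soft-label vectors (the GAN and the classifier that distills the labels are not shown):

```python
def interpolate_conditions(c1, c2, steps):
    """Linearly interpolate between two soft-label condition vectors,
    yielding one control vector per morph step (endpoints included).
    Requires steps >= 2."""
    assert len(c1) == len(c2) and steps >= 2
    out = []
    for i in range(steps):
        t = i / (steps - 1)
        out.append([(1 - t) * a + t * b for a, b in zip(c1, c2)])
    return out

# Hypothetical soft labels for two audio-texture classes, e.g. "rain" and "fire".
rain = [0.9, 0.1, 0.0]
fire = [0.1, 0.1, 0.8]
for c in interpolate_conditions(rain, fire, 5):
    # each c would condition the generator at successive morph steps
    print([round(v, 2) for v in c])
```

Each intermediate vector conditions the generator, so the output texture drifts smoothly from one class to the other.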

Lyrics transcription of polyphonic music is challenging as the background music affects lyrics intelligibility. Typically, it can be performed by a two-step pipeline, i.e. a singing vocal extraction front-end, followed by a lyrics transcriber back-end, where the front-end and back-end are trained separately. Such a pipeline suffers from both imperfect vocal extraction and the mismatch between the front-end and back-end. In this work, we propose a novel end-to-end integrated fine-tuning framework, which we call PoLyScriber, to globally optimize the vocal extractor and transcriber for lyrics transcription in polyphonic music. The experimental results...

10.1109/taslp.2023.3275036 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2023-01-01

10.21437/interspeech.2018-1267 article EN Interspeech 2018 2018-08-28

Automatic lyrics to polyphonic audio alignment is a challenging task not only because the vocals are corrupted by the background music, but also because there is a lack of annotated corpus for effective acoustic modeling. In this work, we propose (1) using additional speech and music-informed features and (2) adapting acoustic models trained on a large amount of solo singing towards polyphonic music using a small amount of in-domain data. Incorporating information such as voicing and auditory features together with conventional features aims to bring robustness against the increased...

10.21437/interspeech.2019-1520 article EN Interspeech 2019 2019-09-13

Automatic evaluation of singing quality can be done with the help of a reference or the digital sheet music of the song. However, such a standard reference is not always available. In this article, we propose a framework to rank a large pool of singers according to their singing quality without any reference. We define musically motivated absolute measures based on the pitch histogram, and relative measures based on inter-singer statistics, to evaluate singing attributes such as intonation and rhythm. The absolute measures capture the goodness of the pitch histogram of a specific singer, while we use the similarity between singers in terms...

10.1109/taslp.2019.2947737 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-10-16
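One way to picture a pitch-histogram-based absolute measure, as an illustrative sketch rather than the paper's exact definition: fold F0 estimates onto one octave and score how much of the mass sits on the equal-tempered semitone grid.

```python
import math

def pitch_class_histogram(f0_hz, bins_per_semitone=10, ref_hz=440.0):
    """Fold F0 values (Hz) onto one octave (1200 cents) and histogram them."""
    nbins = 12 * bins_per_semitone
    hist = [0] * nbins
    for f in f0_hz:
        cents = (1200.0 * math.log2(f / ref_hz)) % 1200.0
        hist[int(cents / 1200.0 * nbins) % nbins] += 1
    total = sum(hist)
    return [h / total for h in hist]

def semitone_sharpness(hist, bins_per_semitone=10):
    """Illustrative 'goodness' score: fraction of mass within one bin of
    the equal-tempered semitone grid (higher suggests better intonation)."""
    nbins = len(hist)
    score = 0.0
    for k in range(12):
        centre = k * bins_per_semitone
        for off in (-1, 0, 1):
            score += hist[(centre + off) % nbins]
    return score
```

A singer whose F0 values cluster on the semitone grid scores near 1; consistently detuned singing spreads mass between grid points and scores lower.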

Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because music and singing style vary across genres, such as pop, metal, and hip hop, which affects lyrics intelligibility of the song in different ways. In this work, we propose to transcribe lyrics using a novel genre-conditioned network. The proposed network adopts pre-trained model parameters, and incorporates genre adapters between layers to capture genre peculiarities for lyrics-genre pairs, thereby...

10.1109/icassp43922.2022.9747684 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Assistive Augmentation, the intersection of human-computer interaction, assistive technologies and human augmentation, was broadly discussed at a CHI'14 workshop and subsequently published as an edited volume in the Springer Cognitive Science and Technology series. In this workshop, the aim is to propose a more structured way to design Assistive Augmentations. In addition, we discuss challenges and opportunities for Assistive Augmentations in light of current trends in research and technology. Participants need to submit a short position paper or...

10.1145/3582700.3582729 article EN 2023-03-12

Novel AI-generated audio samples are evaluated for descriptive qualities, such as the smoothness of a morph, using crowdsourced human listening tests. However, methods to design interfaces for such experiments and to effectively articulate the quality under test receive very little attention in the evaluation metrics literature. In this paper, we explore the use of visual metaphors of image-schema to evaluate generated audio. Furthermore, we highlight the importance of framing and contextualizing measurement constructs. Using both pitched sounds...

10.1145/3581641.3584083 article EN 2023-03-27

Spatial awareness, particularly awareness of distant environmental scenes known as vista-space, is crucial and contributes to the cognitive and aesthetic needs of People with Visual Impairments (PVIs). In this work, through a formative study with PVIs, we establish the need for vista-space awareness amongst people with visual impairments, and identify possible scenarios where it would be helpful. We investigate the potential of existing sonification techniques as well as AI-based audio generative models to design sounds that can create awareness of such scenes. Our first...

10.1145/3659609 article EN Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies 2024-05-13
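As one illustration of sonification in this space (a hypothetical mapping, not necessarily one used in the study): map an object's distance to the pitch and loudness of a sine tone, so nearer objects sound higher and louder.

```python
import math

def sonify_distance(distance_m, duration_s=0.5, sr=16000,
                    near_hz=880.0, far_hz=220.0, max_m=100.0):
    """Render a sine tone whose pitch and amplitude encode distance.
    This is one common sonification mapping, chosen for illustration."""
    d = min(max(distance_m, 0.0), max_m) / max_m      # normalise to [0, 1]
    freq = near_hz + d * (far_hz - near_hz)           # near -> high pitch
    amp = 1.0 - 0.8 * d                               # near -> loud
    n = int(duration_s * sr)
    return [amp * math.sin(2 * math.pi * freq * i / sr) for i in range(n)]
```

Played back at the given sample rate, a sweep of distances produces a tone that falls in pitch and fades as the scene element recedes.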