Babak Naderi

ORCID: 0009-0006-4778-5417
Research Areas
  • Speech and Audio Processing
  • Mobile Crowdsensing and Crowdsourcing
  • Image and Video Quality Assessment
  • Hearing Loss and Rehabilitation
  • Advanced Adaptive Filtering Techniques
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Open Source Software Innovations
  • Indoor and Outdoor Localization Technologies
  • Technology Adoption and User Behaviour
  • Video Coding and Compression Technologies
  • Visual Attention and Saliency Detection
  • Advanced Image Processing Techniques
  • Natural Language Processing Techniques
  • Advanced Data Compression Techniques
  • Text Readability and Simplification
  • Speech and Dialogue Systems
  • Virtual Reality Applications and Impacts
  • Forecasting Techniques and Applications
  • Advanced Computing and Algorithms
  • Impact of Technology on Adolescents
  • Power Line Communications and Noise
  • Interactive and Immersive Displays
  • Evacuation and Crowd Dynamics
  • Image and Signal Denoising Methods

Microsoft (United States)
2024-2025

Microsoft (Finland)
2023-2024

Technische Universität Berlin
2012-2022

Isfahan University of Medical Sciences
2012-2022

University of Mohaghegh Ardabili
2021

Deutsche Telekom (Germany)
2012-2015

Razi University
2013-2014

In this paper, we present an update to the NISQA speech quality prediction model that is focused on distortions that occur in communication networks. In contrast to the previous version, the model is trained end-to-end, and the time-dependency modelling and time-pooling are achieved through a Self-Attention mechanism. Besides overall speech quality, the model also predicts the four speech quality dimensions Noisiness, Coloration, Discontinuity, and Loudness, and in this way gives more insight into the cause of a quality degradation. Furthermore, new datasets with over 13,000 speech files were created...

10.21437/interspeech.2021-299 article EN Interspeech 2021 2021-08-27

The ITU-T Recommendation P.808 provides a crowdsourcing approach for conducting a subjective assessment of speech quality using the Absolute Category Rating (ACR) method. We provide an open-source implementation of ITU-T Rec. P.808 that runs on the Amazon Mechanical Turk platform. We extended our implementation to include the Degradation Category Ratings (DCR) and Comparison Category Ratings (CCR) test methods. We also significantly speed up the test process by integrating the participant qualification step into the main rating task, compared to a two-stage solution. The program scripts...

10.21437/interspeech.2020-2665 article EN Interspeech 2020 2020-10-25

The ICASSP 2023 Deep Noise Suppression (DNS) Challenge marks the fifth edition of the DNS challenge series. DNS challenges were organized from 2019 to 2023 to foster research in the field of DNS. Previous DNS challenges were held at INTERSPEECH 2020, ICASSP 2021, INTERSPEECH 2021, and ICASSP 2022. This challenge aims to advance models capable of jointly addressing denoising, dereverberation, and interfering talker suppression, with separate tracks focusing on headset and speakerphone scenarios. The challenge facilitates personalized deep noise suppression by providing accompanying enrollment clips for...

10.1109/ojsp.2024.3378602 article EN cc-by-nc-nd IEEE Open Journal of Signal Processing 2024-01-01

The ICASSP 2023 Speech Signal Improvement Challenge is intended to stimulate research in the area of improving the speech signal quality in communication systems. The speech signal quality can be measured with SIG in ITU-T P.835 and is still a top issue in audio communication and conferencing systems. For example, in the ICASSP 2022 Deep Noise Suppression challenge, the improvement in the background and overall quality is impressive, but the improvement in the speech signal is not statistically significant. To improve the speech signal, the following speech impairment areas must be addressed: coloration, discontinuity, loudness, reverberation, and noise. A training and test set...

10.1109/ojsp.2024.3376293 article EN cc-by-nc-nd IEEE Open Journal of Signal Processing 2024-01-01

Subjective speech quality assessment is the gold standard for evaluating speech enhancement processing and telecommunication systems. The commonly used ITU-T Rec. P.800 defines how to measure speech quality in lab environments, and ITU-T Rec. P.808 extended it to crowdsourcing. ITU-T Rec. P.835 extends P.800 to measure the quality of speech in the presence of noise. ITU-T Rec. P.804 targets the conversation test and introduces perceptual speech quality dimensions, which are measured during the listening phase of the conversation. The perceptual dimensions are noisiness, coloration, discontinuity, and loudness. We create a crowd-sourcing implementation...

10.1109/icassp48485.2024.10447225 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

With the coming of age of virtual/augmented reality and interactive media, numerous definitions, frameworks, and models of immersion have emerged across different fields ranging from computer graphics to literary works. Immersion is oftentimes used interchangeably with presence, as both concepts are closely related. However, there are noticeable interdisciplinary differences regarding definitions, scope, and constituents that are required to be addressed so that a coherent understanding of the concepts can be achieved. Such consensus is vital for paving...

10.48550/arxiv.2007.07032 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Cloud Gaming (CG) is an immersive multimedia service that promises many benefits. In CG, the games are rendered in a cloud server, and the resulting scenes are streamed as a video sequence to the client. Using CG, users are not forced to update their gaming hardware frequently, and available games can be played on any operating system or suitable device. However, CG requires a reliable and low-latency network, which makes it a very challenging service. Transmission latency strongly affects the playability of a game and consequently reduces...

10.1145/3339825.3391855 article EN 2020-05-27

With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is traditionally assessed in subjective tests in laboratories and lately also in crowdsourcing, following the international standards from the ITU-T Rec. P.800 series. However, those approaches are costly and cannot...

10.21437/interspeech.2022-10597 article EN Interspeech 2022 2022-09-16

We propose an open-source extension of the ITU-T Rec. P.910 subjective video quality test based on crowdsourcing principles. This addresses the speed, usage cost, and barrier-to-usage issues of P.910. We implement Absolute Category Rating (ACR), ACR with hidden reference (ACR-HR), Degradation Category Rating (DCR), and Comparison Category Rating (CCR), and include rater, environment, hardware, and network qualifications, as well as gold and trapping questions to ensure quality. We have validated that the implementation is both accurate and highly reproducible.

10.1109/icassp48485.2024.10446509 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

The quality of speech communication systems, which include noise suppression algorithms, is typically evaluated in laboratory experiments according to ITU-T Rec. P.835, in which participants rate the background noise, the speech signal, and the overall quality separately. This paper introduces an open-source toolkit for conducting subjective quality evaluation of noise-suppressed speech in crowdsourcing. We followed ITU-T Rec. P.835 and P.808 and highly automate the process to prevent moderator's error. To assess the validity of our evaluation method, we compared the Mean Opinion Scores (MOS), calculate...

10.21437/interspeech.2021-343 article EN Interspeech 2021 2021-08-27

Recently, a new authentication method based on 3D signatures created in the air was proposed for mobile devices [4]. The signature is created using a properly shaped magnet (a rod or ring) taken in hand. It is based on influencing the compass sensor embedded in the new generation of mobile devices. In this paper, we present an implementation of this technology on a mobile device (iPhone 3GS). We can demonstrate the process of authentication based on a gesture created freely in the 3D space around the device by a magnet held in hand. Movement of the magnet produces a temporal change in the magnetic field sensed by the compass sensor, and can be used as a basis for authentication. As the signatures are...

10.1145/2371664.2371705 article EN 2012-09-21

Subjective speech quality assessment has traditionally been carried out in laboratory environments under controlled conditions. With the advent of crowdsourcing platforms, tasks that need human intelligence can be resolved by crowd workers over the Internet. Crowdsourcing also offers a new paradigm for speech quality assessment, promising higher ecological validity of the quality judgments at the expense of potentially lower reliability. This paper compares laboratory-based and crowdsourcing-based speech quality assessments in terms...

10.1007/s41233-020-00042-1 article EN cc-by Quality and User Experience 2020-11-22

The rank correlation coefficients and the rank-based statistical tests (as a subset of non-parametric techniques) might be misleading when they are applied to subjectively collected opinion scores. Those techniques assume that the data is measured at least at an ordinal level, and define a sequence of scores to represent a tied rank when they have precisely equal numeric value. In this paper, we show that the definition of tied rank, as mentioned above, is not suitable for Mean Opinion Scores (MOS) and can lead to misleading conclusions from rank-based statistical techniques. Furthermore,...

10.1109/qomex48832.2020.9123078 preprint EN 2020-05-01

The quality of acoustic echo cancellers (AECs) in real-time communication systems is typically evaluated using objective metrics like ERLE [1] and PESQ [2], and less commonly with lab-based subjective tests like ITU-T Rec. P.831 [3]. We will show that these objective measures are not well correlated to subjective measures. We then introduce an open-source crowdsourcing approach for subjective evaluation of echo impairment, which can be used to evaluate the performance of AECs. We provide a study that shows this tool is highly reproducible. This new tool has been...

10.1109/icassp39728.2021.9414904 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Commonly used datasets for evaluating video codecs are all very high quality and not representative of video typically used in video conferencing scenarios. We present the Video Conferencing Dataset (VCD) for evaluating video codecs for real-time communication, the first such dataset focused on video conferencing. VCD includes a wide variety of camera qualities and spatial and temporal information. It includes both desktop and mobile scenarios and two types of video background processing. We report the compression efficiency of H.264, H.265, H.266, and AV1 in low-delay settings on VCD and compare it with...

10.1109/icassp48485.2024.10448484 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

In this paper, the reliability of responses collected in two crowdsourcing studies is compared. Two methods to evaluate reliability (one noticeable and one unnoticeable method for workers) have been employed. The unnoticeable method was included in both studies; the noticeable method was employed in one study only. The study containing the noticeable check resulted in a higher consistency than the other one. We assume that the difference is a result of the obvious method: workers improve their performance due to the awareness that they are being observed.

10.1109/qomex.2015.7148091 article EN 2015-05-01