Salima Mdhaffar

ORCID: 0000-0002-8472-6890
Research Areas
  • Natural Language Processing Techniques
  • Speech Recognition and Synthesis
  • Speech and dialogue systems
  • Topic Modeling
  • Speech and Audio Processing
  • Music and Audio Processing
  • Intelligent Tutoring Systems and Adaptive Learning
  • Subtitles and Audiovisual Media
  • Phonetics and Phonology Research
  • Mathematics, Computing, and Information Processing
  • French Language Learning Methods
  • Linguistics and Terminology Studies
  • Privacy-Preserving Technologies in Data
  • Semantic Web and Ontologies
  • Video Analysis and Summarization
  • Categorization, perception, and language
  • Educational Tools and Methods
  • Experimental Learning in Engineering
  • Language, Linguistics, Cultural Analysis
  • Lexicography and Language Studies
  • Hate Speech and Cyberbullying Detection
  • Innovations in Educational Methods
  • Geophysical Methods and Applications

Laboratoire Informatique d'Avignon
2020-2024

Le Mans Université
2019-2020

University of Sfax
2019

Self-Supervised Learning (SSL) using huge unlabeled data has been successfully explored for image and natural language processing. Recent works also investigated SSL from speech. They were notably successful in improving performance on downstream tasks such as automatic speech recognition (ASR). While these works suggest it is possible to reduce the dependence on labeled data for building efficient speech systems, their evaluation was mostly made on ASR and using multiple and heterogeneous experimental settings (most of them for English). This...

10.21437/interspeech.2021-556 article EN Interspeech 2021 2021-08-27
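
As a rough illustration of the downstream-ASR evaluation pattern this abstract refers to, the snippet below runs greedy CTC decoding with a pretrained wav2vec 2.0 model through the Hugging Face transformers API; the checkpoint name is only an example, not one of the models studied in the paper.

    import torch
    from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

    # Illustrative checkpoint; any CTC-fine-tuned wav2vec 2.0 model works the same way.
    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
    model.eval()

    def transcribe(waveform, sampling_rate=16_000):
        """Greedy CTC decoding of a mono waveform (1-D float array or list)."""
        inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
        with torch.no_grad():
            logits = model(inputs.input_values).logits   # (batch, frames, vocab)
        ids = torch.argmax(logits, dim=-1)               # best token per frame
        return processor.batch_decode(ids)[0]            # collapses repeats and blanks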

This paper investigates methods to effectively retrieve speaker information from personalized speaker-adapted neural network acoustic models (AMs) in automatic speech recognition (ASR). This problem is especially important in the context of federated learning of ASR, where a global model is learnt on the server based on updates received from multiple clients. We propose an approach to analyze information in neural network AMs based on their footprint on a so-called Indicator dataset. Using this method, we develop two attack models that aim to infer speaker identity from the updated models without access to actual...

10.1109/icassp43922.2022.9746541 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
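
The attack setting in the abstract above can be pictured with the toy sketch below: each model's activations on a shared "indicator" set form a footprint, and a personalized update is linked to the enrolled speaker with the most similar footprint. This is a simplified stand-in under assumed model interfaces, not the paper's exact method.

    import torch
    import torch.nn.functional as F

    def footprint(model, indicator_batch):
        # Mean hidden activation on the shared indicator data; assumes the
        # model maps a batch to features of shape (batch, time, dim).
        with torch.no_grad():
            h = model(indicator_batch)
        return h.mean(dim=(0, 1))

    def infer_speaker(updated_model, enrolled_models, indicator_batch):
        # Attack sketch: match the personalized update to the enrolled
        # speaker whose model footprint is closest in cosine similarity.
        target = footprint(updated_model, indicator_batch)
        sims = torch.stack([
            F.cosine_similarity(target, footprint(m, indicator_batch), dim=0)
            for m in enrolled_models
        ])
        return int(sims.argmax())   # index of the most likely speaker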

Modern Standard Arabic, as well as Arabic dialect languages, is usually written without diacritics. The absence of these marks constitutes a real problem in the automatic processing of this data by NLP tools. Indeed, writing Arabic without diacritics introduces several types of ambiguity. First, a word without diacritics could have many possible meanings depending on its diacritization. Second, the undiacritized surface form of an Arabic word might have up to 200 readings depending on the complexity of its morphology [12]. In fact, the agglutination property of Arabic can produce forms that can only...

10.1145/3297278 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2019-07-12
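
To make the ambiguity concrete, here is a toy Python illustration (the readings and glosses are standard textbook examples, not data from the paper): a single undiacritized surface form maps to several fully diacritized words.

    # One undiacritized form, several diacritized readings.
    readings = {
        "علم": ["عَلَمٌ (flag)", "عِلْمٌ (knowledge)", "عَلِمَ (he knew)", "عَلَّمَ (he taught)"],
    }

    surface = "علم"
    print(f"{surface} has {len(readings[surface])} candidate readings:")
    for reading in readings[surface]:
        print(" -", reading)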

The widespread availability of powerful personal devices capable of collecting the voice of their users has opened the opportunity to build speaker-adapted speech recognition (ASR) systems or to participate in collaborative learning of ASR. In both cases, personalized acoustic models (AMs), i.e. AMs fine-tuned with speaker-specific data, can be built. A question that naturally arises is whether the dissemination of these models can leak personal information. In this paper, we show that it is possible to retrieve the gender of the speaker, but also their identity, by just exploiting the weight...

10.1109/icassp43922.2022.9747231 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
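
A minimal sketch of the kind of attack described above, with hypothetical data: flatten the parameter differences between a personalized model and the global model into a feature vector and train a plain classifier on it. The synthetic arrays only stand in for real weight deltas and gender labels.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def weight_delta_features(adapted_state, global_state):
        # Flatten per-tensor differences into one attack feature vector.
        return np.concatenate([(adapted_state[k] - global_state[k]).ravel()
                               for k in sorted(global_state)])

    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 256))      # stand-in weight-delta features, one row per model
    y = rng.integers(0, 2, size=40)     # stand-in binary gender labels
    clf = LogisticRegression(max_iter=1000).fit(X[:30], y[:30])
    print("held-out accuracy:", clf.score(X[30:], y[30:]))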

This paper presents a study on the use of federated learning to train an ASR model based on wav2vec 2.0, pre-trained by self-supervision. Carried out on the well-known TED-LIUM 3 dataset, our experiments show that such a model can obtain, with no language model, a word error rate of 10.92% on the official TED-LIUM test set, without sharing any data from the different users. We also analyse the ASR performance for speakers depending on their participation in the federated learning. Since federated learning was first introduced for privacy purposes, we measure its ability...

10.1109/icassp49357.2023.10096426 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05
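
For readers unfamiliar with the training scheme, the following is a minimal PyTorch sketch of one federated averaging (FedAvg) round, the standard aggregation that such studies build on; it assumes all state-dict entries are floating-point parameters.

    import copy
    import torch

    def fedavg_round(global_model, client_states, client_sizes):
        # Server step: replace every parameter with the data-size-weighted
        # mean of the clients' locally updated parameters.
        total = float(sum(client_sizes))
        new_state = copy.deepcopy(global_model.state_dict())
        for key in new_state:
            new_state[key] = sum(state[key] * (n / total)
                                 for state, n in zip(client_states, client_sizes))
        global_model.load_state_dict(new_state)
        return global_model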

Recent works showed that end-to-end neural approaches tend to become very popular for spoken language understanding (SLU). Through the term end-to-end, one considers the use of a single model optimized to extract semantic information directly from the speech signal. A major issue for such models is the lack of paired audio and textual data with semantic annotation. In this paper, we propose an approach to build such a model in a scenario in which zero paired audio data is available. Our approach is based on an external model trained to generate a sequence of vectorial representations from text. These...

10.21437/interspeech.2022-10231 article EN Interspeech 2022 2022-09-16
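
The zero-paired-data idea can be caricatured in a few lines: a stand-in external text-to-representation model produces the vector sequences on which the SLU head is trained, so no audio is needed at training time; at test time the same head would consume speech-encoder outputs instead. Every module below is a hypothetical toy, not the paper's architecture.

    import torch
    import torch.nn as nn

    DIM, N_INTENTS = 256, 8
    text_to_rep = nn.Embedding(1000, DIM)   # stand-in external representation generator
    slu_head = nn.Linear(DIM, N_INTENTS)    # semantic classifier trained on text only

    def train_step(token_ids, intent, opt):
        reps = text_to_rep(token_ids)                      # (seq_len, DIM) pseudo-speech reps
        logits = slu_head(reps.mean(dim=0, keepdim=True))  # pool over the sequence
        loss = nn.functional.cross_entropy(logits, intent.view(1))
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()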

Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Antoine Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, Yannick Estève. Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023). 2023.

10.18653/v1/2023.iwslt-1.18 article EN cc-by 2023-01-01

Hang Le, Florentin Barbier, Ha Nguyen, Natalia Tomashenko, Salima Mdhaffar, Souhir Gabiche Gahbiche, Benjamin Lecouteux, Didier Schwab, Yannick Estève. Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). 2021.

10.18653/v1/2021.iwslt-1.20 preprint EN cc-by 2021-01-01

Speech encoders pretrained through self-supervised learning (SSL) have demonstrated remarkable performance in various downstream tasks, including Spoken Language Understanding (SLU) and Automatic Speech Recognition (ASR). For instance, fine-tuning SSL models for such tasks has shown significant potential, leading to improvements over the SOTA across challenging datasets. In contrast to existing research, this paper contributes by comparing the effectiveness of such approaches in the context of (i) low-resource spoken...

10.48550/arxiv.2407.04533 preprint EN arXiv (Cornell University) 2024-07-05
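
One of the axes typically compared in such studies is whether the SSL encoder is frozen (used as a feature extractor) or fine-tuned end to end. A minimal toggle, applicable to any PyTorch encoder, looks like this; the function name is ours, not the paper's.

    def set_ssl_mode(encoder, mode):
        # "frozen": encoder acts as a fixed feature extractor;
        # "finetune": all encoder weights receive gradients.
        assert mode in {"frozen", "finetune"}
        for p in encoder.parameters():
            p.requires_grad = (mode == "finetune")
        trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
        print(f"{mode}: {trainable} trainable encoder parameters")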

Recent works demonstrate that voice assistants do not perform equally well for everyone, but research on the demographic robustness of speech technologies is still scarce. This is mainly due to the rarity of large datasets with controlled demographic tags. This paper introduces the Sonos Voice Control Bias Assessment Dataset, an open dataset composed of voice assistant requests in North American English in the music domain (1,038 speakers, 166 hours, 170k audio samples, 9,040 unique labelled transcripts) with a controlled diversity (gender, age,...

10.48550/arxiv.2405.19342 preprint EN arXiv (Cornell University) 2024-05-14
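
A dataset like this is typically used to compare word error rates across demographic groups. The self-contained helper below computes per-group WER from (group, reference, hypothesis) triples; it is a generic sketch, not the paper's evaluation code.

    from collections import defaultdict

    def wer(ref, hyp):
        # Word error rate via word-level edit distance (single DP row).
        r, h = ref.split(), hyp.split()
        d = list(range(len(h) + 1))
        for i, rw in enumerate(r, 1):
            prev, d[0] = d[0], i
            for j, hw in enumerate(h, 1):
                prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (rw != hw))
        return d[len(h)] / max(len(r), 1)

    def per_group_wer(samples):
        # samples: iterable of (group_tag, reference, hypothesis) triples.
        errors, words = defaultdict(float), defaultdict(int)
        for group, ref, hyp in samples:
            n = len(ref.split())
            errors[group] += wer(ref, hyp) * n
            words[group] += n
        return {g: errors[g] / max(words[g], 1) for g in errors}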

SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available...

10.48550/arxiv.2407.00463 preprint EN arXiv (Cornell University) 2024-06-29
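
For context, running one of the released pre-trained models takes only a few lines; the interface below matches the SpeechBrain 1.x inference API, and the model identifier is one example among the 100+ published models.

    from speechbrain.inference.ASR import EncoderDecoderASR

    asr = EncoderDecoderASR.from_hparams(
        source="speechbrain/asr-crdnn-rnnlm-librispeech",  # pre-trained recipe output
        savedir="pretrained_asr",                          # local cache directory
    )
    print(asr.transcribe_file("example.wav"))              # path to a 16 kHz mono wav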

In speaker recognition systems, embeddings lack explicit speaker-related information, posing challenges for interpretability. Recently, a binary representation of speech extracts, where each coefficient indicates the presence or absence of a given voice attribute, has been proposed to overcome this lack. It consists of an adaptation of the x-vector extractor followed by a binarisation step. This approach proved its worth in terms of explainability, but has two shortcomings. Firstly, the objective shared by attribute modeling...

10.21437/interspeech.2024-1011 article EN Interspeech 2024 2024-09-01
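
The binarisation idea reads naturally in code: threshold a continuous embedding into a 0/1 attribute-presence vector and score pairs with an interpretable set overlap. This toy sketch assumes per-dimension thresholds and is not the paper's extractor.

    import torch

    def binarise(embedding, thresholds):
        # 1 = voice attribute present, 0 = absent.
        return (embedding > thresholds).int()

    def attribute_overlap(a, b):
        # Jaccard similarity between two binary attribute vectors: an
        # interpretable alternative to cosine scoring of raw embeddings.
        inter = torch.logical_and(a.bool(), b.bool()).sum().item()
        union = torch.logical_or(a.bool(), b.bool()).sum().item()
        return inter / union if union else 1.0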

This paper presents the ongoing conception of a set of tools, based on the live transcription of speech during lectures and designed to instrument traditional lectures as well as web conferences or hybrid learning situations. The toolset exploits the interactions taking place during courses, keeps track of them and facilitates their reuse, both in students' studies and in future iterations of the course delivered by the teacher. Its goal is to help students stay focused on the teacher's explanations and to offer greater possibilities of interaction. A prototype was...

10.5220/0007722403590366 article EN cc-by-nc-nd 2019-01-01

This paper presents a study on the use of federated learning to train an ASR model based on wav2vec 2.0, pre-trained by self-supervision. Carried out on the well-known TED-LIUM 3 dataset, our experiments show that such a model can obtain, with no language model, a word error rate of 10.92% on the official test set, without sharing any data from the different users. We also analyse the ASR performance for speakers depending on their participation in the federated learning. Since federated learning was first introduced for privacy purposes, we measure its ability to protect...

10.48550/arxiv.2302.10790 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains including computer vision and natural language processing. Speech processing has drastically benefitted from SSL as most of the current domain-related tasks are now being approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000...

10.48550/arxiv.2309.05472 preprint EN cc-by arXiv (Cornell University) 2023-01-01