- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Speech and Dialogue Systems
- Topic Modeling
- Speech and Audio Processing
- Music and Audio Processing
- Intelligent Tutoring Systems and Adaptive Learning
- Subtitles and Audiovisual Media
- Phonetics and Phonology Research
- Mathematics, Computing, and Information Processing
- French Language Learning Methods
- Linguistics and Terminology Studies
- Privacy-Preserving Technologies in Data
- Semantic Web and Ontologies
- Video Analysis and Summarization
- Categorization, perception, and language
- Educational Tools and Methods
- Experimental Learning in Engineering
- Language, Linguistics, Cultural Analysis
- Lexicography and Language Studies
- Hate Speech and Cyberbullying Detection
- Innovations in Educational Methods
- Geophysical Methods and Applications
Laboratoire Informatique d'Avignon
2020-2024
Le Mans Université
2019-2020
University of Maine School of Law
2019
University of Sfax
2019
Self-Supervised Learning (SSL) using huge amounts of unlabeled data has been successfully explored for image and natural language processing. Recent works also investigated SSL from speech. They were notably successful in improving performance on downstream tasks such as automatic speech recognition (ASR). While these results suggest it is possible to reduce the dependence on labeled data for building efficient speech systems, their evaluation was mostly made on ASR, with multiple and heterogeneous experimental settings (most of them in English). This...
This paper investigates methods to effectively retrieve speaker information from the personalized speaker-adapted neural network acoustic models (AMs) in automatic speech recognition (ASR). This problem is especially important in the context of federated learning of ASR, where a global model is learnt on a server based on the updates received from multiple clients. We propose an approach to analyze the AMs' footprint on a so-called Indicator dataset. Using this method, we develop two attack models that aim to infer the identity of the speaker behind an updated model without access to the actual...
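The linkage idea behind such an attack can be sketched with a deliberately tiny toy example (all names, weights and indicator samples below are hypothetical, not the paper's actual setup): compute each model's "footprint" on shared indicator samples, then attribute an anonymous update to the enrolled speaker whose footprint is most similar.

```python
import math

def footprint(weights, indicator):
    # Hypothetical footprint: outputs of a toy linear model on indicator samples.
    return [sum(w * x for w, x in zip(weights, sample)) for sample in indicator]

def cosine(a, b):
    # Cosine similarity between two footprint vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Shared indicator samples, two enrolled speaker models, one anonymous update.
indicator = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]
speaker_a = [0.9, 0.1]
speaker_b = [0.2, 0.8]
update    = [0.85, 0.15]   # closer to speaker A's model

scores = {name: cosine(footprint(update, indicator), footprint(m, indicator))
          for name, m in [("A", speaker_a), ("B", speaker_b)]}
guess = max(scores, key=scores.get)
print(guess)  # the update is attributed to speaker "A"
```

The point of the sketch is only the linkage mechanism: no raw speech is exchanged, yet model parameters alone suffice to connect an update to a speaker.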
Modern Standard Arabic, as well as Arabic dialect languages, are usually written without diacritics. The absence of these marks constitutes a real problem in the automatic processing of such data by NLP tools. Indeed, writing without diacritics introduces several types of ambiguity. First, a word without diacritics could have many possible meanings depending on its diacritization. Second, an undiacritized surface form of an Arabic word might have as many as 200 readings, given the complexity of its morphology [12]. In fact, the agglutination property of Arabic can produce forms that can only...
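The ambiguity described above can be made concrete with a classic example: the undiacritized string كتب (k-t-b) admits several diacritized readings with distinct meanings. The toy lexicon below is only an illustration of the one-to-many mapping, not the paper's actual resource:

```python
# Toy lexicon: one undiacritized surface form maps to several
# diacritized readings, each with a different meaning.
readings = {
    "كتب": [
        ("كَتَبَ", "he wrote"),
        ("كُتُبٌ", "books"),
        ("كُتِبَ", "it was written"),
    ],
}

def ambiguity(form):
    # Number of candidate diacritizations for an undiacritized form.
    return len(readings.get(form, []))

print(ambiguity("كتب"))  # 3 candidate readings in this toy lexicon
```

A real diacritizer must pick among such candidates using context, which is exactly what makes undiacritized text hard for downstream NLP tools.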
The widespread availability of powerful personal devices capable of collecting the voice of their users has opened up the opportunity to build speaker-adapted automatic speech recognition (ASR) systems, or to participate in collaborative learning of ASR. In both cases, personalized acoustic models (AMs), i.e. AMs fine-tuned with speaker-specific data, can be built. A question that naturally arises is whether the dissemination of personalized acoustic models can leak personal information. In this paper, we show that it is possible to retrieve the gender of a speaker, but also his identity, by just exploiting the weight...
This paper presents a study on the use of federated learning to train an ASR model based on wav2vec 2.0 pre-trained by self-supervision. Carried out on the well-known TED-LIUM 3 dataset, our experiments show that such a model can obtain, with no language model, a word error rate of 10.92% on the official TED-LIUM 3 test set, without sharing any data from the different users. We also analyse the ASR performance for speakers depending on their participation in the federated learning. Since federated learning was first introduced for privacy purposes, we also measure its ability...
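The server-side aggregation step in this kind of setup is typically federated averaging (FedAvg): each client fine-tunes a local copy of the model and the server averages the weights, weighted by local data size, without ever seeing the audio. A minimal sketch on plain weight vectors (the client weights and dataset sizes below are made up for illustration):

```python
def fed_avg(client_weights, client_sizes):
    """Per-coordinate weighted average of client weights (FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three clients fine-tune local copies; the server only sees the weights.
clients = [[0.2, 0.4], [0.6, 0.0], [0.4, 0.8]]
sizes = [100, 300, 100]  # number of local utterances per client
global_weights = fed_avg(clients, sizes)
print(global_weights)  # aggregated global model weights
```

In the real system each "weight vector" is the full set of wav2vec 2.0 parameters (or a subset of fine-tuned layers), but the aggregation arithmetic is the same.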
Recent works showed that end-to-end neural approaches tend to become very popular for spoken language understanding (SLU). Through the term end-to-end, one considers the use of a single model optimized to extract semantic information directly from the speech signal. A major issue with such models is the lack of paired audio and textual data with semantic annotation. In this paper, we propose an approach to build an SLU model in a scenario in which zero paired audio data is available. Our approach is based on an external model trained to generate a sequence of vectorial representations from text. These...
Antoine Laurent, Souhir Gahbiche, Ha Nguyen, Haroun Elleuch, Fethi Bougares, Thiol, Hugo Riguidel, Salima Mdhaffar, Gaëlle Laperrière, Lucas Maison, Sameer Khurana, Yannick Estève. Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023). 2023.
Hang Le, Florentin Barbier, Ha Nguyen, Natalia Tomashenko, Salima Mdhaffar, Souhir Gabiche Gahbiche, Benjamin Lecouteux, Didier Schwab, Yannick Estève. Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). 2021.
Speech encoders pretrained through self-supervised learning (SSL) have demonstrated remarkable performance in various downstream tasks, including Spoken Language Understanding (SLU) and Automatic Speech Recognition (ASR). For instance, fine-tuning SSL models for such tasks has shown significant potential, leading to improvements in the SOTA across challenging datasets. In contrast to existing research, this paper contributes by comparing the effectiveness of SSL approaches in the context of (i) low-resource spoken...
Recent works demonstrate that voice assistants do not perform equally well for everyone, but research on the demographic robustness of speech technologies is still scarce. This is mainly due to the rarity of large datasets with controlled demographic tags. This paper introduces the Sonos Voice Control Bias Assessment Dataset, an open dataset composed of voice assistant requests in North American English in the music domain (1,038 speakers, 166 hours, 170k audio samples, 9,040 unique labelled transcripts) with a controlled demographic diversity (gender, age,...
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available...
In speaker recognition systems, embeddings lack explicit speaker-related information, posing challenges for interpretability. Recently, a binary representation of speech extracts, where each coefficient indicates the presence or absence of a given voice attribute, has been proposed to overcome this lack. It consists of an adaptation of the x-vector extractor followed by a binarisation step. This approach proved its worth in terms of explainability, but suffers from two shortcomings. Firstly, the objective shared by the attribute modeling...
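The binarisation idea can be sketched in a few lines: threshold each coordinate of a continuous embedding so that a 1 means "attribute present" and a 0 means "absent", after which speakers can be compared by counting differing attributes. The attribute names and embeddings below are purely hypothetical; in the actual approach the attribute axes are learned jointly with the extractor, not hand-picked:

```python
# Hypothetical attribute axes for illustration only.
ATTRIBUTES = ["low_pitch", "breathy", "nasal", "fast_rate"]

def binarise(embedding, threshold=0.0):
    """Map a continuous speaker embedding to a binary attribute vector
    (1 = attribute present, 0 = absent)."""
    return [1 if v > threshold else 0 for v in embedding]

def hamming(a, b):
    # Interpretable comparison: number of attributes on which two speakers differ.
    return sum(x != y for x, y in zip(a, b))

emb_speaker1 = [0.8, -0.2, 0.1, -0.9]
emb_speaker2 = [0.7, -0.1, -0.3, -0.8]

b1, b2 = binarise(emb_speaker1), binarise(emb_speaker2)
print(b1, b2, hamming(b1, b2))  # binary codes and their attribute-level distance
```

The gain in interpretability is that a mismatch can be traced back to named attributes, whereas a cosine distance between dense x-vectors cannot.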
This paper presents the ongoing conception of a set of tools, based on the live transcription of speech during lectures, designed to instrument traditional lectures as well as web conferences or hybrid learning situations. The toolset exploits the interactions taking place during courses, keeps track of them and facilitates their reuse, both in the students' personal study and in future iterations of the course delivered by the teacher. Its goal is to help students stay focused on the teacher's explanations and to offer greater possibilities of interaction. A prototype was...
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains, including computer vision and natural language processing. Speech processing has drastically benefitted from SSL, as most current domain-related tasks are now being approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000...