- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Speech and dialogue systems
- Topic Modeling
- Music and Audio Processing
- Speech and Audio Processing
- Advanced Text Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Phonetics and Phonology Research
- Emotion and Mood Recognition
- Neural Networks and Applications
- Language, Linguistics, Cultural Analysis
- Multi-Agent Systems and Negotiation
- Advanced Data Compression Techniques
- Text and Document Classification Technologies
- Semantic Web and Ontologies
- Algorithms and Data Compression
- Social Robot Interaction and HRI
- French Language Learning Methods
- Linguistics and Discourse Analysis
- Text Readability and Simplification
- Risk and Safety Analysis
- Language, Metaphor, and Cognition
- Mathematics, Computing, and Information Processing
- Intelligent Tutoring Systems and Adaptive Learning
Laboratoire Informatique d'Avignon
2003-2024
Centre National de la Recherche Scientifique
2003-2021
Institut polytechnique de Grenoble
2021
Laboratoire d'Informatique de Grenoble
2021
Université Grenoble Alpes
2021
Centre Inria de l'Université Grenoble Alpes
2021
Université Nantes Angers Le Mans
2009-2020
Université d'Avignon et des Pays de Vaucluse
2003-2020
Le Mans Université
2009-2019
University of Maine
2006-2015
Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Věra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nădejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John...
Milind Agarwal, Sweta Agrawal, Antonios Anastasopoulos, Luisa Bentivogli, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, Mingda Chen, William Chen, Khalid Choukri, Alexandra Chronopoulou, Anna Currey, Thierry Declerck, Qianqian Dong, Kevin Duh, Yannick Estève, Marcello Federico, Souhir Gahbiche, Barry Haddow, Benjamin Hsu, Phu Mon Htut, Hirofumi Inaguma, Dávid Javorský, John Judge, Yasumasa Kano, Tom Ko, Rishu Kumar, Pengwei Li, Xutai Ma, Prashant Mathur, Evgeny...
Dialectal Arabic (DA) is significantly different from the Arabic taught in schools and used in written communication and formal speech (broadcast news, religion, politics, etc.). There is much existing research in the field of Sentiment Analysis (SA); however, it is generally restricted to Modern Standard Arabic (MSA) or to a few dialects of economic or political interest. In this paper we are interested in the SA of the Tunisian Dialect. We utilize Machine Learning techniques to determine the polarity of comments. First, we evaluate systems...
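The paper's exact feature set and classifiers are not reproduced here; the snippet below is a minimal polarity-classification sketch in scikit-learn, with hypothetical Tunisian-dialect comments and labels standing in for the real corpus.

```python
# Minimal machine-learning polarity classifier; comments and labels are
# hypothetical placeholders, not data from the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_comments = ["service mumtez barcha", "khedma khayba yeser"]  # hypothetical examples
train_labels = ["positive", "negative"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # character n-grams tolerate dialect spelling variation
    LogisticRegression(max_iter=1000),
)
clf.fit(train_comments, train_labels)
print(clf.predict(["khedma mumtez"]))  # polarity predicted for a new comment
```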
Named entity recognition (NER) is among the SLU tasks that usually extract semantic information from textual documents. Until now, NER from speech has been made through a pipeline process that consists in first applying automatic speech recognition (ASR) to the audio and then applying NER to the ASR outputs. Such an approach has several disadvantages (error propagation, a metric to tune ASR systems that is sub-optimal with regard to the final task, a reduced search space at the ASR output level, ...), and it is known that more integrated approaches outperform sequential ones when they can be applied....
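For contrast with the end-to-end approach, here is a minimal sketch of the pipeline baseline described above (ASR first, then NER on the transcript); `recognize` and `tag_entities` are hypothetical stand-ins, not the paper's models.

```python
# Pipeline baseline: ASR, then entity tagging on the (possibly erroneous) transcript.
from typing import List, Tuple

def recognize(audio_path: str) -> str:
    """Hypothetical ASR front end; a real system would decode the audio file."""
    return "le president de la republique visite avignon"

def tag_entities(transcript: str) -> List[Tuple[str, str]]:
    """Hypothetical NER back end applied to the ASR output."""
    gazetteer = {"avignon": "LOC", "republique": "ORG"}
    return [(word, gazetteer.get(word, "O")) for word in transcript.split()]

# ASR errors propagate into this second step, which motivates end-to-end models.
print(tag_entities(recognize("example.wav")))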
Self-Supervised Learning (SSL) using huge unlabeled data has been successfully explored for image and natural language processing. Recent works have also investigated SSL from speech. They were notably successful in improving performance on downstream tasks such as automatic speech recognition (ASR). While these works suggest it is possible to reduce the dependence on labeled data for building efficient speech systems, their evaluation was mostly made on ASR and under multiple heterogeneous experimental settings (most of them for English). This...
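As an illustration of the downstream evaluation setting, here is a sketch of using a pre-trained speech SSL model as a frozen feature extractor; the checkpoint name is only an example and not necessarily one of the models benchmarked in the paper.

```python
# Frozen SSL feature extraction for a downstream speech task (sketch).
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")  # example checkpoint
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

waveform = torch.zeros(16000)  # 1 s of dummy 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (1, frames, hidden_dim), fed to a downstream head
print(features.shape)
```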
Pre-training for feature extraction is an increasingly studied approach to obtain better continuous representations of audio and text content. In the present work, we use wav2vec and camemBERT as self-supervised learned models to represent our data in order to perform emotion recognition from speech (SER) on AlloSat, a large French emotional database describing the satisfaction dimension, and on the state-of-the-art corpus SEWA, focusing on the valence, arousal and liking dimensions. To the authors' knowledge, this paper presents the first study...
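A sketch of the text side of such a setup: extracting an utterance-level camemBERT representation that could be combined with wav2vec speech features before a regression head; the checkpoint name and the example transcript are assumptions, not the paper's configuration.

```python
# Utterance-level text embedding with camemBERT (sketch).
import torch
from transformers import CamembertModel, CamembertTokenizer

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")  # example checkpoint
model = CamembertModel.from_pretrained("camembert-base").eval()

inputs = tokenizer("je suis satisfait du service", return_tensors="pt")  # hypothetical transcript
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state   # (1, tokens, 768)
text_vec = hidden.mean(dim=1)                    # mean-pool into one utterance vector
print(text_vec.shape)                            # torch.Size([1, 768])
```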
In Self-Supervised Learning (SSL), pre-training and evaluation are resource intensive. In the speech domain, current indicators of the quality of SSL models during pre-training, such as the loss, do not correlate well with downstream performance. Consequently, it is often difficult to gauge the final downstream performance in a cost-efficient manner during pre-training. In this work, we propose unsupervised methods that give insights into the quality of SSL models, namely, measuring the cluster quality and the rank of the embeddings of the model. Results show that these measures correlate better with downstream performance than...
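One of the indicators mentioned, the rank of the embeddings, can be illustrated with an entropy-based effective-rank computation; the random matrix below stands in for real SSL frame embeddings.

```python
# Effective rank of an embedding matrix as an unsupervised quality indicator (sketch).
import numpy as np

def effective_rank(embeddings: np.ndarray) -> float:
    """Entropy-based effective rank of the singular value spectrum."""
    s = np.linalg.svd(embeddings - embeddings.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

frames = np.random.randn(2000, 768)   # placeholder for frame embeddings from an SSL model
print(effective_rank(frames))          # higher values suggest a richer representation space
```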
Self-supervised learning from raw speech has been proven beneficial for improving automatic speech recognition (ASR). We investigate here its impact on end-to-end automatic speech translation (AST) performance. We use a contrastive predictive coding (CPC) model pre-trained on unlabeled speech as a feature extractor for the downstream AST task. We show that self-supervised pre-training is particularly efficient in low resource settings and that fine-tuning CPC models on the AST training data further improves performance. Even in higher resource settings, ensembling models trained...
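The ensembling step can be illustrated by averaging next-token distributions from several independently trained models; the vocabulary and probabilities below are made-up placeholders, not outputs of the paper's systems.

```python
# Ensembling translation models by averaging their next-token distributions (sketch).
import numpy as np

vocab = ["▁bonjour", "▁salut", "</s>"]
model_a = np.array([0.6, 0.3, 0.1])   # hypothetical next-token distribution from model A
model_b = np.array([0.4, 0.5, 0.1])   # hypothetical next-token distribution from model B

ensemble = (model_a + model_b) / 2.0  # average probabilities before picking the next token
print(vocab[int(ensemble.argmax())])
```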
Thanks to a remarkable ability to show amusement and engagement, laughter is one of the most important social markers in human interactions. Laughing together can actually help set up a positive atmosphere and favors the creation of new relationships. This paper presents a data collection of interaction dialogs involving humor between a human participant and a robot. In this work, scenarios have been designed in order to study social markers such as laughter. They have been implemented within two automatic systems developed in the Joker project: a dialog...
This work investigates speaker adaptation and transfer learning for spoken language understanding (SLU). We focus on the direct extraction of semantic tags from the audio signal using an end-to-end neural network approach. We demonstrate that the performance of the target predictive function for the slot filling task can be substantially improved by various knowledge transfer approaches. First, we explore speaker adaptive training (SAT) of SLU models and propose to use zero pseudo i-vectors for more efficient model initialization and pretraining in...
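A sketch of speaker-adaptive input construction in this spirit: each acoustic frame is concatenated with a speaker i-vector, and a zero pseudo i-vector can be substituted for speaker-independent initialization; all dimensions are illustrative assumptions.

```python
# Appending a speaker vector to every acoustic frame (sketch of SAT-style inputs).
import numpy as np

def add_speaker_vector(frames: np.ndarray, ivector: np.ndarray) -> np.ndarray:
    """Concatenate the same speaker vector to each frame of acoustic features."""
    return np.concatenate([frames, np.tile(ivector, (frames.shape[0], 1))], axis=1)

frames = np.random.randn(300, 40)     # 300 filterbank frames (placeholder)
zero_ivec = np.zeros(100)             # zero pseudo i-vector for speaker-independent pretraining
speaker_ivec = np.random.randn(100)   # estimated i-vector for the target speaker

print(add_speaker_vector(frames, zero_ivec).shape)     # (300, 140)
print(add_speaker_vector(frames, speaker_ivec).shape)  # (300, 140)
```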
This paper investigates methods to effectively retrieve speaker information from the personalized, speaker-adapted neural network acoustic models (AMs) in automatic speech recognition (ASR). This problem is especially important in the context of federated learning of ASR, where a global model is learnt on a server based on updates received from multiple clients. We propose an approach to analyze the AMs' footprint on a so-called Indicator dataset. Using this method, we develop two attack models that aim to infer the identity of the speaker behind an updated model without access to the actual...
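A toy sketch of the footprint idea: compare the behaviour of an updated personalized model on an Indicator set against known speakers' models and pick the closest; the linear "models" and data below are simulated, not the paper's attack.

```python
# Nearest-footprint speaker inference on an Indicator dataset (toy sketch).
import numpy as np

rng = np.random.default_rng(0)
indicator = rng.standard_normal((50, 20))   # Indicator dataset features (placeholder)

def footprint(weights: np.ndarray) -> np.ndarray:
    """Hypothetical footprint: the model's outputs on the Indicator data."""
    return indicator @ weights

known = {"spk1": rng.standard_normal(20), "spk2": rng.standard_normal(20)}
updated = known["spk2"] + 0.01 * rng.standard_normal(20)   # update received from an unknown client

scores = {spk: np.linalg.norm(footprint(w) - footprint(updated)) for spk, w in known.items()}
print(min(scores, key=scores.get))          # the attack's guess for the client's identity
```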
This paper presents the system used by LIUM to participate in ESTER, the French broadcast news evaluation campaign. It is based on the CMU Sphinx 3.3 (fast) decoder. Some tools are presented which have been added at different steps of the recognition process: segmentation, acoustic model adaptation, and word-lattice rescoring. Several experiments were conducted to study the effects of the signal segmentation process, of injecting automatically transcribed data into the training corpora, and of testing approaches for acoustic model adaptation. The results...
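The rescoring step can be illustrated with a simple n-best re-ranking that combines acoustic and language-model scores; the hypotheses and scores below are illustrative placeholders, not outputs of the LIUM system.

```python
# N-best rescoring with a weighted combination of acoustic and LM scores (sketch).
hypotheses = [
    {"text": "le premier ministre est arrive", "am": -120.0, "lm": -35.0},
    {"text": "le premier ministre est arrive a", "am": -121.5, "lm": -33.0},
]

lm_weight = 10.0  # language-model scale factor, normally tuned on development data
best = max(hypotheses, key=lambda h: h["am"] + lm_weight * h["lm"])
print(best["text"])
```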
This paper addresses the problem of automatic speech recognition (ASR) error detection and its use for improving spoken language understanding (SLU) systems. In this study, the SLU task consists in automatically extracting semantic concepts and concept/value pairs from ASR transcriptions, e.g. in a touristic information system. An approach is proposed for enriching the set of semantic labels with error-specific labels and for using a recently proposed neural approach based on word embeddings to compute well calibrated confidence measures. Experimental...
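A sketch of a confidence estimator in this spirit: a classifier over per-word features whose predicted probability serves as a calibrated confidence score; the features and synthetic labels are assumptions, not the paper's neural architecture.

```python
# Per-word ASR error detector whose predicted probability acts as a confidence measure (sketch).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 2))                    # synthetic [ASR posterior, embedding-context similarity] per word
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # synthetic labels: 1 = word correct, 0 = ASR error

detector = LogisticRegression().fit(X, y)
confidence = detector.predict_proba([[0.9, 0.8]])[0, 1]   # probability the word is correct
print(round(confidence, 3))
```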