- Speech Recognition and Synthesis
- Speech and Audio Processing
- Music and Audio Processing
- Speech and dialogue systems
- Advanced Data Compression Techniques
- Natural Language Processing Techniques
- Face recognition and analysis
- Phonetics and Phonology Research
- AI in Service Interactions
- Biometric Identification and Security
- Social Robot Interaction and HRI
- Digital Media Forensic Detection
- Context-Aware Activity Recognition Systems
- Neural Networks and Applications
- Face and Expression Recognition
- Technology Use by Older Adults
- Blind Source Separation Techniques
- Video Analysis and Summarization
- User Authentication and Security Systems
- Voice and Speech Disorders
- Advanced Image and Video Retrieval Techniques
- Video Surveillance and Tracking Methods
- Time Series Analysis and Forecasting
- Multi-Agent Systems and Negotiation
- Image Retrieval and Classification Techniques
Telecom SudParis
2009-2024
Institut Polytechnique de Paris
2009-2024
UCB Pharma (France)
2023-2024
Institut Mines-Télécom
2013-2024
Centre National de la Recherche Scientifique
2005-2021
Intelligent Health (United Kingdom)
2016-2021
Multimedia University
2007-2018
Télécom Paris
2008-2017
Laboratoire Traitement et Communication de l’Information
2008-2017
Laboratoire Traitement du Signal et de l'Image
2007-2016
Speech recordings are a rich source of personal, sensitive data that can be used to support a plethora of diverse applications, from health profiling to biometric recognition. It is therefore essential that speech recordings are adequately protected so that they cannot be misused. Such protection, in the form of privacy-preserving technologies, is required to ensure that: (i) the profiles of a given individual (e.g., across different service operators) are unlinkable; (ii) leaked, encrypted information is irreversible, and (iii) references are renewable....
This paper evaluates the performance of the twelve primary systems submitted to the evaluation on speaker verification in the context of a mobile environment using the MOBIO database. The database provides a challenging and realistic test-bed for current state-of-the-art speaker verification techniques. Results in terms of equal error rate (EER), half total error rate (HTER) and detection error trade-off (DET) confirm that the best performing systems are based on variability modeling and on the fusion of several sub-systems. Nevertheless, the good old UBM-GMM based systems are still competitive. The results also show that the use...
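The EER and HTER metrics named in the abstract above can be illustrated with a minimal sketch. The function names and toy scores here are hypothetical and not taken from the MOBIO evaluation; the sketch assumes a higher score means "more likely genuine" and that a trial is accepted when its score is at or above the threshold.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # false acceptances
    frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejections
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def half_total_error_rate(genuine, impostor, threshold):
    """HTER: the average of FAR and FRR at a fixed threshold."""
    far, frr = error_rates(genuine, impostor, threshold)
    return (far + frr) / 2
```

In practice the HTER threshold is usually fixed on a development set and then applied to a separate test set, whereas the EER is read directly off the scores being evaluated.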
Voice-based digital assistants such as Apple's Siri and Google's Now are currently booming. Yet, despite their promise of being context-aware and adapted to a user's preferences and very distinct needs, truly personal assistants are still missing. In this paper we highlight some of the challenges in building personalized speech-operated assistive technology and propose a number of research and development directions we have undertaken in order to solve them. In particular, we focus on natural language understanding and dialog management...
With a substantial rise in life expectancy throughout the last century, society faces the imperative of seeking inventive approaches to foster active aging and provide adequate care. The e-VITA initiative, jointly funded by the European Union and Japan, centers on an advanced virtual coaching methodology designed to target essential aspects of promoting healthy aging. This paper describes the technical framework underlying the system platform and presents preliminary feedback on its use. At the core is the Manager, a pivotal...
Congenital hypothyroidism (CH) is a leading preventable cause of intellectual and developmental disorders, with a prevalence of 1 in 2,000 to 4,000 newborns. Neonatal screening programs play a crucial role in early detection and in the prevention of long-term neurodevelopmental consequences. This article presents the case of a 6-year-old female patient with a history of delayed growth and neuropsychomotor development due to CH, exacerbated by poor adherence to levothyroxine treatment. The patient exhibited typical clinical features, including short...
The article deals with a technique of voice forgery using the ALISP (automatic language independent speech processing) approach. Such a technique allows the voice of an arbitrary person (the impostor) to be transformed, forging the identity of another person (the client). Our goal is to demonstrate that an automatic speaker recognition system could be seriously threatened by a transformation of this kind. For this purpose, we use a speaker verification system to calculate the likelihood that the forged voice belongs to the genuine client. Experiments on NIST 2004 evaluation data show that the equal error...
A recent article in Phonetica has shown the effect of linguistic and paralinguistic factors on the frequency of elision of the French mute-e. The present study deals primarily with its spectral characteristics and identifiability. It also describes a new computer-assisted technique of acoustic analysis which advantageously replaces conventional spectrograms. The results show that the entity in question has hardly half the phonetic individuality of the other vowels, while it is closest in color to /ø/, only as discrete.
This paper presents recent developments on our “silent speech interface” that converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. We investigate here the use of statistical mapping techniques, based on the joint modeling of visual and spectral features, using respectively Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). The prediction...
Recent product releases such as Apple's Siri and Google's Voice Search have strongly emphasized the use of voice as a modern interaction modality. Seniors, in particular, might appreciate an alternative to small mobile phone keypads, touchpads and computer mice. This paper presents initial explorations of how elderly people would interact with language-technology-driven interfaces, how these interactions measure up against traditional physical interaction channels, and what features they may require to satisfy the needs of this...
Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, the transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual correspondence measures. Finally, the use of synchrony measures for biometric identity verification based on talking faces...
The paper compares, on a database recorded in a car, a number of signal analysis and speech enhancement techniques as well as some approaches to adapt speech recognition systems. It is shown that a new nonlinear spectral subtraction associated with Mel frequency cepstral coefficients (MFCC) is an adequate compromise for low-cost integration. The Lombard effect is analyzed and simulated. Such a simulation is used to derive realistic training utterances from noise-free utterances. Adapting a continuous-density hidden Markov model...
We investigate the use of an audio-visual speech synchrony measure in the framework of identity verification based on talking faces. Two synchrony measures, based on canonical correlation analysis and co-inertia analysis respectively, are introduced and their performances are evaluated on the specific task of detecting synchronized and not-synchronized audio-visual speech sequences. The notion of high-effort impostor attacks is also introduced as a dangerous threat for current biometric systems based on speaker verification and face recognition. A novel modality is introduced in order to improve the overall performance of identity verification,...
This paper presents a strategy for enabling speech recognition to be performed in the cloud whilst preserving the privacy of users. The approach advocates a demarcation of responsibilities between the client and server-side components performing the speech recognition task. On the client-side resides the acoustic model, which symbolically encodes the audio and encrypts the data before uploading it to the server. The server then employs searchable encryption to enable a phonetic search of the speech content. Some preliminary results on speech encoding and searchable encryption are presented.
This paper reports on the results of the first lab trials evaluating the vAssist (Voice Controlled Assistive Care and Communication Services for the Home) system prototype with Italian users. vAssist is a European project aiming to provide specific voice controlled home care and communication services for the elderly. An important objective is a multilingual Voice User Interface (VUI) in three different languages: Italian, French and German. Lab trials were foreseen in these three countries to assess the VUI against realistic user expectations and requirements....
Since life expectancy has increased significantly over the past century, society is being forced to discover innovative ways to support active aging and elderly care. The e-VITA project, which receives funding from both the European Union and Japan, is built on a cutting edge method of virtual coaching that focuses on the key areas of active and healthy aging. The requirements for the virtual coach were ascertained through a process of participatory design in workshops, focus groups, and living laboratories in Germany, France, Italy, and Japan. Several use...
Support vector machines (SVM) are a new and very promising classification technique developed from the theory of structural risk minimisation. We propose an alternative out-of-vocabulary word detection method relying on confidence measures and support vector machines. Confidence measures are computed from phone level information provided by a hidden Markov model (HMM) based speech recognizer. We use three kinds of averaging techniques, namely arithmetic, geometric and harmonic averages, to compute a confidence measure for each word. The...
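The three averaging techniques mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the phone-level confidences are assumed to be strictly positive values such as posterior probabilities in (0, 1], and the SVM classification stage is not shown.

```python
import math


def arithmetic_confidence(phone_scores):
    """Arithmetic mean of the phone-level confidences of one word."""
    return sum(phone_scores) / len(phone_scores)


def geometric_confidence(phone_scores):
    """Geometric mean, computed in the log domain (scores must be > 0)."""
    return math.exp(sum(math.log(s) for s in phone_scores) / len(phone_scores))


def harmonic_confidence(phone_scores):
    """Harmonic mean; a single low phone score pulls it down the most."""
    return len(phone_scores) / sum(1.0 / s for s in phone_scores)


def word_confidence_features(phone_scores):
    """Bundle the three averages as one feature vector for a word."""
    return (
        arithmetic_confidence(phone_scores),
        geometric_confidence(phone_scores),
        harmonic_confidence(phone_scores),
    )
```

For positive scores the harmonic mean never exceeds the geometric mean, which never exceeds the arithmetic mean, so the three values differ in how strongly they penalise one poorly matched phone within a word.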