Gérard Chollet

ORCID: 0000-0003-4245-146X
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Speech and dialogue systems
  • Advanced Data Compression Techniques
  • Natural Language Processing Techniques
  • Face recognition and analysis
  • Phonetics and Phonology Research
  • AI in Service Interactions
  • Biometric Identification and Security
  • Social Robot Interaction and HRI
  • Digital Media Forensic Detection
  • Context-Aware Activity Recognition Systems
  • Neural Networks and Applications
  • Face and Expression Recognition
  • Technology Use by Older Adults
  • Blind Source Separation Techniques
  • Video Analysis and Summarization
  • User Authentication and Security Systems
  • Voice and Speech Disorders
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Time Series Analysis and Forecasting
  • Multi-Agent Systems and Negotiation
  • Image Retrieval and Classification Techniques

Telecom SudParis
2009-2024

Institut Polytechnique de Paris
2009-2024

UCB Pharma (France)
2023-2024

Institut Mines-Télécom
2013-2024

Centre National de la Recherche Scientifique
2005-2021

Intelligent Health (United Kingdom)
2016-2021

Multimedia University
2007-2018

Télécom Paris
2008-2017

Laboratoire Traitement et Communication de l’Information
2008-2017

Laboratoire Traitement du Signal et de l'Image
2007-2016

Speech recordings are a rich source of personal, sensitive data that can be used to support a plethora of diverse applications, from health profiling to biometric recognition. It is therefore essential that speech recordings are adequately protected so that they cannot be misused. Such protection, in the form of privacy-preserving technologies, is required to ensure that: (i) the profiles of a given individual (e.g., across different service operators) are unlinkable; (ii) leaked, encrypted information is irreversible; and (iii) references are renewable....

10.1016/j.csl.2019.06.001 article EN cc-by Computer Speech & Language 2019-06-08

This paper evaluates the performance of the twelve primary systems submitted to the evaluation on speaker verification in the context of a mobile environment using the MOBIO database. The database provides a challenging and realistic test-bed for current state-of-the-art techniques. Results in terms of equal error rate (EER), half total error rate (HTER) and detection error trade-off (DET) confirm that the best performing systems are based on variability modeling and on the fusion of several sub-systems. Nevertheless, the good old UBM-GMM is still competitive. The results also show that the use...

10.1109/icb.2013.6613025 preprint EN 2013-06-01
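The error measures named in the abstract above can be illustrated with a minimal Python sketch. It is not the evaluation's scoring tool: the function names, the "accept when score >= threshold" convention, and the toy scores are all assumptions made for illustration. FAR is the fraction of impostor trials accepted, FRR the fraction of genuine trials rejected, HTER their mean at a given threshold, and the EER is approximated by the threshold where the two rates are closest.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR), assuming a trial is accepted when score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def hter(genuine, impostor, threshold):
    """Half total error rate: mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(genuine, impostor, threshold)
    return (far + frr) / 2


def eer(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold."""
    def gap(threshold):
        far, frr = far_frr(genuine, impostor, threshold)
        return abs(far - frr)

    candidates = sorted(set(genuine) | set(impostor))
    best = min(candidates, key=gap)  # threshold where FAR and FRR are closest
    return hter(genuine, impostor, best), best


# Hypothetical toy scores, not taken from any database.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
rate, threshold = eer(genuine_scores, impostor_scores)
print(f"approximate EER = {rate:.2f} at threshold {threshold}")
```

With so few trials the EER is only a coarse approximation; with real score lists the same scan becomes finer-grained, and a DET curve is obtained by plotting FRR against FAR over all thresholds.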

Voice-based digital assistants such as Apple's Siri and Google's Now are currently booming. Yet, despite their promise of being context-aware and adapted to a user's preferences and very distinct needs, truly personal assistants are still missing. In this paper we highlight some of the challenges in building personalized speech-operated assistive technology and propose a number of research and development directions we have undertaken in order to solve them. In particular we focus on natural language understanding and dialog management...

10.1109/atsip.2014.6834655 preprint EN 2014-03-01

With a substantial rise in life expectancy throughout the last century, society faces the imperative of seeking inventive approaches to foster active aging and provide adequate care. The e-VITA initiative, jointly funded by the European Union and Japan, centers on an advanced virtual coaching methodology designed to target essential aspects of promoting healthy aging. This paper describes the technical framework underlying the e-VITA system platform and presents preliminary feedback on its use. At the core is the Manager, a pivotal...

10.3390/s24020638 article EN cc-by Sensors 2024-01-19

Congenital hypothyroidism (CH) is a leading preventable cause of intellectual and developmental disorders, with a prevalence of 1 in 2,000 to 4,000 newborns. Neonatal screening programs play a crucial role in early detection and in the prevention of long-term neurodevelopmental consequences. This article presents the case of a 6-year-old female patient with a history of delayed growth and neuropsychomotor development due to CH, exacerbated by poor adherence to levothyroxine treatment. The patient exhibited typical clinical features including short...

10.34119/bjhrv8n1-473 article EN Brazilian Journal of Health Review 2025-02-25

10.1109/vrw66409.2025.00106 article EN 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) 2025-03-08

The article deals with a technique of voice forgery using the ALISP (automatic language independent speech processing) approach. Such a technique allows the voice of an arbitrary person (the impostor) to be transformed, forging the identity of another person (the client). Our goal is to demonstrate that an automatic speaker recognition system could be seriously threatened by a transformation of this kind. For this purpose, we use a speaker verification system to calculate the likelihood that the forged voice belongs to the genuine client. Experiments on NIST 2004 evaluation data show that the equal error...

10.1109/icassp.2005.1415039 article EN 2006-10-11

A recent article in Phonetica has shown the effect of linguistic and paralinguistic factors on the frequency of elision of the French mute-e. The present study deals primarily with its spectral characteristics and identifiability. It also describes a new computer-assisted technique of acoustic analysis which advantageously replaces conventional spectrograms. The results show that the entity in question has hardly half the phonetic individuality of the other vowels, while it is closest in color to /ø/, only as discrete.

10.1159/000259866 article EN Phonetica 1977-01-01

This paper presents recent developments on our “silent speech interface” that converts tongue and lip motions, captured by ultrasound and video imaging, into audible speech. In our previous studies, the mapping between the observed articulatory movements and the resulting speech sound was achieved using a unit selection approach. We investigate here the use of statistical mapping techniques, based on the joint modeling of visual and spectral features, using respectively Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). The prediction...

10.21437/interspeech.2011-239 article EN Interspeech 2022 2011-08-27

Recent product releases such as Apple's Siri and Google's Voice Search have strongly emphasized the use of voice as a modern interaction modality. Seniors, in particular, might appreciate an alternative to small mobile phone keypads, touchpads and computer mice. This paper presents initial explorations of how elderly people would interact with language-technology-driven interfaces, how these interactions measure up against traditional physical interaction channels, and what features they may require to satisfy the needs of this...

10.1145/2504335.2504391 preprint EN 2013-05-29

Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, the transformations performed on audio, visual, or joint feature spaces, and the actual measure of correspondence. Finally, the use of synchrony measures for biometric identity verification based on talking faces...

10.1155/2007/70186 article EN cc-by EURASIP Journal on Advances in Signal Processing 2007-05-07

The paper compares, on a database recorded in a car, a number of signal analysis and speech enhancement techniques as well as some approaches to adapt speech recognition systems. It is shown that a new nonlinear spectral subtraction associated with Mel frequency cepstral coefficients (MFCC) is an adequate compromise for low-cost integration. The Lombard effect is analyzed and simulated. Such a simulation is used to derive realistic training utterances from noise-free utterances. Adapting a continuous-density hidden Markov model...

10.1109/89.466660 article EN IEEE Transactions on Speech and Audio Processing 1995-01-01

We investigate the use of an audio-visual speech synchrony measure in the framework of identity verification based on talking faces. Two synchrony measures, based on canonical correlation analysis and co-inertia analysis respectively, are introduced and their performances are evaluated on the specific task of detecting synchronized and not-synchronized sequences. The notion of high-effort impostor attacks is also introduced as a dangerous threat for current biometric systems based on speaker and face recognition. A novel modality is introduced in order to improve the overall performance of verification,...

10.1109/icassp.2007.366215 preprint EN 2007-01-01

This paper presents a strategy for enabling speech recognition to be performed in the cloud whilst preserving the privacy of users. The approach advocates a demarcation of responsibilities between the client and server-side components performing the speech recognition task. On the client-side resides the acoustic model, which symbolically encodes the audio and encrypts the data before uploading to the server. The server then employs searchable encryption to enable the phonetic search of the speech content. Some preliminary results for the encoding are presented.

10.1109/icassp.2017.7953391 preprint EN 2017-03-01

This paper reports on the results of the first lab trials evaluating the vAssist (Voice Controlled Assistive Care and Communication Services for the Home) system prototype with Italian users. vAssist is a European Project aiming to provide specific voice controlled home care and communication services for the elderly. An important objective of vAssist is to provide a multilingual Voice User Interface (VUI) in three different languages: Italian, French and German. Lab trials were foreseen in these three countries to assess the VUI against realistic user expectations and requirements....

10.1109/coginfocom.2014.7020425 preprint EN 2014-11-01

Since life expectancy has increased significantly over the past century, society is being forced to discover innovative ways to support active aging and elderly care. The e-VITA project, which receives funding from both the European Union and Japan, is built on a cutting edge method of virtual coaching that focuses on the key areas of active and healthy aging. The requirements for the virtual coach were ascertained through a process of participatory design in workshops, focus groups, and living laboratories in Germany, France, Italy, and Japan. Several use...

10.3390/s23052748 article EN cc-by Sensors 2023-03-02

Support vector machines (SVM) are a new and very promising classification technique developed from the theory of structural risk minimisation. We propose an alternative out-of-vocabulary word detection method relying on confidence measures and support vector machines. Confidence measures are computed from phone level information provided by a hidden Markov model (HMM) based speech recognizer. We use three kinds of averaging techniques, namely arithmetic, geometric and harmonic averages, to compute a confidence measure for each word. The...

10.1109/icassp.2003.1198849 article EN 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003-11-20
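The three averages mentioned in the abstract above can be sketched in a few lines of Python. This is an illustration only, not the paper's implementation: the function name `word_confidence`, the assumption that phone-level scores are strictly positive, and the example values are all hypothetical.

```python
import math


def word_confidence(phone_scores, kind="arithmetic"):
    """Combine strictly positive phone-level scores into one word-level score."""
    if not phone_scores:
        raise ValueError("at least one phone score is required")
    n = len(phone_scores)
    if kind == "arithmetic":
        return sum(phone_scores) / n
    if kind == "geometric":
        # Computed in the log domain to avoid underflow on long words.
        return math.exp(sum(math.log(s) for s in phone_scores) / n)
    if kind == "harmonic":
        return n / sum(1.0 / s for s in phone_scores)
    raise ValueError(f"unknown average: {kind}")


# Hypothetical phone-level scores for one word.
scores = [0.9, 0.8, 0.2]
for kind in ("arithmetic", "geometric", "harmonic"):
    print(kind, round(word_confidence(scores, kind), 3))
```

For positive scores the harmonic average is never larger than the geometric one, which is never larger than the arithmetic one, so a single poorly scored phone pulls the harmonic word score down the most.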