Carmén García Mateo

ORCID: 0000-0001-6856-939X
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Speech and dialogue systems
  • Natural Language Processing Techniques
  • Advanced Data Compression Techniques
  • Galician and Iberian cultural studies
  • Phonetics and Phonology Research
  • Biometric Identification and Security
  • Spanish Linguistics and Language Studies
  • User Authentication and Security Systems
  • Voice and Speech Disorders
  • Advanced Adaptive Filtering Techniques
  • Hand Gesture Recognition Systems
  • Digital Filter Design and Implementation
  • Emotion and Mood Recognition
  • Topic Modeling
  • Video Analysis and Summarization
  • Face recognition and analysis
  • Mental Health via Writing
  • Journalism and Media Studies
  • Multi-Agent Systems and Negotiation
  • Hearing Impairment and Communication
  • Linguistic Studies and Language Acquisition
  • Face and Expression Recognition

Universidade de Vigo
2011-2025

Centro Tecnolóxico de Telecomunicacións de Galicia
2015-2017

Universidade de Santiago de Compostela
2016

Vicinay Cadenas (Spain)
2016

Universidad de Sonora
2016

Telefonica Research and Development
2008

The Dialogue
2002

European Telecommunications Standards Institute
2000-2002

IBM (United States)
1991

A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It is comprised of more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with a desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Also, signature and fingerprint data have been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using desktop PC. Acquisition...

10.1109/tpami.2009.76 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2009-04-18

Virtual assistants (VAs) have gained widespread popularity across a wide range of applications, and the integration of Large Language Models (LLMs), such as ChatGPT, has opened up new possibilities for developing even more sophisticated VAs. However, this integration poses ethical issues and challenges that must be carefully considered, particularly as these systems are increasingly used in public services: transfer of personal data, decision-making transparency, potential biases, and privacy risks. This paper, an...

10.3390/electronics12143170 article EN Electronics 2023-07-21

Large language models (LLMs) have revolutionized the field of artificial intelligence in both academia and industry, transforming how we communicate, search for information, and create content. However, these models face knowledge cutoffs and costly updates, driving a new ecosystem of LLM-based applications that leverage interaction techniques to extend capabilities and facilitate updates. As these models grow more complex, understanding their internal workings becomes increasingly challenging, posing significant issues...

10.3390/app15031192 article EN cc-by Applied Sciences 2025-01-24

Virtual assistants (VAs) have gained widespread popularity across a wide range of applications, and the integration of Large Language Models (LLMs) such as ChatGPT has opened up new possibilities for developing even more sophisticated VAs. However, this integration poses ethical issues and challenges that must be carefully considered, particularly as these systems are increasingly used in public services: transfer of personal data, decision-making transparency, potential biases, and privacy risks. This paper, an...

10.20944/preprints202306.0196.v1 preprint EN 2023-06-02

Despite the advances in the e-learning domain during the last decades, there is a lack of a suitable mechanism to carry out assessment with appropriate measures to avoid cheating. Current LMSs do not provide the needed features to check that the intended student is taking the online exam by himself, or even to know if he has spent the whole session time in front of the computer. This paper presents a web-based application that offers biometric authentication based on face recognition. This application, which can be easily integrated in currently...

10.1109/icalt.2008.184 article EN 2008-01-01

Clinical depression can be considered a soft biometric trait that can help to characterize an individual. This mood disorder can be involved in forensic psychological assessment, due to its relevance in different legal issues. The automatic detection of depressed speech has been an object of research in the last years, resulting in different algorithmic approaches and acoustic features. Due to the use of different algorithms, databases and performance measures, deciding which ones are more suitable for this task is difficult. In this work, features was...

10.1109/iwbf.2014.6914245 article EN 2014-03-01

Besides the optimization of biometric error rates, the overall security system performance with respect to intentional attacks plays an important role for biometric-enabled authentication schemes. As traditionally most user authentication schemes are knowledge and/or possession based, firstly in this paper we present a methodology for the security analysis of Internet-based biometric authentication systems by enhancing known methodologies such as the CERT attack taxonomy with a more detailed view on the OSI model. Secondly, as a proof of concept, the guidelines extracted from this methodology are strictly applied...

10.1117/12.767632 article EN Proceedings of SPIE 2008-02-14

This paper describes a large-scale experiment in which eight research institutions have tested their audio partitioning and labeling algorithms on the same data, a multi-lingual database of news broadcasts, using the same evaluation tools and protocols. The experiments provide more insight in the cross-lingual robustness of the methods, and they demonstrated that by further collaborating in the domains of speaker change detection and speaker clustering it should be possible to achieve further technological progress in the near future.

10.21437/interspeech.2005-68 article EN Interspeech 2005 2005-09-04

Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data, whereas STD focuses on a selected list of search terms that must be detected within the speech data. This paper presents the systems submitted to the ALBAYZIN 2014 evaluation, held as a part of the evaluation campaign in the context of the IberSPEECH...

10.1186/s13636-015-0063-8 article EN cc-by EURASIP Journal on Audio Speech and Music Processing 2015-08-06

This paper addresses the challenge of integrating low-resource languages into multilingual automatic speech recognition (ASR) systems. We introduce a novel application of weighted cross-entropy, typically used for unbalanced datasets, to facilitate the integration of low-resource languages into pre-trained multilingual ASR models within the context of continual learning. We fine-tune the Whisper model on five high-resource languages and one low-resource language, employing language-weighted dynamic cross-entropy and data augmentation. The results show a remarkable 6.69% word error...

10.21437/interspeech.2024-734 preprint EN Interspeech 2024 2024-09-01

Soft biometrics comprises the biological traits that are not sufficient for person authentication but can help to narrow the search space. Evidence of mental health state can be considered as a soft biometric, as it provides valuable information about the identity of an individual. Different approaches have been used for the automatic classification of speech in "depressed" or "non-depressed", but the differences in algorithms, features, databases and performance measures make it difficult to draw conclusions about which features and techniques...

10.1109/mipro.2014.6859774 article EN 2014-05-01

Cross-lingual query-by-example spoken term detection (QbE STD) has caught the attention of speech researchers, as it makes it possible to develop systems for low-resource languages, in which the available amount of labelled data makes training automatic speech recognition approaches prohibitive. The use of phonetic posteriorgrams for speech representation combined with dynamic time warping search is a widely used approach for this task, but little attention has been focused on the suitability of the set of phonetic units to represent the phonetic information of a different language. This...

10.1109/asru.2015.7404798 article EN 2015-12-01

In this paper we present a fast method to implement the language model (LM) look-ahead algorithm in a Viterbi-based, single-lexical-tree speech recognizer. We have used three different mechanisms to speed up the look-ahead calculation: a cache memory attached to each node or network, a pre-calculation of the look-ahead probabilities of the active contexts, and an organization of the LM using a perfect hash. These enhancements make it possible to use a full trigram to compute the look-ahead with better overall results, both in terms of recognition rate and computation time, than...

10.1109/icassp.2002.5743815 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-05-01

The paper deals with the task of audio segmentation in TV broadcast news. A multimedia approach for this purpose, by means of audio and video processing, is proposed. Thus, the system is composed of two differentiated parts: one analyzes the audio stream, based on the well-known Bayesian information criterion (BIC), whereas the other part extracts useful information from the video stream to improve the performance of the BIC. An investigation of the parameters involved in the BIC formulation is also accomplished, in order to achieve the best results possible in our experimental framework:...

10.1109/icassp.2004.1325999 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2004-09-28