Hugo Van hamme

ORCID: 0000-0003-1331-5186
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Blind Source Separation Techniques
  • Topic Modeling
  • EEG and Brain-Computer Interfaces
  • Advanced Adaptive Filtering Techniques
  • Phonetics and Phonology Research
  • Advanced Data Compression Techniques
  • Neural Networks and Applications
  • Neural dynamics and brain function
  • Voice and Speech Disorders
  • Language Development and Disorders
  • Fault Detection and Control Systems
  • Control Systems and Identification
  • Hearing Loss and Rehabilitation
  • Multimodal Machine Learning Applications
  • Structural Health Monitoring Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Electrical Measurement Techniques
  • Music Technology and Sound Studies
  • Direction-of-Arrival Estimation Techniques
  • Video Analysis and Summarization

KU Leuven
2016-2025

École Supérieure des Arts Saint-Luc de Liège
2018

University of Lomé
2016

iMinds
2016

Radboud University Nijmegen
2009

Vrije Universiteit Brussel
1987-2003

Fund for Scientific Research
2003

Vrije Universiteit Amsterdam
1992

This paper gives a survey of frequency-domain identification methods for rational transfer functions in the Laplace (s) or z-domain. The interrelations between the different approaches are highlighted through a study of the (equivalent) cost functions. The properties of the various estimators are discussed and illustrated by several examples.

10.1109/9.333769 article EN IEEE Transactions on Automatic Control 1994-01-01

The properties of five interpolating fast Fourier transform (IFFT) methods are studied with respect to their systematic errors and noise sensitivity for a monofrequency signal. It is shown that windows with small spectral side lobes do not always result in better overall performance of the IFFT method, and that time-domain estimators can be more efficient than the analyzed methods.

10.1109/19.137352 article EN IEEE Transactions on Instrumentation and Measurement 1992-04-01
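A common interpolating-FFT idea from this line of work is to refine the coarse FFT peak location by fitting a parabola through the log-magnitudes around the peak bin. The sketch below illustrates that idea in numpy; the Hann window, signal parameters, and the log-parabolic variant are illustrative assumptions, not the paper's exact five estimators.

```python
import numpy as np

def ifft_freq_estimate(x, fs):
    """Estimate the frequency of a monofrequency signal by parabolic
    interpolation of the windowed FFT log-magnitude peak (one IFFT
    variant; an illustrative sketch, not the paper's estimators)."""
    n = len(x)
    w = np.hanning(n)                        # window with modest side lobes
    spec = np.abs(np.fft.rfft(x * w))
    k = int(np.argmax(spec))                 # coarse peak: nearest FFT bin
    a, b, c = np.log(spec[k - 1: k + 2])     # log-magnitudes around the peak
    delta = 0.5 * (a - c) / (a - 2 * b + c)  # parabola vertex offset in bins
    return (k + delta) * fs / n

fs = 8000.0
t = np.arange(4096) / fs
f_true = 440.37                              # deliberately off-bin frequency
f_est = ifft_freq_estimate(np.sin(2 * np.pi * f_true * t), fs)
```

Without interpolation the estimate is quantized to the bin width (here fs/n ≈ 1.95 Hz); the parabolic refinement recovers the off-bin frequency to a small fraction of a bin.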

Abstract To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these are ill-equipped to model a highly dynamic, complex and non-linear system like the brain, and they often require a substantial amount of subject-specific training data. This work introduces a novel decoder architecture: the Very Large Augmented Auditory Inference (VLAAI) network. The VLAAI network outperformed state-of-the-art subject-independent models (median...

10.1038/s41598-022-27332-2 article EN cc-by Scientific Reports 2023-01-16

Researchers investigating the neural mechanisms underlying speech perception often employ electroencephalography (EEG) to record brain activity while participants listen to spoken language. The high temporal resolution of EEG enables the study of responses to fast and dynamic speech signals. Previous studies have successfully extracted speech characteristics from EEG data and, conversely, predicted EEG features from speech. Machine learning techniques are generally employed to construct encoding and decoding models, which necessitate a...

10.3390/data9080094 article EN cc-by Data 2024-07-26

An effective way to increase the noise robustness of automatic speech recognition is to label noisy features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean estimates. Conventional imputation techniques employ parametric models and impute on a frame-by-frame basis. At low signal-to-noise ratios (SNRs), these techniques fail, because too many time frames may contain few, if any, reliable features. In this paper, we introduce a novel non-parametric, exemplar-based method for...

10.1109/jstsp.2009.2039171 article EN IEEE Journal of Selected Topics in Signal Processing 2010-03-01
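The reliable/unreliable split described above can be sketched with a simple frame-based nearest-exemplar rule: match each frame to a clean training exemplar using only its reliable dimensions, then copy the exemplar's values into the missing ones. This is a deliberately simplified stand-in; the paper's method operates on multi-frame windows with sparse exemplar weights.

```python
import numpy as np

def exemplar_impute(noisy, mask, exemplars):
    """Replace unreliable features (mask == 0) with values from the clean
    exemplar that best matches the reliable features (mask == 1).
    Simplified frame-based sketch of exemplar-based imputation."""
    imputed = noisy.copy()
    for i, (frame, m) in enumerate(zip(noisy, mask)):
        rel = m.astype(bool)
        # distance to each exemplar, measured on reliable dimensions only
        d = np.sum((exemplars[:, rel] - frame[rel]) ** 2, axis=1)
        best = exemplars[np.argmin(d)]
        imputed[i, ~rel] = best[~rel]   # copy clean values for missing dims
    return imputed

# Toy example: two clean exemplars, one frame with a corrupted third feature.
exemplars = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
noisy = np.array([[1.1, 2.1, 99.0]])
mask = np.array([[1, 1, 0]])
result = exemplar_impute(noisy, mask, exemplars)
```

At low SNR the reliable set shrinks, which is exactly where the frame-by-frame match becomes ambiguous and multi-frame exemplars help.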

We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation. Building on recent work in noise-robust automatic speech recognition, we model events as linear combinations of dictionary atoms, and mixtures as overlapping events. The weights of the activated atoms in an observation serve directly as evidence for the underlying event classes. Atoms that span multiple frames are created by extracting all possible fixed-length exemplars from the training data. To combat the data scarcity of small...

10.1109/waspaa.2013.6701847 article EN 2013-10-01
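The activation-as-evidence idea can be sketched as follows: hold an exemplar dictionary W fixed, solve for non-negative activations H, and sum the activations per class. This sketch uses Euclidean multiplicative updates on a toy two-atom dictionary; the paper uses fixed-length multi-frame exemplars and a sparse, divergence-based variant.

```python
import numpy as np

def nmf_activations(V, W, iters=500, eps=1e-9):
    """Solve V ≈ W @ H for non-negative activations H with the exemplar
    dictionary W held fixed, via Euclidean multiplicative updates."""
    H = np.ones((W.shape[1], V.shape[1]))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy dictionary: two single-frame "atoms", one per event class.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
V = W @ np.array([[2.0], [3.0]])       # observation mixing both events
H = nmf_activations(V, W)
atom_class = np.array([0, 1])          # class label of each dictionary atom
evidence = np.bincount(atom_class, weights=H[:, 0])
```

The per-class sums in `evidence` play the role of the detection scores: the more strongly a class's atoms must be activated to explain the observation, the more evidence for that event.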

In this paper, three utterance modelling approaches, namely Gaussian Mean Supervector (GMS), i-vector and Gaussian Posterior Probability Supervector (GPPS), are applied to the accent recognition problem. For each modelling method, different classifiers, namely Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and Sparse Representation Classifier (SRC), are employed to find suitable matches between the modelling schemes and the classifiers. The evaluation database is formed by using English utterances of speakers whose native languages are Russian,...

10.1109/icassp.2013.6639089 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

10.1016/j.engappai.2014.05.003 article EN Engineering Applications of Artificial Intelligence 2014-06-07

Unseen noise estimation is a key yet challenging step to make a speech enhancement algorithm work in adverse environments. At worst, the only prior knowledge we have about the encountered noise is that it is different from the involved speech. Therefore, by subtracting the components which cannot be adequately represented by a well-defined speech model, the noises can be estimated and removed. Given the good performance of deep learning in signal representation, a deep auto-encoder (DAE) is employed in this work for accurately modeling the clean speech spectrum. In...

10.1109/taslp.2015.2498101 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2015-11-05
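The subtraction idea, estimating noise as whatever a clean-speech model cannot represent, can be sketched with a low-rank PCA basis standing in for the deep auto-encoder. The toy "spectra", subspace rank, and PCA stand-in are illustrative assumptions, not the paper's model.

```python
import numpy as np

def fit_clean_model(clean_spectra, k=2):
    """Stand-in for the DAE: a rank-k PCA model of clean magnitude spectra."""
    mean = clean_spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(clean_spectra - mean, full_matrices=False)
    return mean, vt[:k]

def estimate_noise(noisy_spectrum, mean, basis):
    """Whatever the clean-speech model cannot represent is treated as noise."""
    clean_fit = mean + (noisy_spectrum - mean) @ basis.T @ basis
    return np.maximum(noisy_spectrum - clean_fit, 0.0)

clean = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [2.0, 1.0, 0.0]])       # toy "spectra" in a 2-D subspace
mean, basis = fit_clean_model(clean)
noise_est = estimate_noise(clean[0] + np.array([0.0, 0.0, 0.5]), mean, basis)
```

Here the added component lies outside the clean subspace, so the residual recovers it; a DAE plays the same role with a non-linear manifold instead of a linear subspace.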

In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called the "Histogram of Acoustic Co-occurrence" (HAC). During training, recurring patterns are discovered and associated with words through non-negative matrix factorisation. During testing, word activations are computed from the HAC representation...

10.21437/interspeech.2008-633 article EN Interspeech 2008-09-22
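The fixed-dimension property of the HAC representation can be illustrated with a minimal co-occurrence counter over a discrete event sequence: count ordered pairs of event labels within a lag window, giving a vector whose size depends only on the number of event types, not on utterance length. The lag window and event discretization here are simplifying assumptions.

```python
import numpy as np

def hac(events, n_types, max_lag=5):
    """Histogram of Acoustic Co-occurrence: count ordered pairs of event
    labels co-occurring within max_lag positions, yielding a fixed-size
    vector regardless of sequence length. A simplified sketch."""
    h = np.zeros((n_types, n_types))
    for i, a in enumerate(events):
        for b in events[i + 1 : i + 1 + max_lag]:
            h[a, b] += 1
    return h.ravel()

vec = hac([0, 1, 0], n_types=2, max_lag=1)
```

Because the vector size is `n_types ** 2` for any input length, HAC vectors from utterances of different durations can be stacked into the matrix that NMF then factorises.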

Motivated by the success of i-vectors in the field of speaker recognition, this paper proposes a new approach for age estimation from telephone speech patterns based on i-vectors. In this method, each utterance is modeled by its corresponding i-vector. Then, Support Vector Regression (SVR) is applied to estimate the age of the speakers. The proposed method is trained and tested on conversations from the National Institute of Standards and Technology (NIST) 2010 and 2008 Speaker Recognition Evaluation databases. Evaluation results show that it outperforms...

10.21437/interspeech.2012-169 article EN Interspeech 2012-09-09
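The regression step, mapping one fixed-length vector per utterance to an age, can be sketched as follows. Ordinary least squares stands in here for the paper's Support Vector Regression (both learn a linear map from i-vector to age), and the 5-dimensional "i-vectors" and linear age relation are synthetic assumptions; real i-vectors are typically several hundred dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 utterances, each a 5-dim "i-vector" with a
# linear dependence of age on the first coordinate.
ivecs = rng.normal(size=(200, 5))
ages = 30 + 10 * ivecs[:, 0] + rng.normal(0, 1, 200)

A = np.hstack([ivecs, np.ones((200, 1))])       # add a bias column
coef, *_ = np.linalg.lstsq(A, ages, rcond=None)  # OLS stand-in for SVR

def predict_age(ivec):
    """Predict age from a single i-vector using the fitted linear map."""
    return np.append(ivec, 1.0) @ coef

train_mae = np.mean(np.abs(A @ coef - ages))
```

SVR differs mainly in its epsilon-insensitive loss and optional kernels; the utterance-to-vector-to-regressor pipeline is the same.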

Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-computer interfaces. In this context, so far mainly linear models have been used. However, the decoding performance of a linear model is limited due to the complex and highly non-linear nature of auditory processing in the human brain. We present a novel Long Short-Term Memory (LSTM)-based architecture as a nonlinear model for the classification problem of whether...

10.1109/icassp40776.2020.9054000 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Learning a set of tasks in sequence remains a challenge for artificial neural networks, which, in such scenarios, tend to suffer from Catastrophic Forgetting (CF). The same applies to End-to-End (E2E) Automatic Speech Recognition (ASR) models, even for monolingual tasks. In this paper, we aim to overcome CF for E2E ASR by inserting adapters, small architectures with few parameters which allow a general model to be fine-tuned to a specific task, into our model. We make these adapters task-specific, while regularizing the...

10.1109/icassp49357.2023.10095837 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05
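A typical adapter is a small bottleneck block with a residual connection, initialized so it starts as the identity and only gradually deviates during task-specific fine-tuning. The numpy sketch below shows that structure; the dimensions, ReLU non-linearity, and zero up-projection initialization are common conventions assumed here, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Adapter:
    """Bottleneck adapter: down-project, non-linearity, up-project, plus a
    residual connection. Illustrative forward pass only (no training)."""
    def __init__(self, dim, bottleneck):
        self.down = rng.normal(0.0, 0.02, (dim, bottleneck))
        self.up = np.zeros((bottleneck, dim))   # zero init => identity map

    def __call__(self, h):
        # residual + small learned correction
        return h + np.maximum(h @ self.down, 0.0) @ self.up

adapter = Adapter(dim=16, bottleneck=4)
h = rng.normal(size=(2, 16))                    # a batch of hidden states
out = adapter(h)
```

Because only the adapter's few parameters are updated per task, the shared E2E ASR backbone stays frozen, which is what limits catastrophic forgetting.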

Abstract INTRODUCTION The automated analysis of connected speech using natural language processing (NLP) emerges as a possible biomarker for Alzheimer's disease (AD). However, it remains unclear which types of speech are most sensitive and specific for the detection of AD. METHODS We applied an NLP model to automatically transcribed speech from 114 Flemish-speaking individuals, first to distinguish early AD patients from amyloid-negative cognitively unimpaired (CU) individuals, and then amyloid-positive from amyloid-negative CU individuals, on five different types of speech. RESULTS The model was able to distinguish between...

10.1002/alz.14530 article EN cc-by-nc-nd Alzheimer's & Dementia 2025-01-27

The recent advancement of speech recognition technology has been driven by large-scale datasets and attention-based architectures, but many challenges still remain, especially for low-resource languages and dialects. This paper explores the integration of weakly supervised transcripts from TV subtitles into automatic speech recognition (ASR) systems, aiming to improve both verbatim transcriptions and automatically generated subtitles. To this end, the two types of data are regarded as different domains or languages, due to their distinct...

10.48550/arxiv.2502.03212 preprint EN arXiv (Cornell University) 2025-02-05

Automatic speech recognition (ASR) systems often struggle to recognize speech from individuals with dysarthria, a disorder with neuromuscular causes, with accuracy declining further for unseen speakers and content. Achieving robustness in such situations requires ASR systems to address speaker-independent and vocabulary-mismatched scenarios, minimizing user adaptation effort. This study focuses on comprehensive training strategies and methods to tackle these challenges, leveraging the transformer-based Wav2Vec2.0 model. Unlike prior...

10.3390/app15042006 article EN cc-by Applied Sciences 2025-02-14

10.1109/icassp49660.2025.10887592 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

We present a Character-Word Long Short-Term Memory Language Model which reduces both the perplexity with respect to a baseline word-level language model and the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By concatenating word and character embeddings, we achieve up to 2.77% relative improvement on English compared to a word-level model with a similar amount of parameters, and 4.57% on Dutch. Moreover, also...

10.18653/v1/e17-1040 article EN cc-by 2017-01-01
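The concatenation of word and character embeddings can be sketched minimally as below. Mean-pooling the character embeddings is a simplifying assumption for illustration; the paper composes characters with learned parameters inside the LSTM language model, and the vocabularies and dimensions here are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
WORD_DIM, CHAR_DIM = 8, 4
word_emb = {"cat": rng.normal(size=WORD_DIM)}               # tiny word table
char_emb = {c: rng.normal(size=CHAR_DIM)
            for c in "abcdefghijklmnopqrstuvwxyz"}          # character table

def embed(word):
    """Concatenate a word embedding with a mean-pooled character embedding.
    OOV words get a zero word part, so the character part still carries
    structural information about the unseen word."""
    w = word_emb.get(word, np.zeros(WORD_DIM))
    c = np.mean([char_emb[ch] for ch in word], axis=0)
    return np.concatenate([w, c])

in_vocab = embed("cat")     # word + char information
oov = embed("dog")          # char information only; word part is zeros
```

The combined vector feeds the LSTM in place of a plain word embedding, which is why out-of-vocabulary and infrequent words benefit most.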