Hugo Van hamme

ORCID: 0000-0003-1331-5186
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Blind Source Separation Techniques
  • Topic Modeling
  • EEG and Brain-Computer Interfaces
  • Advanced Adaptive Filtering Techniques
  • Phonetics and Phonology Research
  • Advanced Data Compression Techniques
  • Neural Networks and Applications
  • Neural dynamics and brain function
  • Voice and Speech Disorders
  • Language Development and Disorders
  • Fault Detection and Control Systems
  • Control Systems and Identification
  • Hearing Loss and Rehabilitation
  • Multimodal Machine Learning Applications
  • Structural Health Monitoring Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Electrical Measurement Techniques
  • Music Technology and Sound Studies
  • Direction-of-Arrival Estimation Techniques
  • Video Analysis and Summarization

KU Leuven
2016-2025

École Supérieure des Arts Saint-Luc de Liège
2018

University of Lomé
2016

iMinds
2016

Radboud University Nijmegen
2009

Vrije Universiteit Brussel
1987-2003

Fund for Scientific Research
2003

Vrije Universiteit Amsterdam
1992

This paper gives a survey of frequency-domain identification methods for rational transfer functions in the Laplace (s) or z-domain. The interrelations between the different approaches are highlighted through a study of the (equivalent) cost functions. The properties of the various estimators are discussed and illustrated by several examples.

10.1109/9.333769 article EN IEEE Transactions on Automatic Control 1994-01-01

The properties of five interpolating fast Fourier transform (IFFT) methods are studied with respect to their systematic errors and noise sensitivity for a monofrequency signal. It is shown that windows with small spectral side lobes do not always result in better overall performance of the IFFT method, and that time-domain estimators can be more efficient than the analyzed methods.

10.1109/19.137352 article EN IEEE Transactions on Instrumentation and Measurement 1992-04-01
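A common interpolating-FFT idea from this line of work is to refine the coarse FFT peak location by fitting a parabola through the log-magnitudes around the peak bin. The sketch below illustrates that idea in numpy; the Hann window, signal parameters, and the log-parabolic variant are illustrative assumptions, not the paper's exact five estimators.

```python
import numpy as np

def ifft_freq_estimate(x, fs):
    """Estimate the frequency of a monofrequency signal by parabolic
    interpolation of the windowed FFT log-magnitude peak (one IFFT
    variant; an illustrative sketch, not the paper's estimators)."""
    n = len(x)
    w = np.hanning(n)                        # window with modest side lobes
    spec = np.abs(np.fft.rfft(x * w))
    k = int(np.argmax(spec))                 # coarse peak: nearest FFT bin
    a, b, c = np.log(spec[k - 1: k + 2])     # log-magnitudes around the peak
    delta = 0.5 * (a - c) / (a - 2 * b + c)  # parabola vertex offset in bins
    return (k + delta) * fs / n

fs = 8000.0
t = np.arange(4096) / fs
f_true = 440.37                              # deliberately off-bin frequency
f_est = ifft_freq_estimate(np.sin(2 * np.pi * f_true * t), fs)
```

Without interpolation the estimate is quantized to the bin width (here fs/n ≈ 1.95 Hz); the parabolic refinement recovers the off-bin frequency to a small fraction of a bin.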

Abstract To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features. However, these are ill-equipped to model a highly dynamic, complex and non-linear system like the brain, and they often require a substantial amount of subject-specific training data. This work introduces a novel decoder architecture: the Very Large Augmented Auditory Inference (VLAAI) network. The VLAAI network outperformed state-of-the-art subject-independent models (median...

10.1038/s41598-022-27332-2 article EN cc-by Scientific Reports 2023-01-16

Researchers investigating the neural mechanisms underlying speech perception often employ electroencephalography (EEG) to record brain activity while participants listen to spoken language. The high temporal resolution of EEG enables the study of responses to fast and dynamic speech signals. Previous studies have successfully extracted speech characteristics from EEG data and, conversely, predicted EEG features from speech. Machine learning techniques are generally employed to construct encoding and decoding models, which necessitate a...

10.3390/data9080094 article EN cc-by Data 2024-07-26

An effective way to increase the noise robustness of automatic speech recognition is to label noisy features as either reliable or unreliable (missing), and to replace (impute) the missing ones by clean estimates. Conventional imputation techniques employ parametric models and impute on a frame-by-frame basis. At low signal-to-noise ratios (SNRs), these techniques fail, because too many time frames may contain few, if any, reliable features. In this paper, we introduce a novel non-parametric, exemplar-based method for...

10.1109/jstsp.2009.2039171 article EN IEEE Journal of Selected Topics in Signal Processing 2010-03-01
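The reliable/unreliable split described above can be sketched with a simple frame-based nearest-exemplar rule: match each frame to a clean training exemplar using only its reliable dimensions, then copy the exemplar's values into the missing ones. This is a deliberately simplified stand-in; the paper's method operates on multi-frame windows with sparse exemplar weights.

```python
import numpy as np

def exemplar_impute(noisy, mask, exemplars):
    """Replace unreliable features (mask == 0) with values from the clean
    exemplar that best matches the reliable features (mask == 1).
    Simplified frame-based sketch of exemplar-based imputation."""
    imputed = noisy.copy()
    for i, (frame, m) in enumerate(zip(noisy, mask)):
        rel = m.astype(bool)
        # distance to each exemplar, measured on reliable dimensions only
        d = np.sum((exemplars[:, rel] - frame[rel]) ** 2, axis=1)
        best = exemplars[np.argmin(d)]
        imputed[i, ~rel] = best[~rel]   # copy clean values for missing dims
    return imputed

# Toy example: two clean exemplars, one frame with a corrupted third feature.
exemplars = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
noisy = np.array([[1.1, 2.1, 99.0]])
mask = np.array([[1, 1, 0]])
result = exemplar_impute(noisy, mask, exemplars)
```

At low SNR the reliable set shrinks, which is exactly where the frame-by-frame match becomes ambiguous and multi-frame exemplars help.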

We present a novel, exemplar-based method for audio event detection based on non-negative matrix factorisation. Building on recent work in noise-robust automatic speech recognition, we model events as linear combinations of dictionary atoms, and mixtures as overlapping events. The weights of the activated atoms in an observation serve directly as evidence for the underlying event classes. Atoms that span multiple frames are created by extracting all possible fixed-length exemplars from the training data. To combat the data scarcity of small...

10.1109/waspaa.2013.6701847 article EN 2013-10-01
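The activation-as-evidence idea can be sketched as follows: hold an exemplar dictionary W fixed, solve for non-negative activations H, and sum the activations per class. This sketch uses Euclidean multiplicative updates on a toy two-atom dictionary; the paper uses fixed-length multi-frame exemplars and a sparse, divergence-based variant.

```python
import numpy as np

def nmf_activations(V, W, iters=500, eps=1e-9):
    """Solve V ≈ W @ H for non-negative activations H with the exemplar
    dictionary W held fixed, via Euclidean multiplicative updates."""
    H = np.ones((W.shape[1], V.shape[1]))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy dictionary: two single-frame "atoms", one per event class.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
V = W @ np.array([[2.0], [3.0]])       # observation mixing both events
H = nmf_activations(V, W)
atom_class = np.array([0, 1])          # class label of each dictionary atom
evidence = np.bincount(atom_class, weights=H[:, 0])
```

The per-class sums in `evidence` play the role of the detection scores: the more strongly a class's atoms must be activated to explain the observation, the more evidence for that event.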

In this paper, three utterance modelling approaches, namely Gaussian Mean Supervector (GMS), i-vector and Gaussian Posterior Probability Supervector (GPPS), are applied to the accent recognition problem. For each modelling method, different classifiers, namely Support Vector Machine (SVM), Naive Bayesian Classifier (NBC) and Sparse Representation Classifier (SRC), are employed to find suitable matches between the modelling schemes and the classifiers. The evaluation database is formed by using English utterances of speakers whose native languages are Russian,...

10.1109/icassp.2013.6639089 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

10.1016/j.engappai.2014.05.003 article EN Engineering Applications of Artificial Intelligence 2014-06-07

Unseen noise estimation is a key yet challenging step to make a speech enhancement algorithm work in adverse environments. At worst, the only prior knowledge we have about the encountered noise is that it is different from the involved speech. Therefore, by subtracting the components which cannot be adequately represented by a well-defined speech model, the noises can be estimated and removed. Given the good performance of deep learning in signal representation, a deep auto-encoder (DAE) is employed in this work for accurately modeling the clean speech spectrum. In...

10.1109/taslp.2015.2498101 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2015-11-05
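The subtraction idea, estimating noise as whatever a clean-speech model cannot represent, can be sketched with a low-rank PCA basis standing in for the deep auto-encoder. The toy "spectra", subspace rank, and PCA stand-in are illustrative assumptions, not the paper's model.

```python
import numpy as np

def fit_clean_model(clean_spectra, k=2):
    """Stand-in for the DAE: a rank-k PCA model of clean magnitude spectra."""
    mean = clean_spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(clean_spectra - mean, full_matrices=False)
    return mean, vt[:k]

def estimate_noise(noisy_spectrum, mean, basis):
    """Whatever the clean-speech model cannot represent is treated as noise."""
    clean_fit = mean + (noisy_spectrum - mean) @ basis.T @ basis
    return np.maximum(noisy_spectrum - clean_fit, 0.0)

clean = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [2.0, 1.0, 0.0]])       # toy "spectra" in a 2-D subspace
mean, basis = fit_clean_model(clean)
noise_est = estimate_noise(clean[0] + np.array([0.0, 0.0, 0.5]), mean, basis)
```

Here the added component lies outside the clean subspace, so the residual recovers it; a DAE plays the same role with a non-linear manifold instead of a linear subspace.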

In this paper, a bottom-up, activation-based paradigm for continuous speech recognition is described. Speech is described by co-occurrence statistics of acoustic events over an analysis window of variable length, leading to a vectorial representation of high but fixed dimension called the "Histogram of Acoustic Co-occurrence" (HAC). During training, recurring patterns are discovered and associated with words through non-negative matrix factorisation. During testing, word activations are computed from the HAC representation...

10.21437/interspeech.2008-633 article EN Interspeech 2008-09-22
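The fixed-dimension property of the HAC representation can be illustrated with a minimal co-occurrence counter over a discrete event sequence: count ordered pairs of event labels within a lag window, giving a vector whose size depends only on the number of event types, not on utterance length. The lag window and event discretization here are simplifying assumptions.

```python
import numpy as np

def hac(events, n_types, max_lag=5):
    """Histogram of Acoustic Co-occurrence: count ordered pairs of event
    labels co-occurring within max_lag positions, yielding a fixed-size
    vector regardless of sequence length. A simplified sketch."""
    h = np.zeros((n_types, n_types))
    for i, a in enumerate(events):
        for b in events[i + 1 : i + 1 + max_lag]:
            h[a, b] += 1
    return h.ravel()

vec = hac([0, 1, 0], n_types=2, max_lag=1)
```

Because the vector size is `n_types ** 2` for any input length, HAC vectors from utterances of different durations can be stacked into the matrix that NMF then factorises.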

Motivated by the success of i-vectors in the field of speaker recognition, this paper proposes a new approach for age estimation from telephone speech patterns based on i-vectors. In this method, each utterance is modeled by its corresponding i-vector. Then, Support Vector Regression (SVR) is applied to estimate the age of the speakers. The proposed method is trained and tested on conversations from the National Institute of Standards and Technology (NIST) 2010 and 2008 Speaker Recognition Evaluation databases. Evaluation results show that it outperforms...

10.21437/interspeech.2012-169 article EN Interspeech 2012-09-09
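The regression step, mapping one fixed-length vector per utterance to an age, can be sketched as follows. Ordinary least squares stands in here for the paper's Support Vector Regression (both learn a linear map from i-vector to age), and the 5-dimensional "i-vectors" and linear age relation are synthetic assumptions; real i-vectors are typically several hundred dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 utterances, each a 5-dim "i-vector" with a
# linear dependence of age on the first coordinate.
ivecs = rng.normal(size=(200, 5))
ages = 30 + 10 * ivecs[:, 0] + rng.normal(0, 1, 200)

A = np.hstack([ivecs, np.ones((200, 1))])       # add a bias column
coef, *_ = np.linalg.lstsq(A, ages, rcond=None)  # OLS stand-in for SVR

def predict_age(ivec):
    """Predict age from a single i-vector using the fitted linear map."""
    return np.append(ivec, 1.0) @ coef

train_mae = np.mean(np.abs(A @ coef - ages))
```

SVR differs mainly in its epsilon-insensitive loss and optional kernels; the utterance-to-vector-to-regressor pipeline is the same.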

Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-computer interfaces. In this context, so far mainly linear models have been used. However, the decoding performance of a linear model is limited due to the complex and highly non-linear nature of auditory processing in the human brain. We present a novel Long Short-Term Memory (LSTM)-based architecture as a nonlinear model for the classification problem of whether...

10.1109/icassp40776.2020.9054000 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Learning a set of tasks in sequence remains a challenge for artificial neural networks, which, in such scenarios, tend to suffer from Catastrophic Forgetting (CF). The same applies to End-to-End (E2E) Automatic Speech Recognition (ASR) models, even for monolingual tasks. In this paper, we aim to overcome CF for E2E ASR by inserting adapters, small architectures with few parameters which allow a general model to be fine-tuned to a specific task, into our model. We make these adapters task-specific, while regularizing the...

10.1109/icassp49357.2023.10095837 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05
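A typical adapter is a small bottleneck block with a residual connection, initialized so it starts as the identity and only gradually deviates during task-specific fine-tuning. The numpy sketch below shows that structure; the dimensions, ReLU non-linearity, and zero up-projection initialization are common conventions assumed here, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Adapter:
    """Bottleneck adapter: down-project, non-linearity, up-project, plus a
    residual connection. Illustrative forward pass only (no training)."""
    def __init__(self, dim, bottleneck):
        self.down = rng.normal(0.0, 0.02, (dim, bottleneck))
        self.up = np.zeros((bottleneck, dim))   # zero init => identity map

    def __call__(self, h):
        # residual + small learned correction
        return h + np.maximum(h @ self.down, 0.0) @ self.up

adapter = Adapter(dim=16, bottleneck=4)
h = rng.normal(size=(2, 16))                    # a batch of hidden states
out = adapter(h)
```

Because only the adapter's few parameters are updated per task, the shared E2E ASR backbone stays frozen, which is what limits catastrophic forgetting.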

Abstract INTRODUCTION The automated analysis of connected speech using natural language processing (NLP) emerges as a possible biomarker for Alzheimer's disease (AD). However, it remains unclear which types of speech are most sensitive and specific for the detection of AD. METHODS We applied an NLP model to automatically transcribed speech from 114 Flemish-speaking individuals, first to distinguish early AD patients from amyloid-negative cognitively unimpaired (CU) individuals, and then amyloid-positive from amyloid-negative CU individuals, on five different types of speech. RESULTS The model was able to distinguish between...

10.1002/alz.14530 article EN cc-by-nc-nd Alzheimer's & Dementia 2025-01-27

The recent advancement of speech recognition technology has been driven by large-scale datasets and attention-based architectures, but many challenges still remain, especially for low-resource languages and dialects. This paper explores the integration of weakly supervised transcripts from TV subtitles into automatic speech recognition (ASR) systems, aiming to improve both verbatim transcriptions and automatically generated subtitles. To this end, the two types of data are regarded as different domains or languages, due to their distinct...

10.48550/arxiv.2502.03212 preprint EN arXiv (Cornell University) 2025-02-05

Automatic speech recognition (ASR) systems often struggle to recognize speech from individuals with dysarthria, a disorder with neuromuscular causes, with accuracy declining further for unseen speakers and content. Achieving robustness in such situations requires ASR systems to address speaker-independent and vocabulary-mismatched scenarios, minimizing user adaptation effort. This study focuses on comprehensive training strategies and methods to tackle these challenges, leveraging the transformer-based Wav2Vec2.0 model. Unlike prior...

10.3390/app15042006 article EN cc-by Applied Sciences 2025-02-14

10.1109/icassp49660.2025.10887592 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

We present a Character-Word Long Short-Term Memory Language Model which reduces both the perplexity with respect to a baseline word-level language model and the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By concatenating word and character embeddings, we achieve up to 2.77% relative improvement on English compared to a word-level model with a similar amount of parameters, and 4.57% on Dutch. Moreover, also...

10.18653/v1/e17-1040 article EN cc-by 2017-01-01
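The concatenation of word and character embeddings can be sketched minimally as below. Mean-pooling the character embeddings is a simplifying assumption for illustration; the paper composes characters with learned parameters inside the LSTM language model, and the vocabularies and dimensions here are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
WORD_DIM, CHAR_DIM = 8, 4
word_emb = {"cat": rng.normal(size=WORD_DIM)}               # tiny word table
char_emb = {c: rng.normal(size=CHAR_DIM)
            for c in "abcdefghijklmnopqrstuvwxyz"}          # character table

def embed(word):
    """Concatenate a word embedding with a mean-pooled character embedding.
    OOV words get a zero word part, so the character part still carries
    structural information about the unseen word."""
    w = word_emb.get(word, np.zeros(WORD_DIM))
    c = np.mean([char_emb[ch] for ch in word], axis=0)
    return np.concatenate([w, c])

in_vocab = embed("cat")     # word + char information
oov = embed("dog")          # char information only; word part is zeros
```

The combined vector feeds the LSTM in place of a plain word embedding, which is why out-of-vocabulary and infrequent words benefit most.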