NFDI4DS | UHH-SEMS - Publication Details

Vassilios Digalakis

ORCID: 0000-0002-1255-8939

OpenAlex author ID: A5030929323

Research Areas

Speech Recognition and Synthesis
Speech and Audio Processing
Music and Audio Processing
Natural Language Processing Techniques
Speech and dialogue systems
Neural Networks and Applications
Advanced Data Compression Techniques
Bayesian Methods and Mixture Models
Medical Image Segmentation Techniques
Target Tracking and Data Fusion in Sensor Networks
Topic Modeling
Algorithms and Data Compression
Blind Source Separation Techniques
Cerebrovascular and Carotid Artery Diseases
Phonetics and Phonology Research
Privacy-Preserving Technologies in Data
Stochastic Gradient Optimization Techniques
Advanced Image Processing Techniques
Time Series Analysis and Forecasting
Bayesian Modeling and Causal Inference
Flow Measurement and Analysis
Embedded Systems Design Techniques
Heat Transfer and Numerical Methods
Advanced Vision and Imaging
Control Systems and Identification

Technical University of Crete
2000-2016

First Technical University
2006

Boston University
1991-2005

SRI International
1992-2003

University of Geneva
2003

Menlo School
1993-2002

University of Crete
1999-2002

From HMM's to segment models: a unified view of stochastic modeling for speech recognition

OPENALEX - Publications

Mari Ostendorf, Vassilios Digalakis, Owen Kimball

Many alternative models have been proposed to address some of the shortcomings of the hidden Markov model (HMM), which is currently the most popular approach to speech recognition. In particular, a variety of models that could be broadly classified as segment models have been described for representing a variable-length sequence of observation vectors in speech recognition applications. Since there are many aspects in common between these approaches, including the general recognition and training problems, it is useful to consider them in a unified framework. The paper...

10.1109/89.536930 article EN IEEE Transactions on Speech and Audio Processing 1996-01-01

Speaker adaptation using constrained estimation of Gaussian mixtures

Vassilios Digalakis, Dimitry Rtischev, Leonardo Neumeyer

A trend in automatic speech recognition systems is the use of continuous mixture-density hidden Markov models (HMMs). Despite the good recognition performance that these systems achieve on average in large vocabulary applications, there is a large variability in performance across speakers. Performance degrades dramatically when the user is radically different from the training population. A popular technique that can improve the performance and robustness of a speech recognition system is adapting speech models to the speaker, and more generally to the channel and the task. In continuous mixture-density HMMs the number of component densities is typically very large, and it may...

10.1109/89.466659 article EN IEEE Transactions on Speech and Audio Processing 1995-01-01

ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition

Vassilios Digalakis, Ján Rohlíček, Mari Ostendorf

A nontraditional approach to the problem of estimating the parameters of a stochastic linear system is presented. The method is based on the expectation-maximization algorithm and can be considered as the continuous analog of the Baum-Welch estimation algorithm for hidden Markov models. The algorithm is used for training the parameters of a dynamical system model that is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and it is shown how the evolution of the dynamics as a function of the segment length can be modeled using alternative...

10.1109/89.242489 article EN IEEE Transactions on Speech and Audio Processing 1993-01-01

Automatic scoring of pronunciation quality

Leonardo Neumeyer, Horacio Franco, Vassilios Digalakis, M. Weintraub

10.1016/s0167-6393(99)00046-1 article EN Speech Communication 2000-02-01

Genones: generalized mixture tying in continuous hidden Markov model-based speech recognizers

Vassilios Digalakis, Peter Monaco, Hy Murveit

An algorithm is proposed that achieves a good tradeoff between modeling resolution and robustness by using a new, general scheme for tying of mixture components in continuous mixture-density hidden Markov model (HMM)-based speech recognizers. The sets of HMM states that share the same mixture components are determined automatically using agglomerative clustering techniques. Experimental results on ARPA's Wall Street Journal corpus show that this scheme reduces errors by 25% over typical tied-mixture systems. New fast algorithms for computing...

10.1109/89.506931 article EN IEEE Transactions on Speech and Audio Processing 1996-07-01

Large-vocabulary dictation using SRI's DECIPHER speech recognition system: progressive search techniques

Hy Murveit, John Butzberger, Vassilios Digalakis, M. Weintraub

The authors describe a technique called progressive search which is useful for developing and implementing speech recognition systems with high computational requirements. The scheme iteratively uses more and more complex recognition schemes, where each iteration constrains the search space of the next. An algorithm called the forward-backward word-life algorithm is described. It can generate a word lattice in a progressive search that would be used as a language model embedded in a succeeding recognition pass to reduce computation requirements. It is shown that speed-ups of more than an order of magnitude are achievable with only...

10.1109/icassp.1993.319301 article EN IEEE International Conference on Acoustics Speech and Signal Processing 1993-01-01

Combination of machine scores for automatic grading of pronunciation quality

Horacio Franco, Leonardo Neumeyer, Vassilios Digalakis, Orith Ronen

10.1016/s0167-6393(99)00045-x article EN Speech Communication 2000-02-01

Speaker adaptation using combined transformation and Bayesian methods

Vassilios Digalakis, Leonardo Neumeyer

Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities....

10.1109/89.506933 article EN IEEE Transactions on Speech and Audio Processing 1996-07-01

Quantization of cepstral parameters for speech recognition over the World Wide Web

Vassilios Digalakis, Leonardo Neumeyer, Manolis Perakakis

We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web (WWW). We compare a server-only processing model, where the client encodes and transmits the speech signal to the server, to a model where the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Internet. We follow a novel encoding paradigm, trying to maximize recognition performance instead of perceptual reproduction, and we find that by transmitting the cepstral coefficients we can achieve significantly higher recognition performance at a fraction of the bit rate required when encoding the speech signal directly....

10.1109/49.743698 article EN IEEE Journal on Selected Areas in Communications 1999-01-01

The Spoken Language Translator

Manny Rayner, David Carter, Pierrette Bouillon, Vassilios Digalakis, Mats Wirén

This original volume describes the Spoken Language Translator (SLT), one of the first major automatic speech translation projects. The SLT system can translate between English, French, and Swedish in the domain of air travel planning, using a vocabulary of about 1500 words, and with an accuracy of about 75%. The authors detail the language processing components, largely built on top of the SRI Core Language Engine, using a combination of general grammars and techniques that allow them to be rapidly customized to specific domains. They base speech recognition on Hidden...

10.1162/089120101300346840 article EN Computational Linguistics 2001-03-01

Genones: optimizing the degree of mixture tying in a large vocabulary hidden Markov model based speech recognizer

Vassilios Digalakis, Hy Murveit

We propose a scheme that improves the robustness of continuous HMM systems that use mixture observation densities by sharing the same mixture components among different HMM states. The sets of HMM states that share the same mixture components are determined automatically using agglomerative clustering techniques. Experimental results on the Wall Street Journal corpus show that our new form of output distributions achieves a 25% reduction in error rate over typical tied-mixture systems.

10.1109/icassp.1994.389212 article EN 2002-12-17

A comparative study of speaker adaptation techniques

Leonardo Neumeyer, Ananth Sankar, Vassilios Digalakis

10.21437/eurospeech.1995-282 article EN 1995-09-18

Combining knowledge sources to reorder N-best speech hypothesis lists

Manny Rayner, David Carter, Vassilios Digalakis, Patti Price

A simple and general method is described that can combine different knowledge sources to reorder N-best lists of hypotheses produced by a speech recognizer. The method is automatically trainable, acquiring information from both positive and negative examples. In experiments, the method was tested on a 1000-utterance sample of unseen ATIS data.

10.3115/1075812.1075858 article EN 1994-01-01

Automatic pronunciation evaluation of foreign speakers using unknown text

N. Moustroufas, Vassilios Digalakis

10.1016/j.csl.2006.04.001 article EN Computer Speech & Language 2006-06-07

A dynamical system approach to continuous speech recognition

Vassilios Digalakis, Ján Rohlíček, Mari Ostendorf

A dynamical system model is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and two alternative assumptions regarding the relationship of the segment length and the evolution of the dynamics are considered. Training is equivalent to the identification of a stochastic linear system, and a nontraditional approach based on the estimate-maximize algorithm is followed. This model is evaluated on a phoneme classification task using the TIMIT database. It is shown that the performance obtained...

10.1109/icassp.1991.150334 article EN 1991-01-01

Efficient speech recognition using subvector quantization and discrete-mixture HMMs

Vassilios Digalakis, Stavros Tsakalidis, Costas Harizakis, Leonardo Neumeyer

10.1006/csla.1999.0134 article EN Computer Speech & Language 2000-01-01

Fast algorithms for phone classification and recognition using segment-based models

Vassilios Digalakis, Mari Ostendorf, Ján Rohlíček

Methods for reducing the computation requirements of joint segmentation and recognition of phones using the stochastic segment model are presented. The approach uses a fast segment classification method that reduces computation by a factor of two to four, depending on the confidence of choosing the most probable model. A split-and-merge segmentation algorithm is proposed as an alternative to the typical dynamic programming solution of the segmentation and recognition problem, with computation savings increasing proportionally with model complexity. Although the current recognizer uses context-independent phone models,...

10.1109/78.175733 article EN IEEE Transactions on Signal Processing 1992-01-01

Online adaptation of hidden Markov models using incremental estimation algorithms

Vassilios Digalakis

The mismatch that frequently occurs between the training and testing conditions of an automatic speech recognizer can be efficiently reduced by adapting the parameters of the recognizer to the testing conditions. Two measures that characterize the performance of an adaptation algorithm are the speed with which it adapts to the new conditions, and its computational complexity, which is important for online applications. A family of adaptation algorithms for continuous-density hidden Markov model (HMM) based speech recognizers have appeared that are based on constrained reestimation of the distribution...

10.1109/89.759031 article EN IEEE Transactions on Speech and Audio Processing 1999-05-01

Maximum-likelihood stochastic-transformation adaptation of hidden Markov models

Vassilios Diakoloukas, Vassilios Digalakis

The recognition accuracy in previous large vocabulary automatic speech recognition (ASR) systems is highly related to the existing mismatch between the training and testing sets. For example, dialect differences across the training and testing speakers result in a significant degradation in recognition performance. Some popular adaptation approaches improve the recognition performance of speech recognizers based on hidden Markov models with continuous mixture densities by using linear transformations to adapt the means, and possibly the covariances, of the mixture Gaussians. The linear assumption, however, is too...

10.1109/89.748122 article EN IEEE Transactions on Speech and Audio Processing 1999-03-01

Development of dialect-specific speech recognizers using adaptation methods

Vassilios Diakoloukas, Vassilios Digalakis, Leonardo Neumeyer, Jaan Kaja

Several adaptation approaches have been proposed in an effort to improve the speech recognition performance in mismatched conditions. However, the application of these approaches had been mostly constrained to the speaker or channel adaptation tasks. We first investigate the effect of mismatched dialects between training and testing speakers in an automatic speech recognition (ASR) system. We find that a mismatch in dialects significantly influences the recognition accuracy. Consequently, we apply several adaptation approaches to develop a dialect-specific recognition system using a dialect-dependent system trained on a different dialect and a small number...

10.1109/icassp.1997.596223 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-11-22

Speaker adaptation using combined transformation and Bayesian methods

Vassilios Digalakis, Leonardo Neumeyer

The performance and robustness of a speech recognition system can be improved by adapting the speech models to the speaker, the channel and the task. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a large amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we propose a constrained estimation technique for Gaussian mixture densities, and combine it with Bayesian techniques to improve its asymptotic properties. We evaluate...

10.1109/icassp.1995.479785 article EN International Conference on Acoustics, Speech, and Signal Processing 2002-11-19

Training data clustering for improved speech recognition

Ananth Sankar, Françoise Beaufays, Vassilios Digalakis

10.21437/eurospeech.1995-134 article EN 1995-09-18

Rapid speech recognizer adaptation to new speakers

Vassilios Digalakis, H. Grady Collier, Steven A. Berkowitz, Adrian Corduneanu, Enrico Bocchieri, and 5 more

This paper summarizes the work of the "Rapid Speech Recognizer Adaptation" team in the workshop held at Johns Hopkins University in the summer of 1998. The project addressed the modeling of dependencies between units of speech with the goal of making more effective use of small amounts of data for speaker adaptation. A variety of methods were investigated and their effectiveness in a rapid adaptation task defined on the Switchboard conversational speech corpus is reported.

10.1109/icassp.1999.759781 article EN 1999-01-01
