Vassilios Digalakis

ORCID: 0000-0002-1255-8939
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Neural Networks and Applications
  • Advanced Data Compression Techniques
  • Bayesian Methods and Mixture Models
  • Medical Image Segmentation Techniques
  • Target Tracking and Data Fusion in Sensor Networks
  • Topic Modeling
  • Algorithms and Data Compression
  • Blind Source Separation Techniques
  • Cerebrovascular and Carotid Artery Diseases
  • Phonetics and Phonology Research
  • Privacy-Preserving Technologies in Data
  • Stochastic Gradient Optimization Techniques
  • Advanced Image Processing Techniques
  • Time Series Analysis and Forecasting
  • Bayesian Modeling and Causal Inference
  • Flow Measurement and Analysis
  • Embedded Systems Design Techniques
  • Heat Transfer and Numerical Methods
  • Advanced Vision and Imaging
  • Control Systems and Identification

Technical University of Crete
2000-2016

First Technical University
2006

Boston University
1991-2005

SRI International
1992-2003

University of Geneva
2003

Menlo School
1993-2002

University of Crete
1999-2002

Many alternative models have been proposed to address some of the shortcomings of the hidden Markov model (HMM), which is currently the most popular approach to speech recognition. In particular, a variety of models that could be broadly classified as segment models have been described for representing a variable-length sequence of observation vectors in speech recognition applications. Since there are many aspects in common between these approaches, including general recognition and training problems, it is useful to consider them in a unified framework. The paper...

10.1109/89.536930 article EN IEEE Transactions on Speech and Audio Processing 1996-01-01

A trend in automatic speech recognition systems is the use of continuous mixture-density hidden Markov models (HMMs). Despite the good performance that these systems achieve on average in large vocabulary applications, there is a large variability in performance across speakers. Performance degrades dramatically when the user is radically different from the training population. A popular technique that can improve the performance and robustness of a speech recognition system is adapting the models to the speaker, and more generally to the channel and the task. In continuous mixture-density HMMs the number of component densities is typically very large, and it may...

10.1109/89.466659 article EN IEEE Transactions on Speech and Audio Processing 1995-01-01

A nontraditional approach to the problem of estimating the parameters of a stochastic linear system is presented. The method is based on the expectation-maximization algorithm and can be considered as the continuous analog of the Baum-Welch estimation algorithm for hidden Markov models. The algorithm is used for training the parameters of a dynamical system model that is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and it is shown how the evolution of the dynamics as a function of the segment length can be modeled using alternative...

10.1109/89.242489 article EN IEEE Transactions on Speech and Audio Processing 1993-01-01
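
For reference, a stochastic linear system of the kind this abstract describes is commonly written in state-space form. The block below is a generic textbook sketch, not the paper's own notation: the symbols $x_k$ (hidden state), $y_k$ (observed feature vector), $F$, $H$, $Q$ and $R$ are assumptions introduced here for illustration.

```latex
\begin{aligned}
x_{k+1} &= F\,x_k + w_k, & w_k &\sim \mathcal{N}(0, Q),\\
y_k     &= H\,x_k + v_k, & v_k &\sim \mathcal{N}(0, R).
\end{aligned}
```

In an expectation-maximization scheme for such a model, the E-step would typically compute smoothed estimates of the hidden states $x_k$ given the observations, and the M-step would re-estimate $F$, $H$, $Q$ and $R$ from those estimates, in the same way that Baum-Welch alternates between state posteriors and parameter updates for a discrete-state HMM.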

An algorithm is proposed that achieves a good tradeoff between modeling resolution and robustness by using a new, general scheme for tying of mixture components in continuous mixture-density hidden Markov model (HMM)-based speech recognizers. The sets of HMM states that share the same mixture components are determined automatically using agglomerative clustering techniques. Experimental results on ARPA's Wall Street Journal corpus show that this scheme reduces errors by 25% over typical tied-mixture systems. New fast algorithms for computing...

10.1109/89.506931 article EN IEEE Transactions on Speech and Audio Processing 1996-07-01

The authors describe a technique called progressive search which is useful for developing and implementing speech recognition systems with high computational requirements. The scheme iteratively uses more and more complex recognition schemes, where each iteration constrains the search space of the next. An algorithm called the forward-backward word-life algorithm is described. It can generate a word lattice in a progressive search that would be used as a language model embedded in a succeeding recognition pass to reduce computation requirements. It is shown that speed-ups of more than an order of magnitude are achievable with only...

10.1109/icassp.1993.319301 article EN IEEE International Conference on Acoustics Speech and Signal Processing 1993-01-01

Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities....

10.1109/89.506933 article EN IEEE Transactions on Speech and Audio Processing 1996-07-01

We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web (WWW). We compare a server-only processing model, where the client encodes and transmits the speech signal to the server, to a model where the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Internet. We follow a novel encoding paradigm, trying to maximize recognition performance instead of perceptual reproduction, and we find that by transmitting the cepstral coefficients we can achieve significantly higher recognition performance at a fraction of the bit rate required when encoding the speech signal directly....

10.1109/49.743698 article EN IEEE Journal on Selected Areas in Communications 1999-01-01

This original volume describes the Spoken Language Translator (SLT), one of the first major automatic speech translation projects. The SLT system can translate between English, French, and Swedish in the domain of air travel planning, using a vocabulary of about 1500 words, and with an accuracy of about 75%. The authors detail the language processing components, largely built on top of the SRI Core Language Engine, using a combination of general grammars and techniques that allow them to be rapidly customized to specific domains. They base speech recognition on Hidden...

10.1162/089120101300346840 article EN Computational Linguistics 2001-03-01

We propose a scheme that improves the robustness of continuous HMM systems that use mixture observation densities by sharing the same mixture components among different HMM states. The sets of HMM states that share the same mixture components are determined automatically using agglomerative clustering techniques. Experimental results on the Wall Street Journal corpus show that our new form of output distributions achieves a 25% reduction in error rate over typical tied-mixture systems.

10.1109/icassp.1994.389212 article EN 2002-12-17

A simple and general method is described that can combine different knowledge sources to reorder N-best lists of hypotheses produced by a speech recognizer. The method is automatically trainable, acquiring information from both positive and negative examples. In experiments, the method was tested on a 1000-utterance sample of unseen ATIS data.

10.3115/1075812.1075858 article EN 1994-01-01
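
To make the idea of reordering an N-best list concrete, here is a minimal sketch that assumes each hypothesis carries one score per knowledge source and that the scores are combined by a weighted sum. The weighted-sum rule, the source names `acoustic` and `language`, the example sentences, and the weights are all hypothetical; the abstract does not say how the method combines or trains its knowledge sources.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    text: str
    # Hypothetical per-source scores (higher is better), e.g. log-probabilities.
    scores: Dict[str, float]


def combined_score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    """Weighted sum of knowledge-source scores (an assumed combination rule)."""
    return sum(weights.get(name, 0.0) * value for name, value in hyp.scores.items())


def rerank(nbest: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    """Return the N-best list sorted from best to worst combined score."""
    return sorted(nbest, key=lambda h: combined_score(h, weights), reverse=True)


# Made-up two-entry N-best list for illustration only.
nbest = [
    Hypothesis("show flights to boston", {"acoustic": -10.0, "language": -4.0}),
    Hypothesis("show flights to austin", {"acoustic": -9.5, "language": -6.0}),
]

with_language = rerank(nbest, {"acoustic": 1.0, "language": 1.0})
acoustic_only = rerank(nbest, {"acoustic": 1.0, "language": 0.0})
```

With both sources weighted equally, the first hypothesis ranks on top; with the language score ignored, the order flips. A trainable system would fit the weights from examples of correct and incorrect hypotheses rather than set them by hand.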

A dynamical system model is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and two alternative assumptions regarding the relationship of the segment length and the evolution of the dynamics are considered. Training is equivalent to the identification of a stochastic linear system, and a nontraditional approach based on the estimate-maximize algorithm is followed. This model is evaluated on a phoneme classification task using the TIMIT database. It is shown that the performance obtained...

10.1109/icassp.1991.150334 article EN 1991-01-01

Methods for reducing the computation requirements of joint segmentation and recognition of phones using the stochastic segment model are presented. The approach uses a fast segment classification method that reduces computation by a factor of two to four, depending on the confidence of choosing the most probable model. A split-and-merge segmentation algorithm is proposed as an alternative to the typical dynamic programming solution of the segmentation and recognition problem, with computation savings increasing proportionally with model complexity. Although the current recognizer uses context-independent phone models,...

10.1109/78.175733 article EN IEEE Transactions on Signal Processing 1992-01-01

The mismatch that frequently occurs between the training and testing conditions of an automatic speech recognizer can be efficiently reduced by adapting the parameters of the recognizer to the testing conditions. Two measures that characterize the performance of an adaptation algorithm are the speed with which it adapts to the new conditions, and its computational complexity, which is important for online applications. A family of adaptation algorithms for continuous-density hidden Markov model (HMM) based speech recognizers have appeared that are based on constrained reestimation of the distribution...

10.1109/89.759031 article EN IEEE Transactions on Speech and Audio Processing 1999-05-01

The recognition accuracy in previous large vocabulary automatic speech recognition (ASR) systems is highly related to the existing mismatch between the training and testing sets. For example, dialect differences across the training and testing speakers result in a significant degradation in recognition performance. Some popular adaptation approaches improve the recognition performance of speech recognizers based on hidden Markov models with continuous mixture densities by using linear transformations to adapt the means, and possibly the covariances, of the mixture Gaussians. The linear assumption, however, is too...

10.1109/89.748122 article EN IEEE Transactions on Speech and Audio Processing 1999-03-01
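
As an illustration of what adapting Gaussian means and covariances with a linear transformation can look like, the sketch below applies a per-dimension affine map to a diagonal-covariance Gaussian. The diagonal restriction, the function name `adapt_diagonal_gaussian`, and the `scale`/`offset` parameters are assumptions made here for simplicity, not details taken from the abstract. It relies only on the standard identity that if x has mean m and variance v, then a·x + b has mean a·m + b and variance a²·v.

```python
from typing import List, Tuple


def adapt_diagonal_gaussian(
    mean: List[float],
    var: List[float],
    scale: List[float],
    offset: List[float],
) -> Tuple[List[float], List[float]]:
    """Apply x -> scale * x + offset per dimension to a diagonal Gaussian.

    Returns the transformed (mean, variance) vectors.
    """
    # Mean follows the affine map directly.
    new_mean = [a * m + b for a, m, b in zip(scale, mean, offset)]
    # Variance is scaled by the square of the multiplicative term.
    new_var = [a * a * v for a, v in zip(scale, var)]
    return new_mean, new_var


# Made-up numbers for a two-dimensional Gaussian.
adapted_mean, adapted_var = adapt_diagonal_gaussian(
    mean=[1.0, 2.0],
    var=[1.0, 4.0],
    scale=[2.0, 0.5],
    offset=[0.5, -1.0],
)
```

In an adaptation setting the transform would be shared across many Gaussians and estimated from a small amount of adaptation data, which is what keeps the number of free parameters small.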

Several adaptation approaches have been proposed in an effort to improve the speech recognition performance in mismatched conditions. However, the application of these approaches had been mostly constrained to the speaker or channel adaptation tasks. We first investigate the effect of mismatched dialects between training and testing speakers in an automatic speech recognition (ASR) system. We find that a mismatch in dialects significantly influences the recognition accuracy. Consequently, we apply several adaptation approaches to develop a dialect-specific recognition system using a dialect-dependent system trained on a different dialect and a small number...

10.1109/icassp.1997.596223 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2002-11-22

The performance and robustness of a speech recognition system can be improved by adapting the speech models to the speaker, the channel and the task. In continuous mixture-density hidden Markov models the number of component densities is typically very large, and it may not be feasible to acquire a large amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we propose a constrained estimation technique for Gaussian mixture densities, and combine it with Bayesian techniques to improve its asymptotic properties. We evaluate...

10.1109/icassp.1995.479785 article EN International Conference on Acoustics, Speech, and Signal Processing 2002-11-19

This paper summarizes the work of the "Rapid Speech Recognizer Adaptation" team in the workshop held at Johns Hopkins University in the summer of 1998. The project addressed the modeling of dependencies between units of speech with the goal of making more effective use of small amounts of data for speaker adaptation. A variety of methods were investigated and their effectiveness in a rapid adaptation task defined on the Switchboard conversational speech corpus is reported.

10.1109/icassp.1999.759781 article EN 1999-01-01