Novel Long-Term Information Based Language Identification
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
DOI: 10.4028/www.scientific.net/amr.655-657.1805
Publication Date: 2013-03-11T17:07:18Z
AUTHORS (4)
ABSTRACT
This paper presents a novel long-term information feature for language identification, called the shifted cepstra curve (SCC). Long-term information is information that spans multiple frames, and it is commonly used in language identification systems. For instance, in the parallel phone recognition language model (PPRLM), the feature vector contains not only information spanning multiple frames but also linguistic knowledge [1]. However, the high computational cost of modeling linguistic n-grams may preclude its use in tasks that demand low memory. By contrast, experiments have shown that linguistics-independent long-term information can also achieve high performance in language identification at a much lower computing cost. For instance, shifted delta cepstra (SDC) are employed as long-term information features to build GMM-based language identification systems, which can achieve performance comparable or even superior to PPRLM [2-4]. However, although the SDC feature can model dynamic information across multiple frames, its approximation is not precise enough for language identification. This paper therefore presents a new method for extracting linguistics-independent long-term information using a curve-fitting algorithm. Experiments show that language identification systems based on the new feature outperform systems based on short-term features as well as systems based on SDC features.
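To make the contrast between SDC and a curve-fitting feature concrete, the following is a minimal Python sketch, not the paper's implementation. The first function computes SDC in its commonly used N-d-P-k form, where each block is the difference c(t + iP + d) - c(t + iP - d). The second function, `polyfit_trajectory_features`, is purely hypothetical: the abstract does not define SCC, so fitting a least-squares polynomial to each cepstral coefficient's shifted trajectory is only one assumed way a curve-fitting feature might look. The parameter defaults (d=1, P=3, k=7, order=2) and the clamping of indices at utterance edges are likewise assumptions.

```python
import numpy as np


def shifted_delta_cepstra(cepstra, d=1, P=3, k=7):
    """Stack k delta blocks per frame: c(t + i*P + d) - c(t + i*P - d).

    cepstra: array of shape (T, N), one cepstral vector per frame.
    Returns an array of shape (T, N * k).
    Edge frames are handled by clamping indices (an assumption).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    T = cepstra.shape[0]
    t = np.arange(T)
    blocks = []
    for i in range(k):
        ahead = np.clip(t + i * P + d, 0, T - 1)
        behind = np.clip(t + i * P - d, 0, T - 1)
        blocks.append(cepstra[ahead] - cepstra[behind])
    return np.hstack(blocks)


def polyfit_trajectory_features(cepstra, P=3, k=7, order=2):
    """Hypothetical curve-fitting feature (NOT the paper's SCC definition).

    For each frame t, the k shifted vectors c(t + i*P), i = 0..k-1, are
    treated as samples of a curve, and a least-squares polynomial of the
    given order is fitted to each cepstral coefficient separately.
    Returns the polynomial coefficients, shape (T, (order + 1) * N),
    ordered from the highest power to the constant term.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    T, N = cepstra.shape
    offsets = np.arange(k) * P
    # Least-squares solve via the pseudo-inverse of the Vandermonde matrix.
    pinv = np.linalg.pinv(np.vander(offsets, order + 1))  # (order+1, k)
    idx = np.clip(np.arange(T)[:, None] + offsets[None, :], 0, T - 1)
    trajectories = cepstra[idx]  # (T, k, N)
    coeffs = np.einsum("ok,tkn->ton", pinv, trajectories)
    return coeffs.reshape(T, (order + 1) * N)
```

With 7 cepstral coefficients per frame, the SDC function yields 49 dimensions under the 7-1-3-7 configuration, while the hypothetical polynomial feature yields 21 dimensions for a second-order fit. Whether the paper's SCC uses polynomials, this order, or this sampling scheme is not stated in the abstract.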
REFERENCES (8)