- Speech and Audio Processing
- Music and Audio Processing
- Emotion and Mood Recognition
- Speech and dialogue systems
- Speech Recognition and Synthesis
- Handwritten Text Recognition Techniques
- Hand Gesture Recognition Systems
- Natural Language Processing Techniques
- Human Pose and Action Recognition
- Semiconductor Lasers and Optical Devices
- Video Analysis and Summarization
- Music Technology and Sound Studies
- Mathematics, Computing, and Information Processing
- Radio Frequency Integrated Circuit Design
- Advancements in PLL and VCO Technologies
- Infant Health and Development
- Usability and User Interface Design
- Gaze Tracking and Assistive Technology
- Gait Recognition and Analysis
- Advanced Photonic Communication Systems
- Interactive and Immersive Displays
- Photonic and Optical Devices
- Image Retrieval and Classification Techniques
- Face and Expression Recognition
- Augmented Reality Applications
New York Hospital Queens
2024
NewYork–Presbyterian Hospital
2024
Friedrich-Alexander-Universität Erlangen-Nürnberg
2022
Technical University of Munich
1992-2009
University of Toronto
2005
Fraunhofer Institute for Applied Solid State Physics
1995-2004
Ludwig-Maximilians-Universität München
2002-2003
Fraunhofer Society
1999
Siemens (Germany)
1988
In this contribution we introduce speech emotion recognition by use of continuous hidden Markov models. Two methods are proposed and compared throughout the paper. In the first method, a global statistics framework of an utterance is classified by Gaussian mixture models using features derived from the raw pitch and energy contours of the speech signal. A second method introduces increased temporal complexity by applying continuous hidden Markov models with several states, using low-level instantaneous features instead of global statistics. The paper addresses the design of working engines...
In this paper we introduce a novel approach to the combination of acoustic features and language information for a most robust automatic recognition of a speaker's emotion. Seven discrete emotional states are classified throughout the work. First, a model for the recognition of emotion by acoustic features is presented. The features derived from the signal, pitch, energy, and spectral contours are ranked by their quantitative contribution to the estimation of an emotion. Several different classification methods, including linear classifiers, Gaussian mixture models, neural nets, support...
We introduce speech emotion recognition by use of continuous hidden Markov models. Two methods are proposed and compared. In the first method, a global statistics framework of an utterance is classified by Gaussian mixture models using features derived from the raw pitch and energy contours of the speech signal. A second method introduces increased temporal complexity by applying continuous hidden Markov models with several states, using low-level instantaneous features instead of global statistics. The paper addresses the design of working engines, and the results achieved with respect to...
Emotion recognition is becoming an important factor in future media retrieval and man-machine interfaces. However, even human deciders often have trouble recognizing another person's emotion, especially that of strangers. In this work we strive to recognize emotion independently of the person, concentrating on the speech channel. The relevance of single acoustic features is a critical point, which we address by filter-based gain ratio calculation starting from a basis of 276 features. As optimization of a minimum set as a whole...
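The filter-based gain ratio ranking mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes features that have already been discretized into a few bins (real acoustic features are continuous), and the names `entropy`, `gain_ratio`, and `rank_features` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the feature value they occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Class entropy that remains after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes features that take many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(features, labels):
    """Return feature names ordered from most to least relevant."""
    return sorted(features, key=lambda name: gain_ratio(features[name], labels), reverse=True)


# Toy data (assumed, not from the paper): two binned features, two emotion labels.
labels = ["anger", "anger", "neutral", "neutral"]
features = {
    "pitch_range_bin": ["high", "high", "low", "low"],
    "pause_count_bin": ["few", "many", "few", "many"],
}
print(rank_features(features, labels))
```

A filter of this kind scores each feature on its own, so it is cheap to compute but cannot account for redundancy between features.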
We suggest a novel approach to affect recognition based on acoustic and linguistic analysis of spoken utterances. In order to achieve maximum discrimination power within a robust integration of these information sources, fusion at the feature level is introduced. Considering classification, we use meta-classifiers, such as StackingC and Boosting, for stabilized performance, and the combination of classifiers in ensembles. Extensive comparisons of diverse base-classifiers, including support vector machines, neural networks,...
An efficient on-line recognition system for symbols within handwritten mathematical expressions is proposed. The system is based on the generation of a symbol hypotheses net and the classification of the elements of the net. The final classification is done by calculating the most probable path through the net, taking into account the stroke group probabilities and the probabilities obtained by a recognizer based on hidden Markov models.
This paper discusses innovative techniques to automatically estimate a user's emotional state by analyzing the speech signal and haptic interaction on a touch-screen or via mouse. The knowledge of a user's emotion permits adaptive strategies striving for a more natural and robust interaction. We classify seven emotional states: surprise, joy, anger, fear, disgust, sadness, and a neutral user state. The emotion is extracted by a parallel stochastic analysis of the user's spoken and haptic machine interactions while understanding the desired intention. The introduced...
An efficient system for structural analysis of handwritten mathematical expressions is proposed. To handle the problems caused by handwriting, this system is based on a soft-decision approach. This means that alternative solutions are generated during the analysis process if the relation between two symbols within the expression is ambiguous. Finally, a string containing this information is syntactically verified for each alternative. Strings failing this verification are considered invalid.
The integration of more and more functionality into the human machine interface (HMI) of vehicles increases the complexity of device handling. Thus optimal use of different sensory channels is an approach to simplify the interaction with in-car devices. In this way, user convenience may increase as much as distraction may decrease. In this paper a video based real-time hand gesture recognition system for in-car use is presented. It was developed in the course of extensive usability studies. In combination with an optimized HMI it allows intuitive and effective...
A new and general stochastic approach to find and identify dynamic gestures in continuous video streams is presented. Hidden Markov models (HMMs) are used to solve this combined problem of temporal segmentation and classification in an integral way. Basically, an improved normalized Viterbi algorithm allows one to continuously observe the output scores of the HMMs at every time step. Characteristic peaks in the respective output scores indicate the presence of gestures. Our experiments in the domain of hand gesture spotting provided excellent recognition...
Electrical mitigation of intersymbol interference (ISI) generated by polarisation mode dispersion (PMD) and receiver bandwidth, using an analogue decision feedback loop, is demonstrated for 10 Gbit/s NRZ signals. ISI caused by first-order PMD of up to 120 ps differential group delay was equalised. Error-free recovery with completely closed eye diagrams was achieved.
A soft-decision approach for symbol segmentation within on-line sampled handwritten mathematical expressions is presented. Based on stroke-specific features as well as geometrical features between the strokes, a symbol hypotheses net is generated. For assistance, additional knowledge obtained by a symbol prerecognition stage is used. The experiments and the results achieved indicate the performance of our approach.
A key requirement for the correct interpretation of high-resolution X-ray spectra is that transition energies are known with high accuracy and precision. We investigate the K-shell features of Ne, CO$_2$, and SF$_6$ gases by measuring their photo ion-yield at the BESSY II synchrotron facility simultaneously with the 1s-np fluorescence emission of He-like ions produced in the Polar-X EBIT. Accurate ab initio calculations of the transitions in these ions provide the basis of the calibration. While the CO$_2$ result agrees well with previous measurements,...
This paper is concerned with the symbol segmentation and recognition task in the context of online sampled handwritten mathematical expressions, the first processing stage of an overall system for understanding arithmetic formulas. Within our system a statistical approach is used, tolerating ambiguities within the decision stages and resolving them either automatically by additional knowledge acquired in the following stages or by interaction with the user. The results obtained with different writers and expressions demonstrate the performance of our approach.
A novel optoelectronic receiver chip for a data rate of 2.5 Gbit/s has been developed and tested. It integrates a metal-semiconductor-metal photodiode with a GaAs HEMT transimpedance amplifier, a high gain amplifier, and a limiting output buffer which is able to drive a 50 Ω load. A special feature of the chip is that it comprises a very large photodiode of 300 µm diameter, eliminating the need for expensive fibre alignment. Measurements reveal that the chip achieves the required sensitivity of –15.7 dBm at a bit error rate of 10⁻⁹.
In this work we strive to find an optimal set of acoustic features for the discrimination of speech, monophonic singing, and polyphonic music, in order to robustly segment media streams for annotation and interaction purposes. Furthermore, we introduce ensemble-based classification approaches within this task. From a basis of 276 attributes we select the most efficient set by SVM-SFFS. Additionally, the relevance of single features by calculation of information gain ratio is presented. As a comparison we reduce dimensionality by PCA. We show extensive analysis...
Continuous hand gesture recognition requires the detection of gestures in a video stream and their classification. In this paper two continuous recognition solutions using hidden Markov models (HMMs) are compared. The first approach uses a motion detection algorithm to isolate gesture candidates, followed by an HMM recognition step. The second approach is a single-stage, HMM-based spotting method improved by a new implicit duration modeling. Both strategies have been tested on continuous video data containing 41 different types of gestures embedded in random motion. The data has been derived from...
Music retrieval methods have recently attracted interest due to the increasing size of music databases, e.g. on the Internet. Among different query methods, content-based retrieval that analyzes intrinsic characteristics of the media source seems to offer the most intuitive access. The key melody of a song can be regarded as a major characteristic and leads to querying by humming or singing. In this paper we turn our attention to both the features and the matching algorithm for audio music retrieval. Current approaches advocate the use of dynamic time warping for the matching process. As reference...
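Dynamic time warping, which this abstract names as the common matching technique, can be sketched as follows. This is the textbook algorithm, not the method of the paper. The pitch representation (MIDI note numbers), the absolute-difference local cost, and the names `dtw_distance` and `rank_melodies` are assumptions made for illustration.

```python
def dtw_distance(query, reference):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(query), len(reference)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning
    # the first i query items with the first j reference items.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # Allowed steps: advance the query, the reference, or both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def rank_melodies(query, database):
    """Return melody names sorted by DTW distance to the query, closest first."""
    return sorted(database, key=lambda name: dtw_distance(query, database[name]))


# Toy data (assumed): pitch contours as MIDI note numbers.
hummed = [60, 62, 64, 65, 67]
database = {
    "descending": [67, 65, 60, 60, 72],
    "scale": [60, 62, 62, 64, 65, 67],  # same tune, one note held longer
}
print(rank_melodies(hummed, database))
```

Because the warping path may repeat elements, a hummed query still matches its reference when individual notes are held for different durations, which is the main reason the technique suits sung or hummed input.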
We introduce a novel approach to human emotion recognition, based on manual computer interaction. The presented methods rely on conventional graphical input devices. Firstly, a standard mouse as used with desktop PCs, and, secondly, the interaction with touch-screens or -pads in public information terminals, palm-top devices, or tablet PCs is considered. Additionally, the gain of the integration of touch pressure is evaluated. Four discrete emotional states are classified: irritation, annoyance, reflectiveness, and...
We present a novel multi-modal access to large MP3 music databases. Retrieval can be fulfilled either in a content-based manner or by keywords. As input modalities, speech by natural language utterances or singing, and manual interaction by handwriting, typing, or hardkeys are used. In order to achieve especially robust retrieval results and to automatically suggest music to the user, contextual knowledge of the time, date, season, user emotion, and listening habits is integrated in the retrieval process. The system communicates with visual...