- Speech Recognition and Synthesis
- Speech and Audio Processing
- Natural Language Processing Techniques
- Speech and dialogue systems
- Music and Audio Processing
- Topic Modeling
- Infant Health and Development
- Advanced Data Compression Techniques
- Muscle activation and electromyography studies
- Hand Gesture Recognition Systems
- Multi-Agent Systems and Negotiation
- Silicon Carbide Semiconductor Technologies
- Advanced Adaptive Filtering Techniques
- Digital Media Forensic Detection
- VLSI and Analog Circuit Testing
- Blind Source Separation Techniques
- Algorithms and Data Compression
- Neural Networks and Applications
- Advanced Malware Detection Techniques
- Allergic Rhinitis and Sensitization
- EEG and Brain-Computer Interfaces
- Network Security and Intrusion Detection
- Advanced Chemical Sensor Technologies
- Advancements in Semiconductor Devices and Circuit Design
- Parallel Computing and Optimization Techniques
Universitatea Națională de Știință și Tehnologie Politehnica București
2016-2025
Institute of Electronics
2019
Information Technology University
2010
Studies have shown that newborns cry differently depending on their need: hunger, tiredness, discomfort, eructation, pain, and so on. Skilled persons such as pediatricians can distinguish between the different types of newborn cries and consequently estimate the baby's need by using the sounds and gestures produced by the baby. However, this is a real problem for unskilled parents who would like to answer the baby's needs promptly. Recently, research work has been invested into developing automated methods to detect...
Kaldi NNET3 is at the moment the leading speech recognition toolkit on many well-known tasks such as LibriSpeech, TED-LIUM or TIMIT. Several versions of the time-delay neural network (TDNN) architecture were recently proposed, implemented and evaluated for acoustic modeling with Kaldi: plain TDNN, convolutional TDNN (CNN-TDNN), long short-term memory TDNN (TDNN-LSTM) and TDNN-LSTM with attention. To the best of our knowledge, this is the first paper to describe in depth these various TDNN flavors, providing details regarding...
A natural evolution of applications that analyze speech is to improve their robustness in multi-speaker environments. Humans use selective auditory attention and can easily switch focus from one source to another, even when listening to a single-channel recording with overlapped speech. The same brain feature allows us to detect the number of simultaneously active sources. In order to quantify the human level of performance for this task, we have designed a perception study that evaluates the participants' ability to accurately...
Hand gesture recognition has numerous applications in medical (e.g., prosthetics), engineering (e.g., robot manipulation) and even military research areas (e.g., UAV control applications). This paper proposes a fast and accurate method to identify hand gesture categories based on electromyographic (EMG) signals registered by a commercial sensor (the Myo Armband developed by Ontario-based Thalmic Labs), which is placed on the user's forearm. The proposed method is based on the extraction of time-domain features and a neural network architecture to perform...
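The truncated abstract above mentions time-domain EMG features without listing them. As a rough illustration only, the sketch below computes four features that are commonly used for EMG windows (mean absolute value, root mean square, waveform length, zero crossings). The choice of features, the function name `time_domain_features`, and the sample values are assumptions made here, not details taken from the paper.

```python
import math
from typing import Dict, Sequence


def time_domain_features(window: Sequence[float]) -> Dict[str, float]:
    """Compute four common time-domain features for one EMG channel window.

    The feature set is an assumption for illustration; the paper's actual
    features are not given in the abstract.
    """
    if len(window) < 2:
        raise ValueError("window must contain at least two samples")
    n = len(window)
    pairs = list(zip(window, window[1:]))  # consecutive sample pairs
    mav = sum(abs(x) for x in window) / n            # mean absolute value
    rms = math.sqrt(sum(x * x for x in window) / n)  # root mean square
    wl = sum(abs(b - a) for a, b in pairs)           # waveform length
    zc = sum(1 for a, b in pairs if a * b < 0)       # strict sign changes
    return {"mav": mav, "rms": rms, "wl": wl, "zc": float(zc)}


# Hypothetical samples, not real sensor data.
features = time_domain_features([1.0, -1.0, 2.0, -2.0])
print(features)
```

In a multi-channel setup, one such feature vector would typically be computed per channel and per window, then concatenated before being passed to a classifier.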
A new general method for the calculation of thin-film-distributed resistive structures is presented. This method is based on the conformal mapping of the analyzed domain onto the complex upper half-plane and the determination of the potential function as the solution of a boundary-value Volterra problem. Although the method inherently applies only to single-connected domains with known representations on the half-plane, it may be extended also to other symmetrical multiple-connected configurations in all cases when the positions of the conducting terminals are known....
Studies have shown that there are different types of cries depending on the newborns' need, such as hunger, tiredness, discomfort and so on. Neonatologists or pediatricians can distinguish between the different types of cries and find a pattern in each type of cry. Unfortunately, this is a real problem for parents who want to act as fast as possible to comfort the newborn. In this paper, we propose a fully automatic system that attempts to discriminate between different types of cries. The baby cry classification system is based on Gaussian Mixture Models and i-vectors. The evaluation is performed on an audio...
This paper presents the main improvements brought recently to the SpeeD automatic speech recognition system. Several aspects, such as speech and text resources acquisition, noise-robust features and feature transforms, are discussed. All the updates in our ASR system are accompanied by experimental results illustrating significant improvements: between 30% and 35% relative WER reductions for various case studies (read/spontaneous speech, noisy/clean speech). In the last part of the paper, the SpeeD ASR system is also compared with Google's ASR system in a brief...
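A relative WER reduction, as quoted in the abstract above, is usually computed as the drop in WER divided by the baseline WER. The sketch below shows that arithmetic. The function name and the WER values are hypothetical and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only (not taken from the paper):
# a baseline WER of 20.0% improved to 13.5%.
reduction = relative_wer_reduction(baseline_wer=20.0, new_wer=13.5)
print(f"{reduction:.1%}")  # prints 32.5%
```

Note that this differs from the absolute reduction, which in the same hypothetical case would be 6.5 percentage points.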
This study investigates the use of machine-translated text for ASR domain adaptation. The proposed methodology is applicable when domain-specific data is available in language X only, whereas the goal is to develop a system in language Y. Two semi-supervised methods are introduced and compared with a fully unsupervised approach, which represents the baseline. While both types of approach make it possible to quickly develop an accurate ASR system, the semi-supervised methods outperform the unsupervised one by 10% to 29% relative, depending on the amount of human post-processed data available. An in-depth...
This paper presents the main improvements brought recently to the large-vocabulary continuous speech recognition (LVCSR) system for the Romanian language developed by the Speech and Dialogue (SpeeD) research laboratory. While the most important improvement consists in the use of DNN-based acoustic models instead of the classic HMM-GMM approach, several other aspects are discussed in the paper: a significant increase of the training speech corpus, additional algorithms for feature processing, speaker adaptive training, discriminative training and,...
As human beings, we begin interacting with the world by expressing our basic needs through crying. Parents strive to identify and address these needs in a timely manner, before hysterical crying sets in. However, first-time parents usually fail, and this leads to frustration and feelings of helplessness. In this context, our work focuses on creating an automatic system able to distinguish between different infant needs based on crying. We extract various paralinguistic features from the baby-cry audio signals and train various rule-based or statistical classifiers....
Lightly supervised acoustic modeling in under-resourced languages raises new issues due to the poor accuracy of the Automatic Speech Recognition (ASR) systems for such languages and the poor quality of the speech transcriptions that may be found. In these conditions, the common alignment techniques are not always capable of aligning the ASR output and the approximate transcription. We propose two methods to overcome these issues. In the first approach we apply an image processing algorithm on the matching matrix of the two texts to be aligned, while the second approach is based on segmental...
Pollen allergies are a cause of much suffering for an increasing number of individuals. Current pollen monitoring techniques are lacking due to their reliance on manual counting and classification of pollen by human technicians. In this study, we present a neural network architecture capable of distinguishing pollen species using data from an automated particle measurement device. This work presents an improvement over the current state of the art in the task of pollen classification, using fluorescence spectrum data of aerosol particles. We obtained relative...
Technological evolution in the remote sensing domain has allowed the acquisition of large archives of satellite image time series (SITS) for Earth Observation. In this context, the need to interpret Earth Observation image time series is continuously increasing, and the extraction of information from these archives has become difficult without adequate tools. In this paper, we propose a fast and effective two-step technique for the retrieval of spatio-temporal patterns that are similar to a given query. The method is based on a query-by-example procedure whose inputs are provided by...