- Speech and Audio Processing
- Speech Recognition and Synthesis
- Music and Audio Processing
- Advanced Adaptive Filtering Techniques
- Advanced Steganography and Watermarking Techniques
- Digital Media Forensic Detection
- Hand Gesture Recognition Systems
- Hearing Loss and Rehabilitation
- Chaos-based Image/Signal Encryption
- Video Coding and Compression Technologies
- Advanced Data Compression Techniques
- Advanced Software Engineering Methodologies
- Human Pose and Action Recognition
- Multimedia Communication and Technology
- Software Engineering Research
- Advanced Vision and Imaging
- Spectroscopy and Laser Applications
- Image and Object Detection Techniques
- Atmospheric Chemistry and Aerosols
- Wireless Communication Networks Research
- Natural Language Processing Techniques
- Manufacturing Process and Optimization
- Infant Health and Development
- Medical Image Segmentation Techniques
- Model-Driven Software Engineering Techniques
Computer Research Institute of Montréal
2011-2024
United States Military Academy
2024
Auburn University
2021
King Khalid University
2019-2020
University of Louisville
2015-2018
Powerlink Queensland (Australia)
2016
UNSW Sydney
2012-2015
University of Canberra
2013-2015
Deakin University
2014
UNSW Canberra
2012-2013
In this paper, we describe recent progress in i-vector based speaker verification. The use of universal background models (UBM) with full-covariance matrices is suggested and thoroughly experimentally tested. The i-vectors are scored using a simple cosine distance and advanced techniques such as Probabilistic Linear Discriminant Analysis (PLDA) and its heavy-tailed variant (PLDA-HT). Finally, we investigate dimensionality reduction of i-vectors before entering the PLDA-HT modeling. The results are very competitive: on...
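The cosine-distance scoring mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the plain Python lists standing in for i-vectors, and the decision threshold are all hypothetical, and the PLDA variants are not covered.

```python
import math


def cosine_score(enroll, test):
    """Return the cosine similarity between two i-vectors (plain sequences of floats)."""
    if len(enroll) != len(test):
        raise ValueError("i-vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(enroll, test))
    norm = math.sqrt(sum(a * a for a in enroll)) * math.sqrt(sum(b * b for b in test))
    if norm == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / norm


def accept(enroll, test, threshold=0.5):
    """Accept the trial when the score reaches the threshold (0.5 is an arbitrary assumption)."""
    return cosine_score(enroll, test) >= threshold
```

The score depends only on the angle between the two vectors, so rescaling either i-vector leaves the decision unchanged.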
Illegal distribution of a digital movie is a significant threat to the film industry. With the advent of high-speed broadband Internet access, a pirated copy of a digital video can be easily distributed to a global audience. Digital watermarking is a possible means of limiting this type of distribution. In existing methods, the watermark is usually embedded into the luminance channel of each frame, which affects imperceptibility. In addition, none of the existing techniques are robust to a combination of commonly used attacks, such as compression, upscaling, rotation,...
The popularity of 3D video is increasing daily due to the availability of low-cost televisions and high-speed Internet access. However, 3D content can currently be distributed illegally without any protection. For views generated using a depth-image-based rendering technique, not only can the left and right views be distributed as 3D content, but also the center, left, or right view individually as 2D content. As digital watermarking is a possible way of protecting these views from unauthorized distribution, in this paper, we propose a watermarking method for depth-image-based rendered 3D video. In...
The automatic speaker verification spoofing and countermeasures challenge 2015 provides a common framework for the evaluation of countermeasures or anti-spoofing techniques in the presence of various seen and unseen attacks. This contribution proposes a system consisting of amplitude, phase, linear prediction residual, and combined amplitude-phase-based detection. In this task, we use the following features: Mel-frequency cepstral coefficients (MFCC), product spectrum-based coefficients, modified group delay, weighted residual...
Hand segmentation is often the first step in applications such as gesture recognition, hand tracking and recognition. We propose a new technique for hand segmentation of color images using an adaptive skin color model. Our method captures pixel values of a person's hand and converts them into the YCbCr color space. The method will then map the CbCr color space to the CbCr plane to construct a clustered region of skin color for the person. Edge detection is applied to the cluster in order to create adaptive skin color boundaries for classification. Experimental results demonstrate successful segmentation over a variety of hand variations in color,...
We discuss the limitations of the i-vector representation of speech segments in speaker recognition and explain how Joint Factor Analysis (JFA) can serve as an alternative feature extractor in a variety of ways. Building on the work of Zhao and Dong, we implemented a variational Bayes treatment of JFA which accommodates adaptation of universal background models (UBMs) in a natural way. This allows us to experiment with several types of features for speaker recognition: factors and diagonal factors in addition to i-vectors, extracted with and without UBM adaptation in each case....
The piracy of a digital movie is a significant problem for movie studios and producers but can be prevented by digital video watermarking. In existing watermarking algorithms, robustness to several attacks on the watermark has been improved. However, none of these techniques are robust to a combination of the common geometric distortions of scaling, rotation, cropping and downscaling in resolution with other attacks such as compression. In this paper, a blind video watermarking algorithm is proposed where the watermark is embedded into the singular values of the dual-tree complex wavelet...
The amount of unauthorized distribution of 3D video is increasing day by day due to the availability of high speed Internet and low cost TV. Note that not only can both the left and right views generated using depth-image-based rendering be distributed as 3D content, but also the centre, left, or right view individually as 2D content. Video watermarking is a possible way to protect against this type of illegal distribution. In this paper, we propose a digital watermarking method for depth-image-based rendered video that covers each of the centre, left, and right views. In this method, the watermark is embedded into the centre view using the dual-tree complex...
Due to the increasing use of fusion in speaker recognition systems, one trend of current research activity focuses on new features that capture complementary information to the MFCC (Mel-frequency cepstral coefficients) for improving speaker recognition performance. The goal of this work is to combine (or fuse) amplitude and phase-based features to improve speaker verification performance. Based on the amplitude and phase spectra, we investigate some possible variations to the extraction of cepstral coefficients that produce diversity with respect to the fused subsystems. Among the amplitude-based features, we consider widely...
We reformulate joint factor analysis so that it can serve as a feature extractor for text-dependent speaker recognition. The new formulation is based on left-to-right modeling with tied mixture HMMs and is designed to deal with problems such as the inadequacy of subspace methods in modeling speaker-phrase variability, UBM mismatches that arise as a result of variable phonetic content, and the need to exploit text-independent resources. We pass the extracted features to a trainable backend which plays a role analogous to that of PLDA in the i-vector/PLDA cascade...
Accurate hand segmentation is a challenging task in computer vision applications. We propose a new method to segment the hand based on a free-form skin color model. The pixel values of a person's hand are captured and represented in the YCbCr color space. The CbCr color space is mapped to the CbCr plane in order to produce a clustered region of skin color. Then, instead of using an ellipse to model the skin color, edge detection is performed to construct the free-form skin color model. The method, tested on various complex backgrounds, gives promising results.
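The color-space step in the abstract above can be illustrated with a short sketch. Several assumptions apply: the abstract does not say which YCbCr variant is used, so the full-range ITU-R BT.601 (JFIF) equations are assumed here; all function names are hypothetical; and a plain set of rounded (Cb, Cr) cells stands in for the edge-detected free-form boundary, which this sketch does not reproduce.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range BT.601 YCbCr (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def chroma_cell(pixel):
    """Project an RGB pixel onto the CbCr plane, discarding luminance."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return round(cb), round(cr)


def build_skin_model(sample_pixels):
    """Collect the CbCr cells covered by sampled skin pixels (stand-in for the clustered region)."""
    return {chroma_cell(p) for p in sample_pixels}


def is_skin(model, pixel):
    """Classify a pixel as skin when its CbCr cell belongs to the model."""
    return chroma_cell(pixel) in model
```

Dropping Y before classification is what makes such a model less sensitive to brightness changes: pixels that differ mainly in luminance fall into nearby CbCr cells.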
Unauthorized redistribution of a movie is a common threat to digital media that can be prevented by video watermarking. The watermark is commonly embedded into the luminance (Y) component of a frame. The chrominance (U) component supports more distortion than Y without being perceived by human eyes. Thus, in our proposed approach, the watermark is embedded into the U component of each frame of the video sequence using the dual-tree complex wavelet transform (DT CWT). This approach aims to provide a perceptually invisible watermark and a high quality watermarked video. The detection performed original...
The REVERB challenge provides a common framework for the evaluation of feature extraction techniques in the presence of both reverberation and additive background noise. State-of-the-art speech recognition systems perform well in controlled environments, but their performance degrades in realistic acoustical conditions, especially in real as well as simulated reverberant environments. In this contribution, we utilize multiple feature extractors including the conventional mel-filterbank, multi-taper spectrum estimation-based...
Experimental results and the latest standards have proved that segmentation based video coding systems can outperform traditional block-based systems. However, this approach requires simultaneous estimation of both the shape and motion of moving objects in a scene. In most cases neither the shape nor the motion is known initially. Another critical aspect of this tightly-coupled relationship is that inaccurate shape estimation may cause poor motion estimation, and erroneous motion may negatively impact shape estimation. While some existing approaches require user intervention or use clues such...
The drug discovery process increasingly relies on high-throughput sample analysis to accelerate the identification of viable candidates. Recently, chromatographic-free high-throughput mass spectrometry (HT-MS) technologies have emerged, significantly increasing readout speed and enabling the analysis of large sample sets. These HT-MS platforms continuously acquire data from various samples into a single file, presenting challenges in applying distinctive acquisition methods to specific samples. This study introduces a novel approach...
Experimental results and the latest standards have proved that video coding systems with the ability to adapt the size and shape of the motion estimation area to the objects in the scene can outperform traditional block-based systems. In this paper, a segmentation-based strategy that employs bi-directional hints for interframe prediction is proposed. The appealing thing about these hints is that they are continuous and invertible, even though the observed field for a frame will be discontinuous and non-invertible. The proposed scheme outperforms rate-distortion...
The goal of speech emotion recognition (SER) is to identify the emotional or physical state of a human being from his or her voice. One of the most important things in an SER task is to extract and select relevant speech features with which most emotions could be recognized. In this paper, we present a smoothed nonlinear energy operator (SNEO)-based amplitude modulation cepstral coefficients (AMCC) feature for recognizing emotions from speech signals. SNEO estimates the energy required to produce the AM-FM signal, and then the estimated energy is separated into its frequency...
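A smoothed nonlinear energy operator of the kind named in the abstract above can be sketched as follows. The abstract does not give the operator or the smoothing window, so this sketch assumes the discrete Teager-Kaiser form, psi[n] = x[n]^2 - x[n-1] * x[n+1], followed by a normalized triangular window; the function names and the window length are hypothetical.

```python
import math


def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def triangular_window(length):
    """Symmetric triangular weights normalized to sum to one (assumed smoothing window)."""
    if length < 1 or length % 2 == 0:
        raise ValueError("window length must be a positive odd integer")
    mid = (length - 1) / 2
    weights = [1.0 - abs(n - mid) / (mid + 1) for n in range(length)]
    total = sum(weights)
    return [w / total for w in weights]


def smoothed_neo(x, win_len=5):
    """Smooth the Teager-Kaiser energy with the window, keeping only fully covered samples."""
    psi = teager_kaiser(x)
    window = triangular_window(win_len)
    half = win_len // 2
    return [
        sum(window[k] * psi[n - half + k] for k in range(win_len))
        for n in range(half, len(psi) - half)
    ]
```

For a pure sinusoid cos(w n), the Teager-Kaiser energy equals sin(w)^2 at every sample, so the smoothed output is constant; for speech, the smoothing mainly suppresses the operator's sensitivity to noise.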
Piracy of a digital movie is a significant threat for movie studios and producers. Digital video watermarking is an important technique that can be used to protect the content. In existing watermarking algorithms, robustness to several attacks on the watermark has been improved. However, none of the existing techniques are robust to a combination of the common geometric distortions of scaling, rotation, and cropping with other attacks. In this paper, we propose a blind video watermarking algorithm where the watermark is embedded into both chrominance channels using a dual-tree complex wavelet...
Due to the availability of high speed online streaming sites, a pirated copy of a digital video can be easily distributed to a global audience. This paper proposes a watermarking technique based on the dual-tree complex wavelet transform that can protect this content. In this scheme, the watermark is embedded into the chrominance channel of the frames to provide a high quality watermarked video. The watermark is detectable without reference to the original content, which makes the method robust to temporal synchronization attacks such as frame dropping and rate...
This paper presents robust feature extractors for a continuous speech recognition task in matched and mismatched environments. The mismatched conditions may occur due to additive noise, a different channel, and acoustic reverberation. In the conventional Mel-frequency cepstral coefficient (MFCC) feature extraction framework, a subband spectrum enhancement technique is incorporated to improve its robustness. We denote this front-end as robust MFCCs (RMFCC). Based on the gammatone and compressive gammachirp filter-banks, filterbank...