- Speech and Audio Processing
- Blind Source Separation Techniques
- Domain Adaptation and Few-Shot Learning
- Physical Unclonable Functions (PUFs) and Hardware Security
- Privacy-Preserving Technologies in Data
- Industrial Vision Systems and Defect Detection
- Speech Recognition and Synthesis
- Face recognition and analysis
- Writing and Handwriting Education
- Remote-Sensing Image Classification
- Metal Forming Simulation Techniques
- Reading and Literacy Development
- Cognitive and developmental aspects of mathematical skills
- Advanced Neural Network Applications
- Indoor and Outdoor Localization Technologies
- Fatigue and fracture mechanics
- Numerical methods in engineering
- Image Retrieval and Classification Techniques
- Adversarial Robustness in Machine Learning
- Infant Health and Development
- Advanced Image and Video Retrieval Techniques
University of Hong Kong, 2024
Shanghai Jiao Tong University, 1990-2024
University Town of Shenzhen, 2024
Tsinghua University, 2024
City University of Hong Kong, 2006
High-speed and accurate methods for chip-surface-defect detection remain a challenge in the semiconductor industry. Therefore, we propose the Feature Fusion Data Generation based Cascade (FFDG-Cascade) approach. This method cascades a classification module with an object detection module. The classifier screens out non-defective samples with high confidence, significantly reducing the number of samples forwarded to the detector and substantially enhancing efficiency due to classifiers’ higher operational speed than detectors. We...
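The cascade idea in this abstract can be illustrated with a minimal sketch. It assumes a fast classifier that returns the probability that a sample is defect-free and a slower detector that returns bounding boxes. The names `cascade_inspect`, `toy_classifier`, `toy_detector`, and the `ok_threshold` value are hypothetical stand-ins for illustration, not the paper's models or settings.

```python
from typing import Callable, List, Sequence, Tuple

# All names below are hypothetical; they are not taken from the FFDG-Cascade paper.
Image = List[List[float]]            # toy grayscale image, values in [0, 1]
Box = Tuple[int, int, int, int]      # (row_min, col_min, row_max, col_max)


def cascade_inspect(
    images: Sequence[Image],
    classifier: Callable[[Image], float],
    detector: Callable[[Image], List[Box]],
    ok_threshold: float = 0.9,
) -> Tuple[List[List[Box]], int]:
    """Return per-image defect boxes and the number of detector calls."""
    results: List[List[Box]] = []
    detector_calls = 0
    for image in images:
        p_ok = classifier(image)          # cheap stage: probability of "no defect"
        if p_ok >= ok_threshold:
            results.append([])            # screened out, detector is skipped
            continue
        detector_calls += 1               # expensive stage runs only on the rest
        results.append(detector(image))
    return results, detector_calls


# Toy stand-ins so the sketch runs end to end (assumptions, not real models).
def toy_classifier(image: Image) -> float:
    brightest = max(max(row) for row in image)
    return 1.0 - brightest                # bright pixels lower the "OK" score


def toy_detector(image: Image) -> List[Box]:
    hits = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v > 0.5]
    if not hits:
        return []
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return [(min(rows), min(cols), max(rows), max(cols))]


clean = [[0.0, 0.05], [0.0, 0.0]]
scratched = [[0.0, 0.9], [0.0, 0.8]]
boxes, calls = cascade_inspect([clean, scratched, clean], toy_classifier, toy_detector)
```

In this toy run the detector is invoked only for the one image the classifier does not confidently pass, which is the source of the efficiency gain the abstract describes.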
It is well known that visual cues of lip movement contain important speech relevant information. This paper presents an automatic lipreading system for small vocabulary recognition tasks. Using the segmentation and modeling techniques we developed earlier, we obtain a feature vector composed of outer and inner mouth features from the image sequence for recognition. A spline representation is employed to transform the discrete-time sampled features from the video frames into the continuous domain. The coefficients in the same word class are...
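As a rough illustration of moving a frame-sampled feature into the continuous domain, the sketch below fits a natural cubic spline through one feature track sampled at unit frame spacing and evaluates it at arbitrary times. The natural boundary condition, the unit spacing, and the function names are assumptions made for this example; the abstract does not specify which spline formulation the paper uses.

```python
from typing import List, Sequence


def natural_spline_second_derivs(y: Sequence[float]) -> List[float]:
    """Second derivatives at the knots t = 0, 1, ..., n-1 of a natural cubic spline."""
    n = len(y)
    m = [0.0] * n                          # natural spline: m[0] = m[n-1] = 0
    size = n - 2                           # number of interior unknowns
    if size < 1:
        return m
    c_prime = [0.0] * size
    d_prime = [0.0] * size
    # Thomas algorithm for the tridiagonal system m[i-1] + 4 m[i] + m[i+1] = rhs_i
    for k in range(size):
        i = k + 1
        rhs = 6.0 * (y[i - 1] - 2.0 * y[i] + y[i + 1])
        denom = 4.0 if k == 0 else 4.0 - c_prime[k - 1]
        c_prime[k] = 1.0 / denom
        d_prime[k] = (rhs if k == 0 else rhs - d_prime[k - 1]) / denom
    m[n - 2] = d_prime[size - 1]
    for k in range(size - 2, -1, -1):
        m[k + 1] = d_prime[k] - c_prime[k] * m[k + 2]
    return m


def spline_eval(y: Sequence[float], m: Sequence[float], t: float) -> float:
    """Evaluate the spline at continuous time t, clamped to [0, n-1]."""
    n = len(y)
    if n == 1:
        return float(y[0])
    t = min(max(t, 0.0), float(n - 1))
    i = min(int(t), n - 2)                 # index of the segment containing t
    a = (i + 1) - t
    b = t - i
    return a * y[i] + b * y[i + 1] + ((a**3 - a) * m[i] + (b**3 - b) * m[i + 1]) / 6.0


def resample(y: Sequence[float], num_points: int) -> List[float]:
    """Sample the continuous curve at num_points (>= 2) evenly spaced times."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    m = natural_spline_second_derivs(y)
    last = len(y) - 1
    return [spline_eval(y, m, last * k / (num_points - 1)) for k in range(num_points)]


# Hypothetical mouth-width track over five video frames.
track = [0.0, 1.0, 0.0, 1.0, 0.0]
curve = resample(track, 9)
```

One practical consequence of a continuous representation is that sequences with different frame counts can be resampled to a common length, although whether the paper uses it this way is not stated in the abstract.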
For the image classification task, the color histogram is widely used as an important feature indicating the content of the image. However, high-resolution histograms are usually of high dimension and contain much redundant information which does not relate to the image content, while low-resolution histograms cannot provide adequate discriminative information for classification. In this paper, a new representation is proposed which not only takes the correlation among neighbouring components of the conventional histogram into account but also removes the redundant information as well. A high-resolution,...
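The resolution trade-off described here can be seen in a minimal sketch of a conventional RGB color histogram. This shows only the baseline feature, not the new representation the paper proposes; the function name `color_histogram` and the 8-bit RGB tuple input are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int) -> List[float]:
    """Normalized joint RGB histogram with bins_per_channel**3 components."""
    b = bins_per_channel
    hist = [0.0] * (b ** 3)
    for r, g, bl in pixels:
        # Map each 0..255 channel value to a bin index in 0..b-1.
        ri, gi, bi = (r * b) // 256, (g * b) // 256, (bl * b) // 256
        hist[(ri * b + gi) * b + bi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist] if total else hist


# Hypothetical four-pixel "image": two reds, one blue, one green.
pixels = [(255, 0, 0), (250, 5, 3), (0, 0, 255), (0, 255, 0)]
coarse = color_histogram(pixels, 2)   # 2**3 = 8 components
fine = color_histogram(pixels, 8)     # 8**3 = 512 components
```

Going from 2 to 8 bins per channel raises the feature dimension from 8 to 512 while most components stay empty for this input, which is the kind of redundancy the abstract refers to.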
Handwriting difficulty is a defining feature of Chinese developmental dyslexia (DD) due to the complex structure and dense information contained within compound characters. Despite previous attempts to use deep neural network models to extract handwriting features, the temporal property of writing characters in sequential order during dictation tasks has been neglected. By combining transfer learning of a convolutional neural network (CNN) and positional encoding with a temporal-sequential long short-term memory (LSTM) and attention...
Speech recognition solely based on visual information such as the lip shape and its movement is referred to as lipreading. This paper presents an automatic lipreading technique for speaker dependent (SD) and speaker independent (SI) speech recognition tasks. Since the visual features are derived according to the frame rate of the video sequence, a spline representation is then employed to translate the discrete-time sampled features into the continuous domain. The spline coefficients in the same word class are constrained to have similar expression and can be estimated from the training...
Compared with some "static" biometrics such as human face and fingerprint, person authentication based on lip movement has the advantage of incorporating "dynamic" features which contain rich information indicating the speaker identity. This paper proposes a new lip feature representation and analyzes its discrimination power for person authentication. Since the original features are usually high-dimensional, independent component analysis (ICA) is adopted for dimension reduction and discriminative feature extraction. Hidden Markov model...