- Advanced Vision and Imaging
- Video Coding and Compression Technologies
- Advanced Data Compression Techniques
- Image Enhancement Techniques
- Image and Video Quality Assessment
- Face recognition and analysis
- Image and Signal Denoising Methods
- Face and Expression Recognition
- Advanced Image Processing Techniques
- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Digital Filter Design and Implementation
- Neural Networks and Applications
- Advanced Decision-Making Techniques
- Infrastructure Maintenance and Monitoring
- Numerical Methods and Algorithms
- Advanced Image and Video Retrieval Techniques
- Gaze Tracking and Assistive Technology
- Industrial Vision Systems and Defect Detection
- Advanced Computational Techniques and Applications
- Spectroscopy Techniques in Biomedical and Chemical Research
- Advanced Algorithms and Applications
- Hand Gesture Recognition Systems
- Advanced Wireless Communication Techniques
- Advancements in PLL and VCO Technologies
National Cheng Kung University
2003-2019
Beijing University of Technology
2015
The H.264 advanced video coding (H.264/AVC) standard provides several advanced features such as improved coding efficiency and error robustness for video storage and transmission. In order to improve the coding performance of H.264/AVC, coding control parameters such as group-of-pictures (GOP) sizes should be adaptively adjusted according to different video content variations (VCVs), which can be extracted from the temporal deviation between two consecutive frames. The authors present a simple VCV estimation to design adaptive GOP detection (AGD) and scene change...
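The idea of measuring the temporal deviation between two consecutive frames can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' VCV estimator: the mean absolute difference is used as the deviation measure, and the function names and the threshold of 30.0 are hypothetical.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute luma difference between two equally sized frames (2-D lists)."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def detect_scene_changes(frames, threshold=30.0):
    """Return indices i where frame i differs strongly from frame i - 1.

    The threshold is an assumed tuning value, not one taken from the paper.
    """
    changes = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            changes.append(i)
    return changes
```

An encoder could start a new GOP at each returned index; how the actual design maps the deviation to GOP sizes is not stated in the truncated abstract.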
In this paper, we present efficient recursive architectures for realizing the modified discrete cosine transform (MDCT) and inverse MDCT (IMDCT) required in many audio coding systems. After data rearrangement, the MDCT and IMDCT can be represented as Chebyshev polynomials such that we can efficiently implement them with recursive structures. For verification, we design an ASIC to realize the MDCT and IMDCT. The analyzed results show that the proposed infinite-impulse response (IIR) structures possess the advantages of high efficiency and high throughput rate....
For future 3D TV broadcasting systems and navigation applications, it is necessary to have accurate stereo matching which could precisely estimate the depth map from two distanced cameras. In this paper, we first suggest a trinary cross color (TCC) census transform, which can help to achieve an accurate disparity raw matching cost with low computational cost. The two-pass cost aggregation (TPCA) is formed to compute the aggregation cost, then the disparity map can be obtained by a range winner-take-all (RWTA) process and a white hole filling procedure. To further enhance...
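For readers unfamiliar with census-based matching costs, the following Python sketch shows the classic binary 3×3 census transform with a Hamming-distance cost. It is not the trinary cross color (TCC) variant proposed in the abstract above, whose details are not given here; the function names are hypothetical.

```python
def census_3x3(img):
    """Return 8-bit census codes for a grayscale image given as a 2-D list.

    Each interior pixel gets one bit per neighbour (row-major order, centre
    skipped): 1 if the neighbour is darker than the centre, else 0.
    Border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    codes = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            code = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    bit = 1 if img[y + dy][x + dx] < img[y][x] else 0
                    code = (code << 1) | bit
            codes[y][x] = code
    return codes


def hamming(a, b):
    """Number of differing bits between two census codes (the raw matching cost)."""
    return bin(a ^ b).count("1")
```

Because only the ordering of intensities is encoded, the codes are unchanged by a uniform brightness offset, which is one reason census costs are popular for stereo matching.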
This paper presents recursive architectures for the modified discrete cosine transform (MDCT) and its inverse (IMDCT), which are the most complex operations in layer 3 of the MPEG audio coding standard. By rearranging the input data, we first derive two trigonometric equations, which can be represented as Chebyshev polynomials. Then we demonstrate that general length MDCT and IMDCT can be efficiently implemented by the recursive structure. The computational complexity for each data throughput of these recursive architectures is less than that of existing related systems in many...
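The Chebyshev-polynomial representation mentioned above rests on the recurrence cos((n+1)θ) = 2 cos θ · cos(nθ) − cos((n−1)θ), which lets a cosine-weighted sum be evaluated by a second-order recursive loop with a single multiplier constant. The Python sketch below illustrates that general principle with Clenshaw's recurrence for the sum of x[n]·cos(nθ). It is not the paper's MDCT architecture, and the function names are hypothetical.

```python
import math


def cosine_sum_direct(x, theta):
    """Reference: sum of x[n] * cos(n * theta), computed term by term."""
    return sum(v * math.cos(n * theta) for n, v in enumerate(x))


def cosine_sum_recursive(x, theta):
    """Same sum via Clenshaw's recurrence, built on the Chebyshev relation.

    b[n] = x[n] + 2*cos(theta)*b[n+1] - b[n+2], run from the last sample
    down to n = 0; the result is b[0] - cos(theta)*b[1].
    """
    c = math.cos(theta)
    b1 = 0.0  # holds b[n+1]
    b2 = 0.0  # holds b[n+2]
    for v in reversed(x):
        b0 = v + 2.0 * c * b1 - b2
        b2, b1 = b1, b0
    # After the loop, b1 is b[0] and b2 is b[1].
    return b1 - c * b2
```

Only one cosine value is needed per output, instead of one per input sample, which is the property that makes such recursions attractive for hardware.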
In this paper, a novel class-specific kernel linear regression classification is proposed for face recognition under very low-resolution and severe illumination variation conditions. Since the low-resolution problem coupled with illumination variations makes an ill-posed data distribution, the nonlinear projection rendered by a kernel function would enhance the modeling capability of linear regression for the ill-posed data distribution. The explicit knowledge of the nonlinear mapping can be avoided by using the kernel trick. To reduce redundancy, a low-rank-r approximation is suggested to make feasible...
For smart living applications, personal identification as well as behavior and emotion detection becomes more important in our daily life. For identity classification and facial expression detection, the facial features extracted from face images are the most popular and low-cost information. The face shape in terms of landmarks estimated by a face alignment method can be used for many applications including virtual face animation and real face classification. In this paper, we propose a robust face alignment method based on multi-feature shape regression (MSR), which is...
A successive termination and elimination (STE) method to achieve fast inter mode decision is proposed. The detection starts from residual homogeneous detection, then spatial homogeneous detection is performed for each 16×16 macroblock. For either the residual or the spatial homogeneous case, the authors can directly terminate the inter mode prediction and choose 16×16 as the best mode. For non-homogeneous cases, the authors carry out 8×8 subblock motion estimation. Based on cost analyses of the modes, an elimination method, which could help to remove unlikely 8×16 and 16×8 modes, is also suggested. Similarly, the STE method can be applied to the 8×8 block to decide if...
In this paper, a simple fuzzy-based algorithm to remove the impulse noise from images is proposed. To achieve real-time applications, the proposed filter architecture, which combines fuzzy detection and fuzzy filtering, is also designed. With low computational complexity, simulation results show that the proposed filters effectively remove the impulse noise.
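The combination of fuzzy detection and fuzzy filtering can be illustrated with a minimal Python sketch. The abstract does not specify the membership function or the filter, so everything below is an assumption: a piecewise-linear membership on the distance to the 3×3 median, hypothetical thresholds `low` and `high`, and an output that blends the pixel with the median by that degree.

```python
from statistics import median


def noise_membership(diff, low=10.0, high=40.0):
    """Fuzzy degree in [0, 1] that a pixel is an impulse (assumed shape)."""
    if diff <= low:
        return 0.0
    if diff >= high:
        return 1.0
    return (diff - low) / (high - low)


def fuzzy_impulse_filter(img, low=10.0, high=40.0):
    """Blend each interior pixel with its 3x3 median by the noise degree.

    img is a grayscale image as a 2-D list; border pixels are copied unchanged.
    """
    h, w = len(img), len(img[0])
    out = [[float(p) for p in row] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            med = median(window)
            mu = noise_membership(abs(img[y][x] - med), low, high)
            out[y][x] = (1.0 - mu) * img[y][x] + mu * med
    return out
```

Pixels close to the local median pass through untouched, clear outliers are replaced by the median, and borderline pixels are partially corrected, which avoids the blurring of an unconditional median filter.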
In this paper, we propose an approximate square criterion for H.264/AVC intra mode decision. The sum of square difference (SSD) achieves the best video quality but requires heavy computation due to the square operation. A sum of approximate square difference (SASD) is proposed to maintain the video quality and reduce the computation. By applying the characteristic of the SSD criterion, simulation results show that the rate-distortion performance of the SASD is close to that of the SSD method. For hardware implementation, the synthesized approximate square operation reduces the area cost and timing delay by 75% and 61%, respectively, compared with the square function.
The ACELP method makes use of a multipulse structure to represent the excitation pulses of the residual signal. With the purpose of computational complexity reduction, this paper provides a maximum-take-precedence ACELP (MTP-ACELP) search under acceptable degradation in performance. Because the maximum of the target signal is preferentially compensated, the performance degradation would be diminished. By predicting the locations of the pulses, the computational complexity is reduced. We not only reduce the possible pulse combinations in the search procedure but also avoid the computation of useless...
Model-based video coding has been adopted as a core experiment in the ISO MPEG-4 standard. The clip-and-paste technique for putting objects in line is an important tool to reduce the transmission rate. To assist the clip-and-paste method in fitting objects into a 2-D model, we propose several smoothing algorithms for improving the quality of reconstructed images. In this paper, the proposed algorithms can adjust the deformation from zoom, tilt, and rotation of the object. A luminance smoothing algorithm is also applied to compensate for light source variations. Simulation...
To improve the discriminant nearest feature space analysis (DNFSA) methods [6], in this paper, we propose an improved DNFSA (IDNFSA) algorithm to increase the robustness for variable lighting face recognition. The IDNFSA removes the mean of each image and attempts to minimize the within-class feature space (FS) distance and maximize the between-class FS distance simultaneously. In the IDNFSA, the first n eigenvectors are dropped and a generalized whitening transformation is suggested. In the recognition phase, the projected coefficients are classified by rule...
Hardware designs that can support multiple standards are required for versatile media players. The study proposes a unified inverse transform architecture that can be efficiently used in Moving Picture Experts Group and International Telecommunication Union Telecommunication Standardisation Sector (ITU-T) H.264/advanced video coding (AVC), Microsoft video codec 1 (VC-1) and Chinese Audio Video Coding Standard (AVS) decoders. For H.264/AVC 8-, 4- and 2-point transforms, the computational complexity of the proposed architecture is similar to that defined...
Stereo matching of two distanced cameras and structured-light RGB-D cameras are the common ways to capture the depth map, which conveys the per-pixel depth information of the image. However, the results with mismatched and occluded pixels would not provide accurately well-matched depth and image information. The mismatched depth-image relations seriously degrade the performance of view synthesis in modern-day three-dimensional video applications. Therefore, how to effectively utilize the depth and image information to enhance themselves becomes more important. In this paper, we propose an...
In this paper, a modified bit-rate estimation method is proposed to reduce the computation for 4×4 intra mode decision of the H.264/AVC video encoder. The number of coded bits is modeled by a linear combination of existing coding parameters, which are highly related to the entropy coding of H.264/AVC. Furthermore, to improve the accuracy of the estimation, the scheme is made adaptive to information obtained from previously coded blocks. Compared with the original rate distortion optimized (RDO) encoding process, which needs to calculate the actual encoded bits for each mode, the proposed method can...
In this paper, we discuss an approach for designing the computational neural network, which is mainly composed of a hardlimiter neuron, updated and search function, to solve some problems. The computation-by-search scheme can effectively solve complicated problems on the condition that their functions can be easily obtained by existing networks. The convergence of the suggested networks to achieve the solution is discussed and analyzed. Both theoretical analyses and simulated results show proposed network such they belong...
The authors present an effective spectral envelope (SE) quantisation scheme for parametric speech coders, based on human hearing properties. The variable-dimension SE vector uniformly sampled in frequency is first converted into a fixed, but small, number of nonlinearly spaced bands on the Bark scale. The minimum Bark spectral distortion (BSD) criterion is applied to enable the hearing-based SE vector quantisation (HSEVQ) to quantise the vector, achieving slightly better perceptual quality than the traditional method. A simplified HSEVQ (SSEVQ) is developed...
This paper presents a real-time vision-interactive guiding system, which can interact with users based on computer vision technology. A front-view face detection using Haar-like features is used to decide when the system should wake up and become interactive with the user. After initialization, some feature points within the detected face area are found. Then the orientation of the user's head is estimated via pyramidal Lucas-Kanade optical flow tracking. Compared with traditional guiding systems, our system has more flexibility. Guiding...