- Visual Attention and Saliency Detection
- Image and Video Quality Assessment
- Advanced Image Processing Techniques
- Advanced Vision and Imaging
- Multisensory perception and integration
- Neural Networks and Applications
- Video Surveillance and Tracking Methods
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Optical Systems and Laser Technology
- Advanced SAR Imaging Techniques
- Sport and Mega-Event Impacts
- Gaze Tracking and Assistive Technology
- Media Influence and Health
- Gene expression and cancer classification
- Diverse Approaches in Healthcare and Education Studies
- Hand Gesture Recognition Systems
- Advanced Data Processing Techniques
- Mineral Processing and Grinding
- Color perception and design
- Video Coding and Compression Technologies
- Advanced Image Fusion Techniques
- Face Recognition and Perception
- Fractal and DNA sequence analysis
- Olfactory and Sensory Function Studies
Beihang University
2018-2024
Alibaba Group (United States)
2022
University of Maine
2002
Panoramic video provides an immersive and interactive experience by enabling humans to control the field of view (FoV) through head movement (HM). Thus, HM plays a key role in modeling human attention on panoramic video. This paper establishes a database collecting subjects' HM sequences. From this database, we find that the HM data are highly consistent across subjects. Furthermore, deep reinforcement learning (DRL) can be applied to predict HM positions, via maximizing the reward of imitating scanpaths through the agent's...
Salient object ranking (SOR) aims to segment salient objects in an image and simultaneously predict their saliency rankings, according to the shifted human attention over different objects. The existing SOR approaches mainly focus on object-based attention, e.g., the semantics and appearance of an object. However, we find that scene context plays a vital role in SOR, in which the saliency ranking of the same object varies a lot at different scenes. In this paper, we thus make the first attempt towards explicitly learning scene context for SOR. Specifically, we establish a large-scale...
Saliency prediction in traditional images and videos has drawn extensive research interest in recent years. Few works have been proposed for saliency prediction over 360° videos. They focus on directly predicting fixations over the whole panorama. When viewing 360° videos, a person can only observe the content in her viewport, which means that only a fraction of the 360° scene can be seen at any given time. In this paper, we study human attention over the viewport and propose a novel visual saliency model, dubbed viewport saliency, to predict fixations over 360° videos. Two contributions are...
This paper reviews the NTIRE 2022 Challenge on Super-Resolution and Quality Enhancement of Compressed Video. In this challenge, we proposed the LDV 2.0 dataset, which includes the LDV dataset (240 videos) and 95 additional videos. This challenge includes three tracks. Track 1 aims at enhancing the videos compressed by HEVC at a fixed QP. Track 2 and Track 3 target both the super-resolution and quality enhancement of compressed video. They require x2 and x4 super-resolution, respectively. The three tracks attracted more than 600 registrations in total. In the test phase, 8 teams, teams...
With the booming development of smart devices, mobile videos have drawn broad interest when humans surf social media. Different from traditional long-form videos, mobile videos are featured with uncertain human attention behavior so far owing to the specific displaying mode, thus promoting research on saliency prediction for mobile videos. Unfortunately, current eye-tracking experiments are not applicable to mobile videos, since the stationary eye-tracker and eye fixation acquisition are dedicated to videos presented on computers. To tackle this issue, we...
As a widely studied task, video restoration aims to enhance the quality of videos with multiple potential degradations, such as noise, blur and compression artifacts. Among video restoration tasks, compressed video quality enhancement and video super-resolution are two of the main tasks with significant value in practical scenarios. Recently, recurrent neural networks and transformers have attracted increasing research interest in this field, due to their impressive capability in sequence-to-sequence modeling. However, the training of these models is not only...
With the remarkable success of deep learning, image/video coding for machines (VCM) has been playing an important role in facilitating intelligent vision tasks. However, existing VCM methods suffer from either sub-optimality in using image compression standards, or generalisation issues in learning-based methods. To address these issues, this paper proposes a residual-based hierarchical feature (RHFC) method to achieve optimal and universal object detection and segmentation. More specifically, we...
Predicting video saliency is crucial for improving sports video processing efficiency, thereby providing an enriched viewing experience for a wide-ranging audience. However, there is a long-term absence of a well-established eye-tracking database and a learning-based approach, particularly tailored for sports videos. In this paper, we establish a large-scale database dubbed audio-visual sports (AVS). AVS consists of 1,000 high-quality sports videos with eye fixations from 60 participants. Through the data analysis on AVS, we observe that human...
This paper presents a radial basis function (RBF) network for the prediction of the continuous wood pulp delignification factor. In the pulp making process, the pulp quality is measured by K#, which is related to the lignin content remaining in the pulp. Availability of an accurate K# at any time during the digester operation is very critical for process control and for saving millions of dollars by reducing energy and raw material consumption. To assure operation, there are currently human experts who analyze samples in the plant's laboratory and then decide how the process variables....
This is an experimental study to compare the performance of the widespread backpropagation network (BP) with a radial basis function (RBF) network and a generalized regression neural network (GRNN) for potential use as on-line process models. Criteria of comparison include generalization ability to unseen data, robustness to process shifts, performance with sparse training data, and computational demands.
This paper presents the results of our experiments for the classification of mouse chromosomes using a radial basis function (RBF) network and a probabilistic neural network (PNN). The fast orthogonal search (FOS) was utilized for training the RBF network. There were 840 training and 540 testing chromosomes. The best error rate was recorded at 16.4%. This result is better than the best available result of 18.3%, which was achieved with much more
Visual and audio events simultaneously occur and both attract attention. However, most existing saliency prediction works ignore the influence of audio and only consider the vision modality. In this paper, we propose a multitask learning method for visual-audio saliency prediction and sound source localization on multi-face video by leveraging visual, audio and face information. Specifically, we first introduce a large-scale database of multi-face video in the visual-audio condition (MVVA), containing eye-tracking data and annotations. Using this database, we find that audio influences human...