- Image and Video Quality Assessment
- Advanced Image Processing Techniques
- Bauxite Residue and Utilization
- Image Enhancement Techniques
- Extraction and Separation Processes
- Advanced Image and Video Retrieval Techniques
- Visual Attention and Saliency Detection
- Advanced Neural Network Applications
- Advanced Vision and Imaging
- Industrial Vision Systems and Defect Detection
- Generative Adversarial Networks and Image Synthesis
- Video Analysis and Summarization
- Recycling and Utilization of Industrial and Municipal Waste in Materials Production
- Welding Techniques and Residual Stresses
- Domain Adaptation and Few-Shot Learning
- Aluminum Alloys Composites Properties
- Multimodal Machine Learning Applications
- Video Surveillance and Tracking Methods
- Image Retrieval and Classification Techniques
- Anomaly Detection Techniques and Applications
- Additive Manufacturing Materials and Processes
- Advanced Image Fusion Techniques
- Image and Signal Denoising Methods
- Advanced Welding Techniques Analysis
- Computational Drug Discovery Methods
Google (United States)
2016-2025
Zhengzhou University
2025
Shandong Academy of Sciences
2024-2025
Qilu University of Technology
2023-2025
Chongqing University of Arts and Sciences
2025
Hong Kong Polytechnic University
2022-2024
University of Science and Technology Beijing
2021-2024
Adobe Systems (United States)
2020-2024
Tsinghua University
2017-2024
Jimei University
2022-2024
Image quality assessment (IQA) is an important research topic for understanding and improving visual experience. The current state-of-the-art IQA methods are based on convolutional neural networks (CNNs). The performance of CNN-based models is often compromised by the fixed shape constraint in batch training. To accommodate this, the input images are usually resized and cropped to a fixed shape, causing image quality degradation. To address this, we design a multi-scale image quality Transformer (MUSIQ) to process native resolution images with varying sizes...
We consider the problem of obtaining image quality representations in a self-supervised manner. We use prediction of distortion type and degree as an auxiliary task to learn features from an unlabeled image dataset containing a mixture of synthetic and realistic distortions. We then train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We refer to the proposed training framework and the resulting IQA model as the CONTRastive Image QUality Evaluator (CONTRIQUE). During evaluation, the CNN weights are...
Blind or no-reference video quality assessment of user-generated content (UGC) has become a trending, challenging, heretofore unsolved problem. Accurate and efficient video quality predictors suitable for this content are thus in great demand to achieve more intelligent analysis and processing of UGC videos. Previous studies have shown that natural scene statistics and deep learning features are both sufficient to capture spatial distortions, which contribute to a significant aspect of UGC video quality issues. However, these models are either incapable...
We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy prediction task in a single domain, our method exploits intrinsic data properties within each modality and semantic information from cross-modal correlation simultaneously, hence improving the quality of the learned visual representations. By including multimodal training in a unified framework with...
Despite the impressive representation capacity of vision transformer models, current light-weight vision transformer models still suffer from inconsistent and incorrect dense predictions at local regions. We suspect that the power of their self-attention mechanism is limited in shallower and thinner networks. We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve the model performances for mobile deployment. For the low-level features, we introduce Convolutional Self-Attention (CSA)....
The combination of methane steam reforming technology and CCS (Carbon Capture and Storage) has great potential to reduce carbon emissions in the process of hydrogen production. Different from the traditional idea of capturing CO2 (Carbon Dioxide) in the exhaust gas with high work consumption, this study simultaneously focuses on separation and fuel recycling. A new hydrogen production system is developed by coupling methane steam reforming with carbon capture. Separated and captured high-purity carbon dioxide could be recycled for dry reforming; on this basis, a...
In this paper, we presented a real-time 2D human gesture grading system from monocular images based on OpenPose, a library for multi-person keypoint detection. After capturing the positions of a person's joints and the skeleton wireframe of the body, the system computed the equation of the motion trajectory of every joint. The similarity metric was defined as the distance between these motion trajectories and those of the standard videos. A modifiable scoring formula was used for simulating the grading scenario. Experimental results showed that the system worked efficiently with high performance,...
We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance. MG Matting leverages a Progressive Refinement Network (PRN) design which encourages the matting model to provide self-guidance to progressively refine the uncertain regions through the decoding process. A series of guidance mask perturbation operations are also introduced in the training to further enhance its robustness to external guidance. We show that PRN can generalize to unseen types of guidance masks such as trimap and low-quality alpha matte, making it suitable for...
Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC videos are unpredictable, complicated, and often commingled....
Video quality assessment for User Generated Content (UGC) is an important topic in both industry and academia. Most existing methods only focus on one aspect of the perceptual quality assessment, such as technical quality or compression artifacts. In this paper, we create a large scale dataset to comprehensively investigate the characteristics of generic UGC video quality. Besides the subjective ratings and content labels of the dataset, we also propose a DNN-based framework to thoroughly analyze the importance of content, technical quality, and compression level. Our...
The vapor pressures and the aqueous solubilities of 411 compounds with a large structural diversity were investigated using a quantitative structure−property relationship (QSPR) approach. A five-descriptor equation with a squared correlation coefficient (R2) of 0.949 for vapor pressure and a six-descriptor equation with an R2 of 0.879 for aqueous solubility were obtained. All descriptors were derived solely from the chemical structure of the compounds. The QSPR equations allow reliable prediction of water−air partition coefficients.
In this proposal, we study the problem of understanding human sentiments from a large scale collection of Internet images based on both image features and contextual social network information (such as friend comments and user description). Despite the great strides in analyzing sentiment based on text information, the analysis of the sentiment behind image content has largely been ignored. Thus, we extend the significant advances in text-based sentiment prediction tasks to the higher level challenge of predicting the underlying sentiments behind the images. We show that neither visual features nor...
Understanding human actions in wild videos is an important task with a broad range of applications. In this paper we propose a novel approach named Hierarchical Attention Network (HAN), which incorporates static spatial information, short-term motion information and long-term video temporal structures for complex human action understanding. Compared to recent convolutional neural network based approaches, HAN has the following advantages: (1) it can efficiently capture video temporal structures over a longer range; (2) it is able to reveal...
Non-professional video, commonly known as User Generated Content (UGC), has become very popular in today's video sharing applications. However, traditional metrics used in compression and quality assessment, like BD-Rate and PSNR, are designed for pristine originals. Thus, their accuracy drops significantly when they are applied to non-pristine originals (the majority of UGC). Understanding the difficulties of compression and quality assessment in the scenario of UGC is important, but there are few public UGC datasets available for research. This...
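For readers unfamiliar with PSNR, one of the full-reference metrics named in the abstract above, here is a minimal Python sketch. It assumes 8-bit frames stored as flat lists of pixel values; the function name `psnr` and the sample values are illustrative and not taken from the paper. Because the score is computed against a reference frame, it implicitly treats that reference as pristine, which is the limitation the abstract points out for UGC.

```python
import math

def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the distorted pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)

# Illustrative (made-up) pixel values for a reference and a distorted frame.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
distorted = [50, 56, 60, 62, 78, 60, 77, 63]
score = psnr(reference, distorted)
```

If the reference itself already contains capture or compression distortions, a high score only says the two frames are similar, not that either looks good.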
With the development of data-driven models, deep learning has been increasingly applied in the field of defect detection. However, the performance of deep learning models is greatly restricted by costly labeling and sample scarcity. One of the best approaches to solve the data imbalance problem is increasing the quantity and diversity of defect samples. Meanwhile, current methods based on the generative adversarial network (GAN) cannot readily control the category and shape of the generated samples, which results in inefficient augmentation. Thus, to simultaneously achieve...
The Fourier Transform (FT) is a linear transformation of the primitive function. It takes some set of functions to be an orthogonal basis. Its physical meaning is to project the function onto each of the basis functions. Because it can convert between the time and frequency domains, the FT is widely employed in many fields. The Fractional Fourier Transform (FrFT) is an improvement based on the FT. This paper will define the FrFT. Then the distinction between the FT and the FrFT will be discussed. Finally, specific examples of its application in processing digital images are provided....
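The projection onto an orthogonal basis described in the abstract above can be sketched with the ordinary discrete Fourier transform. This is a naive O(n²) illustration in Python, not the FrFT and not code from the paper; the name `dft` and the 16-sample test signal are assumptions made for the example.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform: project the signal onto each
    complex exponential basis function exp(-2*pi*i*k*t/n)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

# A time-domain cosine that completes 3 cycles over 16 samples.
n = 16
signal = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]

spectrum = dft(signal)
magnitudes = [abs(c) for c in spectrum]

# Search only the non-negative frequencies (bins 0..n/2) for the strongest one.
peak_bin = max(range(n // 2 + 1), key=lambda k: magnitudes[k])
```

The energy of the cosine concentrates in bin 3 and its mirror bin 13, which is the time-to-frequency conversion the abstract refers to.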
Banding artifact, or false contouring, is a common video compression impairment that tends to appear on large flat regions in encoded videos. These staircase-shaped color bands can be very noticeable in high-definition videos. Here we study this artifact and propose a new distortion-specific no-reference video quality model for predicting banding artifacts, called the Blind BANding Detector (BBAND index). BBAND is inspired by human visual models. The proposed detector can generate a pixel-wise banding visibility map and output a banding severity...
High frame rate (HFR) videos are becoming increasingly common with the tremendous popularity of live, high-action streaming content such as sports. Although HFR contents are generally of very high quality, their high bandwidth requirements make them challenging to deliver efficiently while simultaneously maintaining their quality. To optimize the trade-offs between bandwidth requirements and video quality in terms of frame rate adaptation, it is imperative to understand the intricate relationship between frame rate and perceptual video quality. Towards advancing progression in this direction we designed...
Because of the increasing ease of video capture, many millions of consumers create and upload large volumes of User-Generated-Content (UGC) videos to social and streaming media sites over the Internet. UGC videos are commonly captured by naive users having limited skills and imperfect techniques, and tend to be afflicted by mixtures of highly diverse in-capture distortions. These UGC videos are then often uploaded for sharing onto cloud servers, where they are further compressed for storage and transmission. Our paper tackles the practical problem of predicting...