Yilin Wang

ORCID: 0000-0003-4031-8753
Research Areas
  • Image and Video Quality Assessment
  • Advanced Image Processing Techniques
  • Bauxite Residue and Utilization
  • Image Enhancement Techniques
  • Extraction and Separation Processes
  • Advanced Image and Video Retrieval Techniques
  • Visual Attention and Saliency Detection
  • Advanced Neural Network Applications
  • Advanced Vision and Imaging
  • Industrial Vision Systems and Defect Detection
  • Generative Adversarial Networks and Image Synthesis
  • Video Analysis and Summarization
  • Recycling and utilization of industrial and municipal waste in materials production
  • Welding Techniques and Residual Stresses
  • Domain Adaptation and Few-Shot Learning
  • Aluminum Alloys Composites Properties
  • Multimodal Machine Learning Applications
  • Video Surveillance and Tracking Methods
  • Image Retrieval and Classification Techniques
  • Anomaly Detection Techniques and Applications
  • Additive Manufacturing Materials and Processes
  • Advanced Image Fusion Techniques
  • Image and Signal Denoising Methods
  • Advanced Welding Techniques Analysis
  • Computational Drug Discovery Methods

Google (United States)
2016-2025

Zhengzhou University
2025

Shandong Academy of Sciences
2024-2025

Qilu University of Technology
2023-2025

Chongqing University of Arts and Sciences
2025

Hong Kong Polytechnic University
2022-2024

University of Science and Technology Beijing
2021-2024

Adobe Systems (United States)
2020-2024

Tsinghua University
2017-2024

Jimei University
2022-2024

Image quality assessment (IQA) is an important research topic for understanding and improving visual experience. The current state-of-the-art IQA methods are based on convolutional neural networks (CNNs). The performance of CNN-based models is often compromised by the fixed shape constraint in batch training. To accommodate this, the input images are usually resized and cropped to a fixed shape, causing image quality degradation. To address this, we design a multi-scale image quality Transformer (MUSIQ) to process native resolution images with varying sizes...

10.1109/iccv48922.2021.00510 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We consider the problem of obtaining image quality representations in a self-supervised manner. We use prediction of distortion type and degree as an auxiliary task to learn features from an unlabeled image dataset containing a mixture of synthetic and realistic distortions. We then train a deep Convolutional Neural Network (CNN) using a contrastive pairwise objective to solve the auxiliary problem. We refer to the proposed training framework and resulting deep IQA model as the CONTRastive Image QUality Evaluator (CONTRIQUE). During evaluation, the CNN weights are...

10.1109/tip.2022.3181496 article EN publisher-specific-oa IEEE Transactions on Image Processing 2022-01-01

Blind or no-reference video quality assessment of user-generated content (UGC) has become a trending, challenging, heretofore unsolved problem. Accurate and efficient video quality predictors suitable for this content are thus in great demand to achieve more intelligent analysis and processing of UGC videos. Previous studies have shown that natural scene statistics and deep learning features are both sufficient to capture spatial distortions, which contribute to a significant aspect of UGC video quality issues. However, these models are either incapable...

10.1109/ojsp.2021.3090333 article EN cc-by IEEE Open Journal of Signal Processing 2021-01-01

We develop an approach to learning visual representations that embraces multimodal data, driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike existing visual pre-training methods, which solve a proxy prediction task in a single domain, our method exploits intrinsic data properties within each modality and semantic information from cross-modal correlation simultaneously, hence improving the quality of the learned visual representations. By including multimodal training in a unified framework with...

10.1109/cvpr46437.2021.00692 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Despite the impressive representation capacity of vision transformer models, current light-weight vision transformer models still suffer from inconsistent and incorrect dense predictions at local regions. We suspect that the power of their self-attention mechanism is limited in shallower and thinner networks. We propose Lite Vision Transformer (LVT), a novel light-weight transformer network with two enhanced self-attention mechanisms to improve the model performances for mobile deployment. For the low-level features, we introduce Convolutional Self-Attention (CSA)....

10.1109/cvpr52688.2022.01169 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

The combination of methane steam reforming technology and CCS (Carbon Capture and Storage) has great potential to reduce carbon emissions in the process of hydrogen production. Different from the traditional idea of capturing CO2 (Carbon Dioxide) in the exhaust gas with high work consumption, this study simultaneously focuses on CO2 separation from the fuel gas and recycling. A new hydrogen production system is developed by methane steam reforming coupled with carbon capture. Separated and captured high-purity carbon dioxide could be recycled for methane dry reforming; on this basis, a...

10.1016/j.enconman.2022.116199 article EN cc-by-nc-nd Energy Conversion and Management 2022-09-18

In this paper, we presented a real-time 2D human gesture grading system from monocular images based on OpenPose, a library for real-time multi-person keypoint detection. After capturing the 2D positions of a person's joints and the skeleton wireframe of the body, the system computed the equation of motion trajectory for every joint. The similarity metric was defined as the distance between the motion trajectories of standard and real-time videos. A modifiable scoring formula was used for simulating the gesture grading scenario. Experimental results showed that the system worked efficiently with high real-time performance,...

10.1109/cisp-bmei.2017.8301910 article EN 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) 2017-10-01

We propose Mask Guided (MG) Matting, a robust matting framework that takes a general coarse mask as guidance. MG Matting leverages a network (PRN) design which encourages the matting model to provide self-guidance to progressively refine the uncertain regions through the decoding process. A series of guidance mask perturbation operations are also introduced in the training to further enhance its robustness to external guidance. We show that PRN can generalize to unseen types of guidance masks such as trimap and low-quality alpha matte, making it suitable for...

10.1109/cvpr46437.2021.00121 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices, and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC videos are unpredictable, complicated, and often commingled....

10.1109/tip.2021.3072221 article EN IEEE Transactions on Image Processing 2021-01-01

Video quality assessment for User Generated Content (UGC) is an important topic in both industry and academia. Most existing methods only focus on one aspect of the perceptual quality assessment, such as technical quality or compression artifacts. In this paper, we create a large scale dataset to comprehensively investigate characteristics of generic UGC video quality. Besides the subjective ratings and content labels of the dataset, we also propose a DNN-based framework to thoroughly analyze the importance of content, technical quality, and compression level in perceptual quality. Our...

10.1109/cvpr46437.2021.01323 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

The vapor pressures and the aqueous solubilities of 411 compounds with a large structural diversity were investigated using a quantitative structure−property relationship (QSPR) approach. A five-descriptor equation with a squared correlation coefficient (R2) of 0.949 for vapor pressure and a six-descriptor equation with R2 of 0.879 for aqueous solubility were obtained. All descriptors were derived solely from the chemical structure of the compounds. The QSPR equations allow reliable prediction of water−air partition coefficients.

10.1021/ci980022t article EN Journal of Chemical Information and Computer Sciences 1998-06-30

In this proposal, we study the problem of understanding human sentiments from a large scale collection of Internet images based on both image features and contextual social network information (such as friend comments and user description). Despite the great strides in analyzing user sentiment based on text information, the analysis of sentiment behind the image content has largely been ignored. Thus, we extend the significant advances in text-based sentiment prediction tasks to the higher level challenge of predicting the underlying sentiments behind the images. We show that neither visual features nor...

10.1109/icdmw.2015.142 article EN 2015-11-01

Understanding human actions in wild videos is an important task with a broad range of applications. In this paper we propose a novel approach named Hierarchical Attention Network (HAN), which makes it possible to incorporate static spatial information, short-term motion information and long-term video temporal structures for complex human action understanding. Compared to recent convolutional neural network based approaches, HAN has the following advantages: (1) HAN can efficiently capture video temporal structures in a longer range; (2) HAN is able to reveal...

10.48550/arxiv.1607.06416 preprint EN cc-by arXiv (Cornell University) 2016-01-01

Non-professional video, commonly known as User Generated Content (UGC), has become very popular in today's video sharing applications. However, traditional metrics used in compression and quality assessment, like BD-Rate and PSNR, are designed for pristine originals. Thus, their accuracy drops significantly when applied to non-pristine originals (the majority of UGC). Understanding the difficulties of compression and quality assessment in the scenario of UGC is important, but there are few public UGC datasets available for research. This...

10.1109/mmsp.2019.8901772 preprint EN 2019-09-01

With the development of data-driven models, deep learning has been increasingly applied in the field of defect detection. However, the performance of deep learning models is greatly restricted by costly labeling and sample scarcity. One of the best approaches to solve the data imbalance problem is increasing the quantity and diversity of samples. Meanwhile, current models based on generative adversarial network (GAN) cannot readily control the category and shape of generated samples, which results in inefficient augmentation. Thus, to simultaneously achieve...

10.1109/tim.2022.3160542 article EN IEEE Transactions on Instrumentation and Measurement 2022-01-01

The Fourier Transform (FT) is a linear transformation for the primitive function. It takes some set of functions to be an orthogonal basis. Its physical meaning is to transfer the function onto each of the base functions. Because it can convert between the time and frequency domains, the FT is widely employed in many fields. The Fractional Fourier Transform (FrFT) is an improvement and progress based on the FT. This paper will define the FrFT. Then the distinction between the FT and the FrFT is discussed. Finally, specific examples of its application in processing digital images are provided....

10.54254/2753-8818/42/20240103 article EN cc-by Theoretical and Natural Science 2024-06-24

Banding artifact, or false contouring, is a common video compression impairment that tends to appear on large flat regions in encoded videos. These staircase-shaped color bands can be very noticeable in high-definition videos. Here we study this artifact, and propose a new distortion-specific no-reference video quality model for predicting banding artifacts, called the Blind BANding Detector (BBAND index). BBAND is inspired by human visual models. The proposed detector can generate a pixel-wise banding visibility map and output a banding severity...

10.1109/icassp40776.2020.9053634 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

High frame rate (HFR) videos are becoming increasingly common with the tremendous popularity of live, high-action streaming content such as sports. Although HFR contents are generally of very high quality, high bandwidth requirements make them challenging to deliver efficiently, while simultaneously maintaining their quality. To optimize trade-offs between bandwidth requirements and video quality, in terms of frame rate adaptation, it is imperative to understand the intricate relationship between frame rate and perceptual video quality. Towards advancing progression in this direction we designed...

10.1109/access.2021.3100462 article EN cc-by IEEE Access 2021-01-01

Because of the increasing ease of video capture, many millions of consumers create and upload large volumes of User-Generated-Content (UGC) videos to social and streaming media sites over the Internet. UGC videos are commonly captured by naive users having limited skills and imperfect techniques, and tend to be afflicted by mixtures of highly diverse in-capture distortions. These UGC videos are then often uploaded for sharing onto cloud servers, where they are further compressed for storage and transmission. Our paper tackles the highly practical problem of predicting...

10.1109/tip.2021.3107213 article EN IEEE Transactions on Image Processing 2021-01-01