- Generative Adversarial Networks and Image Synthesis
- Advanced Image and Video Retrieval Techniques
- Computer Graphics and Visualization Techniques
- Image Enhancement Techniques
- Visual Attention and Saliency Detection
- Multimodal Machine Learning Applications
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Advanced Vision and Imaging
- Advanced Image Processing Techniques
- 3D Shape Modeling and Analysis
- Video Analysis and Summarization
- Aesthetic Perception and Analysis
- Music and Audio Processing
- Face Recognition and Analysis
- Human Motion and Animation
- Advanced Image Fusion Techniques
- Video Surveillance and Tracking Methods
- Music Technology and Sound Studies
- Image Processing Techniques and Applications
- Cancer-related Molecular Mechanisms Research
- Speech and Audio Processing
- 3D Surveying and Cultural Heritage
- Image Processing and 3D Reconstruction
Chinese Academy of Sciences
2016-2025
Institute of Automation
2016-2025
Beijing Academy of Artificial Intelligence
2020-2024
University of Chinese Academy of Sciences
2018-2024
Shandong Institute of Automation
2009-2024
University College of Applied Science
2023
Institute of Automation
2009-2021
Jilin University
2021
Shandong Institute of Business and Technology
2021
Shandong University
2021
Object detection has achieved remarkable progress in the past decade. However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) the receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and aligned along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) the limited datasets hinder the development of this task. To resolve the first two issues, we...
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content. Owing to the locality of convolutional neural networks (CNNs), extracting and maintaining the global information of input images is difficult. Therefore, traditional neural style transfer methods suffer from biased content representation. To address this critical issue, we take long-range dependencies of input images into account for image style transfer by proposing a transformer-based approach called StyTr2. In contrast with visual transformers for other vision...
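A minimal sketch of the kind of mechanism this abstract points to: content patch tokens attending to style patch tokens via cross-attention, so every content patch can draw on any style patch (the long-range dependency the abstract mentions). The module, names, and dimensions are illustrative assumptions, not the actual StyTr2 architecture.

```python
# Hypothetical sketch: stylizing content tokens with cross-attention to style tokens.
import torch
import torch.nn as nn

class CrossAttnStylizer(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, content_tokens, style_tokens):
        # Content tokens query the style tokens: each content patch can attend
        # to any style patch, regardless of spatial distance.
        out, _ = self.attn(query=content_tokens, key=style_tokens, value=style_tokens)
        return self.norm(content_tokens + out)

content = torch.randn(1, 196, 512)   # 14x14 grid of content patch embeddings
style = torch.randn(1, 196, 512)     # style patch embeddings
print(CrossAttnStylizer()(content, style).shape)  # torch.Size([1, 196, 512])
```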
Arbitrary style transfer is a significant topic with both research value and application prospects. A desired style transfer, given a content image and a referenced style painting, would render the content image with the color tone and vivid stroke patterns of the style painting while synchronously maintaining the detailed content structure information. Style transfer approaches initially learn content and style representations of their inputs and then generate stylized images guided by these representations. In this paper, we propose a multi-adaptation network which involves two self-adaptation (SA)...
The artistic style within a painting is its means of expression, which includes not only the painting material, colors, and brushstrokes, but also high-level attributes such as semantic elements and object shapes. Previous arbitrary example-guided image generation methods often fail to control shape changes or convey semantic elements. Pre-trained text-to-image synthesis diffusion probabilistic models have achieved remarkable quality but require extensive textual descriptions to accurately portray the attributes of a particular...
In this work, we tackle the challenging problem of arbitrary image style transfer using a novel style feature representation learning method. A suitable style representation, as a key component in image stylization tasks, is essential to achieving satisfactory results. Existing deep neural network based approaches achieve reasonable results with guidance from second-order statistics, such as the Gram matrix of content features. However, they do not leverage sufficient style information, which leads to artifacts such as local distortions and...
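For reference, the second-order statistic this abstract criticizes is the Gram matrix of a CNN feature map, as used in classic neural style transfer. A minimal sketch of that computation (dimensions are illustrative):

```python
# Gram matrix of a feature map: channel-by-channel correlations, averaged over positions.
import torch

def gram_matrix(feat):
    # feat: (B, C, H, W) feature map from a CNN layer
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)  # (B, C, C)

feat = torch.randn(2, 64, 32, 32)
print(gram_matrix(feat).shape)  # torch.Size([2, 64, 64])
```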
Video style transfer is attracting increasing attention from the artificial intelligence community because of its numerous applications, such as augmented reality and animation production. Compared with traditional image style transfer, video style transfer presents new challenges, including how to effectively generate satisfactory stylized results for any specified style while maintaining temporal coherence across frames. Towards this end, we propose a Multi-Channel Correlation network (MCCNet), which can be trained to fuse...
Vision transformers (ViTs) have recently received explosive popularity, but their huge computational cost remains a severe issue. Since the computational complexity of a ViT is quadratic with respect to the input sequence length, a mainstream paradigm for computation reduction is to reduce the number of tokens. Existing designs include structured spatial compression, which uses a progressive shrinking pyramid to reduce the computation on large feature maps, and unstructured token pruning, which dynamically drops redundant tokens. However, the limitation of existing designs lies in...
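To make the "unstructured token pruning" idea concrete, here is an illustrative sketch (not this paper's method): keep only the patch tokens that receive the most attention from the [CLS] token.

```python
# Keep the top-k patch tokens ranked by the [CLS] attention they receive.
import torch

def prune_tokens(tokens, cls_attn, keep_ratio=0.5):
    # tokens: (B, N, D) patch tokens; cls_attn: (B, N) attention from [CLS] to patches
    k = max(1, int(tokens.shape[1] * keep_ratio))
    idx = cls_attn.topk(k, dim=1).indices                      # (B, k)
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])   # (B, k, D)
    return torch.gather(tokens, 1, idx)

tokens = torch.randn(2, 196, 384)
cls_attn = torch.rand(2, 196)
print(prune_tokens(tokens, cls_attn).shape)  # torch.Size([2, 98, 384])
```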
We present a novel method for content-aware image resizing based on the optimization of a well-defined distance function, which preserves both the important regions and the global visual effect (the background or other decorative objects) of an image. The method operates by the joint use of seam carving and scaling. The principle behind our method is a bidirectional similarity function based on the image Euclidean distance (IMED), cooperating with a dominant color descriptor (DCD) energy variation. A suitable quantitative evaluation enables the determination of the best...
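A simplified sketch of the seam-carving half of such a pipeline: dynamic programming over an energy map to find the vertical seam with minimum cumulative energy. The IMED/DCD energy terms from the abstract are replaced by a plain gradient magnitude here, so this is only an assumption-laden illustration of the underlying operator.

```python
# Find the minimum-energy vertical seam of a grayscale image via dynamic programming.
import numpy as np

def min_vertical_seam(gray):
    energy = np.abs(np.gradient(gray, axis=0)) + np.abs(np.gradient(gray, axis=1))
    h, w = energy.shape
    cost = energy.copy()
    for i in range(1, h):
        left = np.roll(cost[i - 1], 1);   left[0] = np.inf   # up-left neighbour
        right = np.roll(cost[i - 1], -1); right[-1] = np.inf  # up-right neighbour
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    # Backtrack the seam column indices from bottom to top.
    seam = [int(np.argmin(cost[-1]))]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(0, j - 1), min(w, j + 2)
        seam.append(lo + int(np.argmin(cost[i, lo:hi])))
    return seam[::-1]

print(min_vertical_seam(np.random.rand(6, 8)))
```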
This paper presents a novel tree-based cost aggregation method for dense stereo matching. Instead of employing the minimum spanning tree (MST) and its variants, a new tree structure, the "Segment-Tree", is proposed for non-local matching cost aggregation. Conceptually, the segment-tree is constructed in a three-step process: first, the pixels are grouped into a set of segments with reference to the color or intensity image; second, a tree graph is created for each segment; and in the final step, these independent segment graphs are linked to form the tree structure. In...
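A rough sketch of the first step only, under simplifying assumptions: pixels are grouped into segments by merging 4-neighbour edges in order of colour difference with a union-find structure (a Felzenszwalb-style grouping on a grayscale image). The paper then builds a tree per segment and links the segment trees; that part is omitted here.

```python
# Group pixels of a grayscale image into segments by thresholded edge merging.
import numpy as np

def segment_pixels(img, threshold=0.1):
    h, w = img.shape[:2]
    parent = list(range(h * w))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []                        # (colour difference, pixel a, pixel b)
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(float(img[y, x]) - float(img[y, x + 1])), y * w + x, y * w + x + 1))
            if y + 1 < h:
                edges.append((abs(float(img[y, x]) - float(img[y + 1, x])), y * w + x, (y + 1) * w + x))
    for wgt, a, b in sorted(edges):   # merge cheap edges first
        if wgt < threshold:
            parent[find(a)] = find(b)
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)

print(segment_pixels(np.random.rand(4, 5)))
```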
Aggregation structures with explicit information, such as image attributes and scene semantics, are effective and popular for intelligent systems that assess the aesthetics of visual data. However, such useful information may not be available due to the high cost of manual annotation and expert design. In this paper, we present a novel multi-patch (MP) aggregation method for image aesthetic assessment. Different from state-of-the-art methods, which augment an MP aggregation network with various attributes, we train the model in an end-to-end manner...
Transformers, the dominant architecture for natural language processing, have also recently attracted much attention from computational visual media researchers due to their capacity for long-range representation and high performance. Transformers are sequence-to-sequence models, which use a self-attention mechanism rather than the RNN sequential structure. Thus, such models can be trained in parallel and can represent global information. This study comprehensively surveys recent visual transformer works....
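The self-attention mechanism the survey refers to is standard scaled dot-product attention; written out explicitly below as a reminder (dimensions and projection matrices are illustrative, not taken from the survey):

```python
# Scaled dot-product self-attention: every token attends to all other tokens.
import torch
import torch.nn.functional as F

def self_attention(x, wq, wk, wv):
    # x: (B, N, D) token embeddings; wq/wk/wv: (D, D) projection matrices
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)  # (B, N, N)
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(1, 16, 64)
w = [torch.randn(64, 64) for _ in range(3)]
print(self_attention(x, *w).shape)  # torch.Size([1, 16, 64])
```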
Weakly supervised object localization (WSOL) remains an open problem given the difficulty of finding object extent information using a classification network. Although prior works have struggled to localize objects through various spatial regularization strategies, we argue that how to extract object structural information from the trained classification network has been neglected. In this paper, we propose a two-stage approach, termed structure-preserving activation (SPA), toward fully leveraging the structure information incorporated in convolutional features for...
Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages...
Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability on downstream vision tasks with appropriate text prompts. Instead of designing prompts manually, Context Optimization (CoOp) has recently been proposed to learn continuous prompts using task-specific training data. Despite the performance improvements on downstream tasks, several studies have reported that CoOp suffers from an overfitting issue in two aspects: (i) the test accuracy on base classes first improves and then...
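A hedged sketch of the CoOp idea the abstract builds on: a few learnable context vectors, shared across classes, are prepended to each class-name embedding, and classification scores are cosine similarities between image features and the resulting prompt features. The encoders below are stand-ins, not the real CLIP modules.

```python
# Learnable prompt context for a CLIP-like classifier (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    def __init__(self, n_cls, n_ctx=4, dim=512):
        super().__init__()
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)               # learnable context
        self.cls_emb = nn.Parameter(torch.randn(n_cls, 1, dim), requires_grad=False)  # frozen class-name embeddings

    def forward(self):
        n_cls = self.cls_emb.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_cls, -1, -1)
        return torch.cat([ctx, self.cls_emb], dim=1)   # (n_cls, n_ctx + 1, dim)

def logits(image_feat, text_feats, scale=100.0):
    # cosine similarity between image features and each class prompt feature
    img = F.normalize(image_feat, dim=-1)
    txt = F.normalize(text_feats, dim=-1)
    return scale * img @ txt.t()

prompts = PromptLearner(n_cls=10)()       # would normally be fed to a frozen text encoder
text_feats = prompts.mean(dim=1)          # placeholder for the text-encoder output
print(logits(torch.randn(2, 512), text_feats).shape)  # torch.Size([2, 10])
```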
Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user. Unlike previous image-to-image transfer approaches, text-guided stylization provides users with a more precise and intuitive way to express the desired style. However, the huge discrepancy between cross-modal inputs/outputs makes it challenging to conduct text-driven image stylization in a typical feed-forward...
In this paper, we address the following research problem: how can we generate a meaningful split grammar that explains a given facade layout? To evaluate whether a grammar is meaningful, we propose a cost function based on description length and minimize it using an approximate dynamic programming framework. Our evaluation indicates that our framework extracts grammars that are competitive with those of expert users, while some users and all competing automatic solutions are less successful.
This paper presents a novel content-based method for transferring the colour patterns between images. Unlike previous methods that rely on image colour statistics, our method puts an emphasis on high-level scene content analysis. We first automatically extract the foreground subject areas and the background scene layout from the scene. The semantic correspondences of the regions between the source and target images are then established. In the second step, the source image is re-coloured in an optimization framework, which incorporates the extracted content information and spatial...
In this paper, we address the problem of natural flower classification. It is a challenging task due to non-rigid deformation, illumination changes, and inter-class similarity. We build a large dataset of flower images in the wild with 79 categories and propose a novel framework based on a convolutional neural network (CNN) to solve the problem. Unlike other methods that use hand-crafted visual features, our method utilizes a CNN to automatically learn good features for classification. The network consists of five layers, where small receptive fields are...
Arbitrary image stylization by neural networks has become a popular topic, and video stylization is attracting more attention as an extension of image stylization. However, when image stylization methods are applied to videos, unsatisfactory results that suffer from severe flickering effects appear. In this article, we conduct a detailed and comprehensive analysis of the cause of such flickering effects. Systematic comparisons among typical neural style transfer approaches show that the feature migration modules of state-of-the-art (SOTA) learning systems...
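One common way to quantify the flicker discussed here (an illustrative convention, not necessarily the measure used in this article) is a temporal-consistency error: warp the previous stylized frame to the current one with optical flow and measure the masked difference in non-occluded regions.

```python
# Temporal consistency between consecutive stylized frames, given flow and an occlusion mask.
import torch
import torch.nn.functional as F

def temporal_loss(stylized_t, stylized_prev, flow, mask):
    # stylized_*: (B, C, H, W); flow: (B, 2, H, W) in pixels (x, y); mask: (B, 1, H, W)
    b, _, h, w = flow.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0).to(flow)  # (1, 2, H, W)
    coords = base + flow
    coords[:, 0] = 2 * coords[:, 0] / (w - 1) - 1   # normalize x to [-1, 1]
    coords[:, 1] = 2 * coords[:, 1] / (h - 1) - 1   # normalize y to [-1, 1]
    grid = coords.permute(0, 2, 3, 1)               # (B, H, W, 2) for grid_sample
    warped = F.grid_sample(stylized_prev, grid, align_corners=True)
    return ((stylized_t - warped).abs() * mask).mean()

x_t, x_p = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
flow, mask = torch.zeros(1, 2, 64, 64), torch.ones(1, 1, 64, 64)
print(float(temporal_loss(x_t, x_p, flow, mask)))
```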
Facial action unit (AU) intensity estimation plays an important role in affective computing and human-computer interaction. Recent works have introduced deep neural networks for AU intensity estimation, but they require a large amount of intensity annotations. AU annotation needs strong domain expertise, and it is expensive to construct an extensive database to learn deep models. We propose a novel knowledge-based semi-supervised deep convolutional network for AU intensity estimation with extremely limited annotations. Only the annotations of peak and valley frames in training sequences are...
Facial action units (AUs) play an important role in human emotion understanding. One big challenge for data-driven AU recognition approaches is the lack of enough annotations, since AU annotation requires strong domain expertise. To alleviate this issue, we propose a knowledge-driven method for jointly learning multiple AU classifiers without any AU annotation by leveraging prior probabilities on AUs, including expression-independent and expression-dependent probabilities. These prior probabilities are drawn from facial anatomy...
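A toy illustration (not this paper's actual formulation) of how prior probabilities can supervise classifiers in the absence of labels: push the batch-level AU activation rates predicted by the network towards known prior activation rates.

```python
# Match empirical AU activation rates in a batch to prior probabilities.
import torch

def prior_matching_loss(au_logits, prior_probs):
    # au_logits: (B, n_AU) raw outputs; prior_probs: (n_AU,) expected activation rates
    pred_rates = torch.sigmoid(au_logits).mean(dim=0)   # empirical rate per AU in the batch
    return ((pred_rates - prior_probs) ** 2).sum()

logits = torch.randn(32, 12, requires_grad=True)
priors = torch.full((12,), 0.3)   # hypothetical priors; real values come from facial anatomy
loss = prior_matching_loss(logits, priors)
loss.backward()
print(float(loss))
```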
Automatic intensity estimation of facial action units (AUs) is challenging in two aspects. First, capturing subtle changes in facial appearance is quite difficult. Second, the annotation of AU intensity is scarce and expensive; intensity annotation requires strong domain knowledge, and thus only experts are qualified. The majority of methods directly apply supervised learning techniques to AU intensity estimation, while few exploit unlabeled samples to improve performance. In this paper, we propose a novel weakly supervised regression model, Bilateral Ordinal Relevance...
Personalized image aesthetic assessment (PIAA) has recently become a hot topic due to its wide applications, such as photography, film and television, e-commerce, and fashion design. This task is strongly affected by subjective factors and the limited samples provided by users. In order to acquire a precise personalized aesthetic distribution from a small amount of samples, we propose a novel user-guided framework. The framework leverages user interactions to retouch and rank images for personalized aesthetic assessment based on deep reinforcement learning (DRL),...
This work presents Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework, which can fit in most existing arbitrary image style transfer models, such as CNN-based, ViT-based, and flow-based methods. As a key component in stylization tasks, a suitable style representation is essential to achieve satisfactory results. Existing approaches based on deep neural networks typically use second-order statistics to generate the output. However, these hand-crafted features computed from a single...
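A hedged sketch of contrastive style representation learning in the spirit of this abstract: embeddings of two views of the same artwork form a positive pair, while other artworks in the batch serve as negatives. This is a standard InfoNCE objective, not the exact UCAST formulation.

```python
# InfoNCE-style contrastive loss over style embeddings.
import torch
import torch.nn.functional as F

def style_contrastive_loss(z1, z2, temperature=0.1):
    # z1, z2: (B, D) style embeddings of two views of the same artworks
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(z1.shape[0])       # diagonal entries are the positive pairs
    return F.cross_entropy(logits, targets)

z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(float(style_contrastive_loss(z1, z2)))
```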