NFDI4DS | UHH-SEMS - Publication Details

Zhenyu Xie

ORCID: 0000-0001-9207-1014

A5060590214

Research Areas

3D Shape Modeling and Analysis
Generative Adversarial Networks and Image Synthesis
Advanced Vision and Imaging
Computer Graphics and Visualization Techniques
Magnetic Bearings and Levitation Dynamics
Human Pose and Action Recognition
Video Surveillance and Tracking Methods
Tribology and Lubrication Engineering
Human Motion and Animation
Industrial Technology and Control Systems
Image Enhancement Techniques
Visual Attention and Saliency Detection
Advanced Image Processing Techniques
Face recognition and analysis
Anomaly Detection Techniques and Applications
Martial Arts: Techniques, Psychology, and Education
Image and Video Quality Assessment
Polynomial and algebraic computation
Advanced machining processes and optimization
Aesthetic Perception and Analysis
IoT-based Smart Home Systems
Advanced Optimization Algorithms Research
Gear and Bearing Dynamics Analysis
Hand Gesture Recognition Systems
Numerical Methods and Algorithms

Sun Yat-sen University
2019-2025

Shanghai Jiao Tong University
2020-2024

University of Electronic Science and Technology of China
2015

Nanjing University of Aeronautics and Astronautics
2008-2014

Fashion Editing With Adversarial Parsing Learning

OPENALEX - Publications

Haoye Dong, Xiaodan Liang, Yixuan Zhang, Xu‐Jie Zhang, Xiaohui Shen, and 3 more

Interactive fashion image manipulation, which enables users to edit images with sketches and color strokes, is an interesting research problem with great application value. Existing works often treat it as a general inpainting task and do not fully leverage the semantic structural information in images. Moreover, they directly utilize conventional convolution and normalization layers to restore the incomplete image, which tends to wash away the sketch information. In this paper, we propose a novel Fashion Editing Generative...

10.1109/cvpr42600.2020.00814 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

M3D-VTON: A Monocular-to-3D Virtual Try-On Network

Fuwei Zhao, Zhenyu Xie, Michael Kampffmeyer, Haoye Dong, Songfang Han, and 3 more

Virtual 3D try-on can provide an intuitive and realistic view for online shopping and has a huge potential commercial value. However, existing 3D virtual try-on methods mainly rely on annotated 3D human shapes and garment templates, which hinders their applications in practical scenarios. 2D virtual try-on approaches provide a faster alternative to manipulate clothed humans, but lack the rich and realistic 3D representation. In this paper, we propose a novel Monocular-to-3D Virtual Try-On Network (M3D-VTON) that builds on the merits of both 2D and 3D approaches. By integrating...

10.1109/iccv48922.2021.01299 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Dressing in the Wild by Watching Dance Videos

Xin Dong, Fuwei Zhao, Zhenyu Xie, Xijin Zhang, Daniel K. Du, and 4 more

While significant progress has been made in garment transfer, one of the most applicable directions of human-centric image generation, existing works overlook in-the-wild imagery, presenting severe garment-person misalignment as well as noticeable degradation in fine texture details. This paper, therefore, attends to virtual try-on in real-world scenes and brings essential improvements in authenticity and naturalness, especially for loose garments (e.g., skirts, formal dresses), challenging poses (e.g., cross arms, bent...

10.1109/cvpr52688.2022.00347 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model

Zhenyu Xie, Yang Wu, Xuehao Gao, Zhongqian Sun, Wei Yang, and 1 more

Text-guided motion synthesis aims to generate 3D human motion that not only precisely reflects the textual description but also reveals the motion details as much as possible. Pioneering methods explore the diffusion model for text-to-motion synthesis and obtain significant superiority. However, these methods conduct diffusion processes either on the raw data distribution or on a low-dimensional latent space, and so typically suffer from modality inconsistency or scarcity of detail. To tackle this problem, we propose a novel Basic-to-Advanced Hierarchical...

10.1609/aaai.v38i6.28443 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation

Xuehao Gao, Yang Yang, Zhenyu Xie, Shaoyi Du, Zhongqian Sun, and 1 more

In this article, we propose a novel cascaded diffusion-based generative framework for text-driven human motion synthesis, which exploits a strategy named GradUally Enriching SyntheSis (GUESS as its abbreviation). The strategy sets up generation objectives by grouping body joints of detailed skeletons in close semantic proximity together and then replacing each such joint group with a single body-part node. Such an operation recursively abstracts a human pose to coarser and coarser skeletons at multiple granularity levels. With...

10.1109/tvcg.2024.3352002 article EN IEEE Transactions on Visualization and Computer Graphics 2024-01-15

ViTon-GUN: Person-to-Person Virtual Try-on via Garment Unwrapping

Nannan Zhang, Zhenyu Xie, Zhengwentai Sun, Hairui Zhu, Zirong Jin, and 3 more

The image-based Person-to-Person (P2P) virtual try-on, involving the direct transfer of garments from one person to another, is one of the most promising applications of human-centric image generation. However, existing approaches struggle to accurately learn the clothing deformation when directly warping the garment from the source pose onto the target pose. To address this, we propose Person-to-Person Virtual Try-on via Garment UNwrapping, a novel framework dubbed ViTon-GUN. Specifically, we divide the P2P task into two subtasks: Person-to-Garment (P2G)...

10.1109/tvcg.2025.3550776 article EN IEEE Transactions on Visualization and Computer Graphics 2025-01-01

WAS-VTON: Warping Architecture Search for Virtual Try-on Network

Zhenyu Xie, Xujie Zhang, Fuwei Zhao, Haoye Dong, Michael Kampffmeyer, and 2 more

Despite recent progress on image-based virtual try-on, current methods are constrained by shared warping networks and thus fail to synthesize natural try-on results when faced with clothing categories that require different warping operations. In this paper, we address this problem by finding category-specific warping networks for the virtual try-on task via Neural Architecture Search (NAS). We introduce a NAS-Warping Module and elaborately design a bilevel hierarchical search space to identify the optimal network-level and operation-level flow estimation...

10.1145/3474085.3475490 article EN Proceedings of the 29th ACM International Conference on Multimedia 2021-10-17

ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design

Xu‐Jie Zhang, Sha Yu, Michael Kampffmeyer, Zhenyu Xie, Zequn Jie, and 3 more

Cross-modal fashion image synthesis has emerged as one of the most promising directions in the generation domain due to the vast untapped potential of incorporating multiple modalities and the wide range of applications. To facilitate accurate generation, cross-modal synthesis methods typically rely on Contrastive Language-Image Pre-training (CLIP) to align textual and garment information. In this work, we argue that simply aligning texture and garment information is not sufficient to capture the semantics of the visual information and therefore propose MaskCLIP....

10.1145/3503161.3548230 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

Zhenyu Xie, Zaiyu Huang, Fuwei Zhao, Haoye Dong, Michael Kampffmeyer, and 1 more

Image-based virtual try-on is one of the most promising applications of human-centric image generation due to its tremendous real-world potential. Yet, as most try-on approaches fit in-shop garments onto a target person, they require the laborious and restrictive construction of a paired training dataset, severely limiting their scalability. While a few recent works attempt to transfer garments directly from one person to another, alleviating the need to collect paired datasets, their performance is impacted by the lack of paired (supervised) information. In particular,...

10.48550/arxiv.2111.10544 preprint EN other-oa arXiv (Cornell University) 2021-01-01

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models

Zhenyu Xie, Haoye Dong, Yuqian Gao, Z. Ma, Xiaodan Liang

10.1145/3664647.3681391 article EN 2024-10-26

Image Comes Dancing With Collaborative Parsing-Flow Video Synthesis

Bowen Wu, Zhenyu Xie, Xiaodan Liang, Yubei Xiao, Haoye Dong, and 1 more

Transferring human motion from a source to a target person poses great potential in computer vision and graphics applications. A crucial step is to manipulate sequential future motion while retaining the appearance characteristic. Previous work has either relied on crafted 3D human models or trained a separate model specifically for each target person, which is not scalable in practice. This work studies a more general setting, in which we aim to learn a single model to parsimoniously transfer motion from a source video to any target person given only one image of the person, named as Collaborative...

10.1109/tip.2021.3123549 article EN IEEE Transactions on Image Processing 2021-01-01

Characteristics of motorized spindle supported by active magnetic bearings

Zhenyu Xie, Kun Yu, Liantang Wen, Xiao Wang, Zhou Hong-kai

A motorized spindle supported by active magnetic bearings (AMBs) is generally used for ultra-high-speed machining. The iron loss of a radial AMB is very large owing to the high rotation speed, and it causes severe thermal deformation. The problem is particularly serious in high-power applications, such as the all-electric aero-engine. In this study, a prototype motorized spindle supported by five degree-of-freedom AMBs is developed. Homopolar and heteropolar AMBs are independently adopted as radial bearings. The influences of the two types of radial AMBs on the dynamic...

10.1016/j.cja.2014.10.031 article EN cc-by-nc-nd Chinese Journal of Aeronautics 2014-10-18

WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on

Xujie Zhang, Xiu Li, Michael Kampffmeyer, Xin Luna Dong, Zhenyu Xie, and 3 more

Image-based Virtual Try-On (VITON) aims to transfer an in-shop garment image onto a target person. While existing methods focus on warping the garment to fit the body pose, they often overlook the synthesis quality around the garment-skin boundary and realistic effects like wrinkles and shadows on the warped garments. These limitations greatly reduce the realism of the generated results and hinder the practical application of VITON techniques. Leveraging the notable success of diffusion-based models in cross-modal image synthesis, some recent diffusion-based methods have ventured...

10.48550/arxiv.2312.03667 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Algorithms for solving unconstrained optimization problems

Ping Kuang, Qin-Min Zhao, Zhenyu Xie

Computational methods for unconstrained optimization problems are an important research topic in the field of numerical computation, and solving such problems is of great significance. Many methods can be applied to these problems, so we need to choose one that is faster and less complex. To that end, this paper presents a comparative study of the common algorithms and of our approach, applied to some concrete problems.

10.1109/iccwamtip.2015.7494013 article EN 2015 12th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) 2015-12-01

Fashion Matrix: Editing Photos by Just Talking

Zheng Chong, Xujie Zhang, Fuwei Zhao, Zhenyu Xie, Xiaodan Liang

The utilization of Large Language Models (LLMs) for the construction of AI systems has garnered significant attention across diverse fields. The extension of LLMs to the domain of fashion holds substantial commercial potential but also inherent challenges due to the intricate semantic interactions in fashion-related generation. To address this issue, we developed a hierarchical AI system called Fashion Matrix dedicated to editing photos by just talking. This system facilitates diverse prompt-driven tasks, encompassing garment or...

10.48550/arxiv.2307.13240 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Integrating Appearance and Spatial-Temporal Information for Multi-Camera People Tracking

Wenjie Yang, Zhenyu Xie, Yaoming Wang, Yang Zhang, Xiao Ma, and 1 more

Multi-Camera People Tracking (MCPT) is a crucial task in intelligent surveillance systems. However, it presents significant challenges due to issues such as heavy occlusion and variations in appearance that arise from multiple camera perspectives and congested scenarios. In this paper, we propose an effective system that integrates both appearance and spatial-temporal information to address these problems, consisting of three specially designed modules: (1) A Multi-Object Tracking (MOT) method that minimizes ID-switch errors and generates...

10.1109/cvprw59228.2023.00554 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2023-06-01

GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation

Xuehao Gao, Yang Yang, Zhenyu Xie, Shaoyi Du, Zhongqian Sun, and 1 more

In this paper, we propose a novel cascaded diffusion-based generative framework for text-driven human motion synthesis, which exploits a strategy named GradUally Enriching SyntheSis (GUESS as its abbreviation). The strategy sets up generation objectives by grouping body joints of detailed skeletons in close semantic proximity together and then replacing each such joint group with a single body-part node. Such an operation recursively abstracts a human pose to coarser and coarser skeletons at multiple granularity levels. With...

10.48550/arxiv.2401.02142 preprint EN other-oa arXiv (Cornell University) 2024-01-01

A Robust Online Multi-Camera People Tracking System With Geometric Consistency and State-aware Re-ID Correction

Zhenyu Xie, Zelin Ni, Wenjie Yang, Yuang Zhang, Yihang Chen, and 2 more

10.1109/cvprw63382.2024.00694 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2024-06-17

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models

Zhenyu Xie, Haoye Dong, Yuqian Gao, Z. Ma, Xiaodan Liang

Image-based 3D Virtual Try-ON (VTON) aims to sculpt the 3D human according to person and clothes images, which is data-efficient (i.e., getting rid of expensive 3D data) but challenging. Recent text-to-3D methods achieve remarkable improvement in high-fidelity 3D human generation, demonstrating its potential for 3D virtual try-on. Inspired by the impressive success of personalized diffusion models (e.g., Dreambooth and LoRA) for 2D VTON, it is straightforward to achieve 3D VTON by integrating the personalization technique into the diffusion-based...

10.48550/arxiv.2407.16511 preprint EN arXiv (Cornell University) 2024-07-23