StyTr$^2$: Image Style Transfer with Transformers

DOI: 10.48550/arxiv.2105.14576 Publication Date: 2021-01-01
ABSTRACT
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content. Owing to the locality in convolutional neural networks (CNNs), extracting and maintaining the global information of input images is difficult. Therefore, traditional neural style transfer methods face biased content representation. To address this critical issue, we take long-range dependencies of input images into account for image style transfer by proposing a transformer-based approach called StyTr$^2$. In contrast with visual transformers for other vision tasks, StyTr$^2$ contains two different transformer encoders to generate domain-specific sequences for content and style, respectively. Following the encoders, a multi-layer transformer decoder is adopted to stylize the content sequence according to the style sequence. We also analyze the deficiency of existing positional encoding methods and propose the content-aware positional encoding (CAPE), which is scale-invariant and more suitable for image style transfer tasks. Qualitative and quantitative experiments demonstrate the effectiveness of the proposed StyTr$^2$ compared with state-of-the-art CNN-based and flow-based approaches. Code and models are available at https://github.com/diyiiyiii/StyTR-2.
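The scale-invariance claim for the positional encoding can be illustrated with a small sketch. This is not the authors' CAPE (which conditions the encoding on content features; see the paper and repository for the actual method), but a hypothetical NumPy toy showing the underlying idea: compute the encoding on a fixed-size grid and resample it to the input's token grid, so that resizing the image does not change the encoding assigned to corresponding spatial locations. The grid size of 18 and the `content_aware_pe` name are illustrative assumptions.

```python
import numpy as np

def sinusoidal_pe(n_pos, d_model):
    """Standard fixed sinusoidal positional encoding (Vaswani et al., 2017)."""
    pos = np.arange(n_pos)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((n_pos, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # sine on even dimensions
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # cosine on odd dimensions
    return pe

def content_aware_pe(h, w, d_model, grid=18):
    """Illustrative scale-invariant encoding (NOT the paper's CAPE):
    build the encoding on a fixed grid x grid lattice, then resample it
    (nearest neighbour) to the actual h x w token grid, so the same
    image at a different resolution gets a consistent encoding."""
    base = sinusoidal_pe(grid * grid, d_model).reshape(grid, grid, d_model)
    rows = np.arange(h) * grid // h   # map each token row to a base row
    cols = np.arange(w) * grid // w   # map each token column to a base column
    return base[rows][:, cols].reshape(h * w, d_model)
```

With a plain per-token sinusoidal encoding, doubling the input resolution shifts which encoding each image region receives; with the resampled scheme above, corresponding regions keep matching encodings regardless of scale.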