Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics
TOPICS: Audio visual; Modality (human–computer interaction); Representation
DOI:
10.48550/arxiv.2401.13270
Publication Date:
2024-01-01
AUTHORS (7)
ABSTRACT
Automatic image colorization is inherently an ill-posed problem with uncertainty, which requires an accurate semantic understanding of scenes to estimate reasonable colors for grayscale images. Although recent interaction-based methods have achieved impressive performance, it is still a very difficult task to infer realistic and reasonable colors automatically. To reduce the difficulty of semantic understanding of scenes, this paper tries to utilize the corresponding audio, which naturally contains extra semantic information about the same scene. Specifically, a novel audio-infused automatic image colorization (AIAIC) network is proposed, which consists of three stages. First, we take color image semantics as a bridge and pretrain a colorization network guided by color image semantics. Second, the natural co-occurrence of audio and video is utilized to learn the correlation between audio and visual scenes. Third, the implicit audio semantic representation is fed into the pretrained network to finally realize audio-guided colorization. The whole process is trained in a self-supervised manner without human annotation. In addition, an audiovisual colorization dataset is established for training and testing. Experiments demonstrate that audio guidance can effectively improve the performance of automatic colorization, especially for some scenes that are difficult to understand only from the visual modality.
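The three-stage pipeline described in the abstract can be illustrated with a toy numerical sketch. This is not the authors' implementation: the dimensions, the single linear "colorizer", and the least-squares audio-to-visual projection are all simplified stand-ins for the real networks and correlation learning, chosen only to make the data flow of the three stages concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper).
D_SEM = 16   # shared semantic embedding size
D_AUD = 32   # raw audio feature size
H, W = 8, 8  # toy image resolution

def colorize(gray, sem, W_c):
    """Stage-1 colorizer: predict the two chroma (ab) channels of a
    grayscale image conditioned on a semantic embedding. A single
    linear map stands in for the pretrained colorization network."""
    x = np.concatenate([gray.ravel(), sem])        # fuse image + semantics
    return (W_c @ x).reshape(2, H, W)              # ab-channel prediction

# Stage 2: learn a projection mapping audio features into the visual
# semantic space, exploiting audio-video co-occurrence (least squares
# stands in for the paper's self-supervised correlation learning).
A = rng.normal(size=(100, D_AUD))                  # co-occurring audio feats
V = rng.normal(size=(100, D_SEM))                  # paired visual semantics
W_a, *_ = np.linalg.lstsq(A, V, rcond=None)        # audio -> semantic map

# Stage 3: at test time, the grayscale image plus the *audio-derived*
# semantic embedding drive the pretrained colorizer.
W_c = rng.normal(size=(2 * H * W, H * W + D_SEM))  # "pretrained" weights
gray = rng.normal(size=(H, W))
audio_feat = rng.normal(size=D_AUD)
ab = colorize(gray, audio_feat @ W_a, W_c)
print(ab.shape)  # (2, 8, 8)
```

The key structural point the sketch preserves is that color-image semantics (stage 1) and audio-derived semantics (stage 3) share one embedding space, so the colorizer pretrained on the former can be driven by the latter without retraining.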