InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)
DOI: 10.48550/arxiv.2502.02215 Publication Date: 2025-02-04
ABSTRACT
Diffusion priors have been used for blind face restoration (BFR) by fine-tuning diffusion models (DMs) on restoration datasets to recover low-quality images. However, the naive application of DMs presents several key limitations: (i) the diffusion prior has inferior semantic consistency (e.g., identity, structure, and color), increasing the difficulty of optimizing the BFR model; (ii) the reliance on hundreds of denoising iterations prevents effective cooperation with perceptual losses, which is crucial for faithful restoration. Observing that the latent consistency model (LCM) learns consistency noise-to-data mappings on the ODE trajectory and therefore shows more semantic consistency in subject identity, structural information, and color preservation, we propose InterLCM to leverage the LCM for its superior semantic consistency and efficiency to counter the above issues. Treating low-quality images as the intermediate state of the LCM, InterLCM achieves a balance between fidelity and quality by starting from earlier LCM steps. The LCM also allows the integration of perceptual loss during training, leading to improved restoration quality, particularly in real-world scenarios. To mitigate structural and semantic uncertainties, InterLCM incorporates a Visual Module to extract visual features and a Spatial Encoder to capture spatial details, enhancing the fidelity of restored images. Extensive experiments demonstrate that InterLCM outperforms existing approaches on both synthetic and real-world datasets while also achieving faster inference speed.
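The idea of treating a low-quality image as an intermediate state can be illustrated with a toy sketch. This is not the authors' implementation: `toy_consistency_fn`, `restore_from_intermediate`, the step schedule, and the noise levels are all hypothetical stand-ins, and a real system would use a trained LCM operating on image latents. The sketch only shows the control flow of entering a multistep consistency sampler partway through the schedule instead of starting from pure noise.

```python
import math
import random


def toy_consistency_fn(x_t, t):
    """Hypothetical stand-in for a trained LCM.

    Maps a state at step t to a clean estimate by shrinking values
    toward 0.5; larger t means a stronger pull toward the toy prior.
    """
    w = t / (t + 1.0)
    return [(1.0 - w) * v + w * 0.5 for v in x_t]


def restore_from_intermediate(lq_latent, schedule, start_index, rng):
    """Run the remaining consistency steps from an intermediate state.

    lq_latent   : low-quality input, assumed to play the role of the
                  sampler state at schedule[start_index]
    schedule    : decreasing step values, e.g. [4, 3, 2, 1] (assumed)
    start_index : where to enter the schedule (0 would be the first step)
    rng         : random.Random instance used for re-noising
    """
    x = list(lq_latent)
    x0 = x
    for i in range(start_index, len(schedule)):
        # One network evaluation: jump from the current state to a clean estimate.
        x0 = toy_consistency_fn(x, schedule[i])
        if i + 1 < len(schedule):
            # Re-noise the estimate to the next (lower) noise level.
            sigma = schedule[i + 1] / schedule[0]  # toy noise level in (0, 1)
            alpha = math.sqrt(1.0 - sigma ** 2)
            x = [alpha * v + sigma * rng.gauss(0.0, 1.0) for v in x0]
    return x0


lq = [0.0, 1.0]
schedule = [4, 3, 2, 1]
late = restore_from_intermediate(lq, schedule, 3, random.Random(0))   # 1 evaluation
early = restore_from_intermediate(lq, schedule, 1, random.Random(0))  # 3 evaluations
```

In this toy setting, entering late in the schedule leaves the output close to the input, while entering earlier runs more steps and gives the prior more influence, which mirrors the fidelity versus quality trade-off described in the abstract.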