
Journal of University of Chinese Academy of Sciences ›› 2024, Vol. 41 ›› Issue (5): 705-714. DOI: 10.7523/j.ucas.2022.075

• Electronic Information and Computer Science •

Seamless image completion via GAN inversion

YU Yongsheng, LUO Tiejian

  1. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China
  • Received: 2022-04-11 Revised: 2022-06-29 Published: 2022-06-30
  • Corresponding author: LUO Tiejian, E-mail: tjluo@ucas.ac.cn
  • Funding:
    Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (E0421104)

Abstract: Image completion, widely used for unwanted object removal and media editing, aims to recover corrupted images in a smooth, semantically consistent way. Methods based on generative adversarial network (GAN) inversion use a pre-trained GAN model as an effective prior to fill the missing regions with photo-realistic textures. However, existing GAN inversion methods overlook the fact that image completion is a generative task with hard constraints, so the stitched images show noticeable color and semantic discontinuities. To address this problem, this paper designs a novel bi-directional perceptual generator and a pre-modulation network to fill in images seamlessly. The bi-directional perceptual generator makes full use of the extended latent space to help the model perceive the non-missing regions of the input image at the level of data representations. The pre-modulation network uses a multiscale structure to provide more discriminative semantics for the style vectors. Experiments on the Places2 and CelebA-HQ datasets show that the proposed method bridges GAN inversion and image completion and outperforms current mainstream algorithms, reducing FID by up to 49.2%.

Key words: image completion, generative adversarial network, GAN inversion, deep learning, unwanted object removal
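The "hard constraint" mentioned in the abstract can be illustrated with a small sketch. It is not the authors' generator or pre-modulation network. It only shows the usual stitching step, in which known pixels are copied unchanged and only the hole is taken from a generator output, and why a color mismatch in the generated content shows up as a visible seam. The function names `composite` and `seam_gap`, the mask convention (1 = known pixel, 0 = hole), and the toy gray images are all assumptions made for illustration.

```python
import numpy as np

def composite(known, generated, mask):
    """Paste generated pixels into the hole while keeping known pixels fixed.

    mask is 1.0 where the input pixel is known and 0.0 inside the hole
    (this convention is an assumption of the sketch).
    """
    mask = mask[..., None]  # broadcast over the channel axis
    return mask * known + (1.0 - mask) * generated

def seam_gap(image, mask):
    """Mean absolute color difference across horizontal known/hole boundaries."""
    # Horizontally adjacent pixel pairs where the mask value changes.
    boundary = mask[:, 1:] != mask[:, :-1]
    diff = np.abs(image[:, 1:] - image[:, :-1]).mean(axis=-1)
    return float(diff[boundary].mean()) if boundary.any() else 0.0

# Toy data: a uniform gray "photo" with a square hole in the middle.
h = w = 8
known = np.full((h, w, 3), 0.5)
mask = np.ones((h, w))
mask[2:6, 2:6] = 0.0

# Hypothetical generator outputs: one with a color shift, one that matches.
shifted = np.full((h, w, 3), 0.7)
matched = np.full((h, w, 3), 0.5)

out_shifted = composite(known, shifted, mask)
out_matched = composite(known, matched, mask)

print("seam gap with color shift:", seam_gap(out_shifted, mask))
print("seam gap with matched colors:", seam_gap(out_matched, mask))
```

In this toy setting the stitched result inherits any color offset of the generated content as a jump at the hole boundary, which is the kind of discontinuity the abstract attributes to inversion methods that do not account for the known region.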

CLC number: