
Journal of University of Chinese Academy of Sciences



Multi-scale semantic prior features guided street-view image inpainting algorithm

ZENG Jianshun1,3, LV Yanjie2, QIN Yuchu2†   

  1 Aerospace Information Research Institute, Chinese Academy of Sciences, Key Laboratory of Digital Earth Science, Beijing 100094, China;
    2 International Research Center of Big Data for Sustainable Development Goals, Beijing 100094, China;
    3 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-09-21  Revised: 2023-11-30  Published: 2023-12-12
  • Corresponding author: E-mail: qinyc@aircas.ac.cn
  • Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Category A), "Big Earth Data Science Engineering Project" (XDA19030102)


Abstract: Urban street-view imagery, as a crucial form of spatial data, has a wide range of applications in mapping services, urban 3D reconstruction, and cartography. However, collected street-view images often suffer from occlusion by distracting objects and from privacy concerns, and therefore require careful preprocessing. To address these problems, we propose a street-view image inpainting algorithm guided by multi-scale semantic priors for generating more realistic and natural static street-view images. First, a semantic prior network is designed to learn multi-scale semantic priors for the missing regions of the input image, enhancing contextual information. A semantically enhanced generation network then adaptively fuses the multi-scale semantic priors with image features through a prior transfer module, and a multi-level attention transfer mechanism is introduced to refine image texture. Finally, a Markov discriminator is adopted to distinguish generated images from real images through adversarial training, making the reconstructed street-view images more realistic. Experiments on the Apolloscape dataset demonstrate that the proposed algorithm achieves significant improvements in semantic structural coherence and texture detail, addressing privacy issues in street-view imagery while providing a more reliable data foundation for realistic city applications.

Key words: Street view image, image inpainting, realization of urban complex environment, generative adversarial network, deep learning, moving object removal

CLC number:
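To make the pipeline described in the abstract more concrete, the following PyTorch sketch illustrates two of its building blocks under stated assumptions: a gated fusion layer as one plausible reading of the prior transfer module that blends semantic prior features with image features, and a Markovian (PatchGAN-style) discriminator that scores local patches during adversarial training. This is a minimal illustration, not the paper's implementation; the class names, layer choices, and hyperparameters are hypothetical.

```python
# Minimal, hypothetical sketch (PyTorch) -- not the paper's implementation.
import torch
import torch.nn as nn


class PriorTransferBlock(nn.Module):
    """One plausible form of 'adaptive fusion' of semantic prior features
    with image features: a learned per-pixel gate blends the two maps."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, img_feat: torch.Tensor, prior_feat: torch.Tensor) -> torch.Tensor:
        # Gate g in [0, 1] decides, per pixel and channel, how much prior to inject.
        g = self.gate(torch.cat([img_feat, prior_feat], dim=1))
        return g * prior_feat + (1.0 - g) * img_feat


class MarkovDiscriminator(nn.Module):
    """PatchGAN-style (Markovian) discriminator: instead of a single
    real/fake score, it outputs a grid of scores, one per local patch."""

    def __init__(self, in_channels: int = 3, base: int = 64):
        super().__init__()
        layers, ch = [], in_channels
        for out in (base, base * 2, base * 4):
            layers += [
                nn.Conv2d(ch, out, kernel_size=4, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
            ]
            ch = out
        layers.append(nn.Conv2d(ch, 1, kernel_size=4, stride=1, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # shape: (N, 1, H', W') patch score map


if __name__ == "__main__":
    # Toy tensors standing in for one feature scale and one RGB street-view image.
    img_feat = torch.randn(1, 64, 64, 64)
    prior_feat = torch.randn(1, 64, 64, 64)
    fused = PriorTransferBlock(64)(img_feat, prior_feat)
    scores = MarkovDiscriminator()(torch.randn(1, 3, 256, 256))
    print(fused.shape, scores.shape)  # (1, 64, 64, 64) and (1, 1, 31, 31)
```

In a full model, fusion of this kind would be applied at several feature scales (hence "multi-scale" priors), and the patch score map would drive the adversarial loss that encourages locally realistic texture in the inpainted regions.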