Journal of University of Chinese Academy of Sciences

A method to extract forest cover information by fusing Transformer and UNet*

LIAO Lingcen1,2, LIU Wei1†, LIU Shibin1

  1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China;
  2. College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-03-20 Revised: 2023-05-06 Published: 2023-06-12
  • Corresponding author (†): E-mail: liuwei202614@aircas.ac.cn
  • Funding:
    *Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences, Category A (XDA19010401) and the National Key Research and Development Program of China, Key Special Project for Intergovernmental and Hong Kong, Macao and Taiwan Cooperation (2018YFE0100100)

Abstract: Forest cover information extraction is an essential task in forest remote sensing and matters for forest resource management, ecological protection, and climate change research. Methods based on convolutional neural networks extract local features effectively but struggle to capture long-range dependencies and global context. To address this, we propose DiUNet, a forest cover extraction method that fuses Transformer and UNet. DiUNet embeds Transformer modules into the UNet network to strengthen its perception of long-range dependencies and global context. Because forest cover is fragmented, irregular in shape, and variable in scale, the method also adds positional information through relative position encoding, which improves the model's ability to capture spatial information and lets it capture features at different levels and scales. We constructed a forest cover dataset from Landsat 8 imagery and the CDL data layer and carried out detailed experiments on it. In the comparative experiments, DiUNet achieved the best precision, recall, F1 score, intersection-over-union (IoU), and frequency-weighted intersection-over-union (FWIoU), at 91.22%, 92.66%, 91.94%, 85.08%, and 81.65%, respectively. It also performed well in the generalization experiments. These results indicate that DiUNet outperforms the compared methods in forest cover information extraction and shows good robustness and generalization.

Key words: semantic segmentation, UNet, Transformer, forest cover information
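
The five metrics reported in the abstract can be illustrated with a small sketch. The abstract does not state the exact formulas used in the paper, so the code below assumes the common definitions for a binary forest / non-forest mask: precision, recall, and F1 for the forest class, IoU for the forest class, and FWIoU as the per-class IoU weighted by each class's share of ground-truth pixels. The names `BinaryCounts`, `count_pixels`, and `segmentation_metrics` are hypothetical helpers written for this illustration, not code from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix cells for a forest (1) / non-forest (0) mask."""
    tp: int  # forest pixels predicted as forest
    fp: int  # non-forest pixels predicted as forest
    fn: int  # forest pixels predicted as non-forest
    tn: int  # non-forest pixels predicted as non-forest


def count_pixels(pred, truth):
    """Tally confusion-matrix cells from two flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return BinaryCounts(tp, fp, fn, tn)


def _ratio(num, den):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def segmentation_metrics(c):
    """Compute the five metrics under the assumed standard definitions."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    iou_forest = _ratio(c.tp, c.tp + c.fp + c.fn)
    iou_background = _ratio(c.tn, c.tn + c.fp + c.fn)
    # FWIoU: weight each class IoU by that class's share of ground-truth pixels.
    fwiou = (_ratio(c.tp + c.fn, total) * iou_forest
             + _ratio(c.tn + c.fp, total) * iou_background)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "iou": iou_forest,
        "fwiou": fwiou,
    }


if __name__ == "__main__":
    # Toy 8-pixel example with made-up labels, for illustration only.
    pred = [1, 1, 1, 0, 0, 0, 1, 0]
    truth = [1, 1, 0, 0, 0, 1, 1, 0]
    for name, value in segmentation_metrics(count_pixels(pred, truth)).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, the F1 score and IoU of a single class are linked by IoU = F1 / (2 − F1). The reported F1 of 91.94% and IoU of 85.08% fit that relation, and the reported F1 is close to the harmonic mean of the reported precision and recall.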
