Journal of University of Chinese Academy of Sciences

Field Dynamic Small Object Detection Network based on Double Frame Fusion*

ZHAO Xiaohan1,2, ZHANG Zebin1, LI Baoqing1†

  1 Key Laboratory of Microsystem Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China;
  2 University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-11-02 Revised: 2023-02-08 Published: 2023-03-21
  • Corresponding author: Email: sinoiot@mail.sim.ac.c
  • Funding:
    *Supported by the Fund of the Key Laboratory of Microsystem Technology, Institute of Microsystem and Information Technology, Chinese Academy of Sciences (6142804220102)

Abstract: Detecting small moving objects in complex field environments remains a challenging problem for national defense and military applications: field surveillance sensing systems suffer from heavy background interference, small targets occupy few pixels, and relevant public datasets are lacking. To address this problem, this paper proposes YOLO-DFNet, a double-frame fusion object detection network built on YOLOv5. First, a double-frame fusion (D-F fusion) module processes the adjacent-frame features output by the backbone network, computing channel and temporal attention and then spatial attention to extract motion features. Second, a temporal trapezoidal fusion network based on attention mechanism (TTFN_AM) is placed between the neck network and the detection head; it attends to moving objects across receptive fields of different sizes, improving the detection of small objects with large displacement. Experiments on the field motion small object dataset (FMSOD) show that the mean average precision (mAP) of YOLO-DFNet over different IoU thresholds is 3.9 percentage points higher than that of YOLOv5, and that YOLO-DFNet also outperforms other object detection networks such as TPH-YOLOv5 and YOLOv7.
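
The order of operations described for the D-F fusion module (attention over channel and time first, spatial attention second) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `double_frame_fusion`, the parameter-free sigmoid gating, and the summation over the two frames are all assumptions made for illustration, standing in for whatever learned layers the actual module uses.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def double_frame_fusion(feat_prev, feat_curr):
    """Fuse two adjacent-frame feature maps, each of shape (C, H, W).

    Hypothetical, parameter-free stand-in for a learned module:
    step 1 reweights every (frame, channel) pair, step 2 reweights
    every spatial location of the fused map.
    """
    if feat_prev.shape != feat_curr.shape:
        raise ValueError("both frames must have the same (C, H, W) shape")

    # Stack along a new time axis: (T=2, C, H, W).
    stacked = np.stack([feat_prev, feat_curr], axis=0)

    # Step 1: channel-and-time attention.
    # Global average pooling over H and W gives one descriptor per
    # (frame, channel); a sigmoid turns it into a gate in (0, 1).
    descriptors = stacked.mean(axis=(2, 3))               # (2, C)
    ct_gate = sigmoid(descriptors)[:, :, None, None]      # (2, C, 1, 1)
    reweighted = stacked * ct_gate                        # (2, C, H, W)

    # Step 2: spatial attention.
    # Pool over time and channel to get one score per pixel.
    avg_map = reweighted.mean(axis=(0, 1))                # (H, W)
    max_map = reweighted.max(axis=(0, 1))                 # (H, W)
    spatial_gate = sigmoid(avg_map + max_map)             # (H, W)

    # Merge the two frames (assumed: plain sum) and apply the spatial gate.
    fused = reweighted.sum(axis=0) * spatial_gate[None, :, :]
    return fused                                          # (C, H, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prev_frame = rng.standard_normal((16, 20, 20))
    curr_frame = rng.standard_normal((16, 20, 20))
    print(double_frame_fusion(prev_frame, curr_frame).shape)
```

In a trained network the two gates would come from learned convolutions or fully connected layers rather than a bare sigmoid; the sketch only shows how the two attention stages compose on a pair of frames.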

Key words: object detection, field monitoring sensor network, dynamic small object, double-frame feature fusion, spatial-temporal attention
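
To make concrete why targets with few pixels are hard to score under IoU-based metrics, the following sketch computes the IoU of a predicted box that is offset by the same two pixels from a small and from a large ground-truth box. The boxes and the two-pixel offset are hypothetical, and the threshold sweep from 0.50 to 0.95 in steps of 0.05 is an assumption borrowed from the common COCO convention; the abstract only states that mAP is taken over different IoUs.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style sweep: 0.50, 0.55, ..., 0.95 (not stated in the source).
IOU_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]


def thresholds_passed(pred_box, true_box, thresholds=IOU_THRESHOLDS):
    """Count the thresholds at which the prediction counts as a match."""
    overlap = iou(pred_box, true_box)
    return sum(1 for t in thresholds if overlap >= t)


if __name__ == "__main__":
    shift = 2  # hypothetical localization error in pixels
    for side in (10, 100):
        truth = (0, 0, side, side)
        pred = (shift, shift, side + shift, side + shift)
        print(side, round(iou(pred, truth), 3), thresholds_passed(pred, truth))
```

With these hypothetical numbers, the 10-pixel box reaches an IoU of about 0.47 and matches at none of the assumed thresholds, while the 100-pixel box reaches about 0.92 and matches at nine of the ten. The same localization error therefore costs a small target far more under an mAP averaged over IoU thresholds.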

CLC number: