
Journal of University of Chinese Academy of Sciences ›› 2024, Vol. 41 ›› Issue (6): 810-820. DOI: 10.7523/j.ucas.2023.008

• Electronic Information and Computer Science •

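  • Corresponding author: LI Baoqing, E-mail: sinoiot@mail.sim.ac.cn
  • Funding:
    Supported by the Fund of the Key Laboratory of Microsystem Technology, Institute of Microsystem and Information Technology, Chinese Academy of Sciences (6142804220102)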

Field dynamic small object detection network based on double frame fusion

ZHAO Xiaohan1,2, ZHANG Zebin1, LI Baoqing1   

  1. Key Laboratory of Microsystem Technology, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China;
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-11-02 Revised: 2023-02-08 Published: 2023-03-21

Abstract: Detecting dynamic small objects in complex field environments remains a challenging problem for defense and military applications, owing to heavy background interference in field surveillance sensing systems, the small number of pixels occupied by small targets, and the lack of relevant public datasets. To address this problem, a YOLOv5-based object detection network with double-frame feature fusion (YOLO-DFNet) is proposed. First, a double-frame feature fusion module (D-F fusion) processes the adjacent-frame features output by the backbone network, successively computing attention along the channel, temporal, and spatial dimensions to extract motion features. Second, a temporal trapezoidal fusion network based on an attention mechanism (TTFN_AM) is placed between the neck network and the detection head to attend to dynamic objects within receptive fields of different sizes, thereby improving the detection of small objects with large displacement. Experimental results on the field motion small object dataset (FMSOD) show that the mean average precision (mAP) of YOLO-DFNet over different IoU thresholds is 3.9 percentage points higher than that of YOLOv5, and that YOLO-DFNet also outperforms other object detection networks such as TPH-YOLOv5 and YOLOv7.
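
To illustrate why evaluating over several IoU thresholds is demanding for small targets, the following is a minimal sketch, not the authors' evaluation code. It assumes COCO-style thresholds from 0.50 to 0.95 in steps of 0.05, which the abstract does not specify; the function names `iou`, `iou_thresholds`, and `thresholds_passed` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds(start=50, stop=95, step=5):
    """Assumed COCO-style thresholds 0.50, 0.55, ..., 0.95 (an assumption)."""
    return [t / 100 for t in range(start, stop + 1, step)]


def thresholds_passed(pred, truth):
    """Thresholds at which `pred` would count as a match for `truth`."""
    score = iou(pred, truth)
    return [t for t in iou_thresholds() if score >= t]


# The same 2-pixel localization error applied to a small and a large target.
small_truth, small_pred = (10, 10, 20, 20), (12, 10, 22, 20)
large_truth, large_pred = (10, 10, 110, 110), (12, 10, 112, 110)

print(round(iou(small_pred, small_truth), 3), len(thresholds_passed(small_pred, small_truth)))
print(round(iou(large_pred, large_truth), 3), len(thresholds_passed(large_pred, large_truth)))
```

Under these assumptions, a 2-pixel offset leaves the 10×10 target matched at only 4 of the 10 thresholds, while the 100×100 target still matches at all 10, which is one reason small-object localization weighs heavily in a threshold-averaged metric.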

Key words: object detection, field monitoring sensor network, dynamic small object, double-frame feature fusion, spatial-temporal attention
