
Journal of University of Chinese Academy of Sciences ›› 2022, Vol. 39 ›› Issue (6): 836-844. DOI: 10.7523/j.ucas.2021.0002

• Electronic Information and Computer Science •

Visual object tracking based on multiple experts and MDNet

ZHANG Zhiming, LI Guorong, HUANG Qingming

  1. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-12-09 Revised: 2021-01-03 Published: 2021-05-31
  • Corresponding author: LI Guorong, E-mail: liguorong@ucas.ac.cn
  • Supported by:
    National Natural Science Foundation of China (61931008, 61772494)

Abstract: In recent years, with the continuing development of deep learning, deep-learning-based visual object tracking algorithms have achieved considerable success. However, the background, the illumination, and the appearance of the target change constantly throughout a video, and occlusion often occurs, which makes visual object tracking difficult. Most traditional methods address this by updating the tracker online to adapt to these changes. Because video content is complex and variable, a single tracker that is updated and maintained online struggles to cope with the data in subsequent frames, and errors easily accumulate. To solve this problem, we build on the existing tracker MDNet and propose a tracking method based on multiple expert trackers. First, MDNet is used to learn features shared by the targets across all videos, so that the learned features describe the target well. Then, during tracking, multiple expert trackers are constructed dynamically according to the tracking results to make the tracker more robust. Finally, the best expert tracker is selected according to each expert's evaluation function and is used to track the target in the current frame. Experiments show that the proposed method tracks effectively on 25 videos with abrupt changes and, compared with MDNet, significantly improves tracking performance.

Key words: visual object tracking, multiple experts, multiple decisions fusion, MDNet
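
The expert-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Expert` class, the `add_expert` and `select_best` helpers, the pool size limit, and the drop-the-oldest policy are all hypothetical, and the abstract does not specify the evaluation function, so it is passed in as an arbitrary callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Expert:
    """Hypothetical expert: a snapshot of the tracker taken during tracking."""
    name: str
    predict: Callable[[object], Box]          # frame -> proposed bounding box
    evaluate: Callable[[object, Box], float]  # (frame, box) -> score, higher is better


def add_expert(pool: List[Expert], expert: Expert, max_experts: int = 4) -> None:
    """Add a new expert; drop the oldest one when the pool is full (assumed policy)."""
    pool.append(expert)
    if len(pool) > max_experts:
        pool.pop(0)


def select_best(pool: List[Expert], frame: object) -> Tuple[Expert, Box, float]:
    """Let every expert propose a box and return the highest-scoring proposal."""
    if not pool:
        raise ValueError("expert pool is empty")
    candidates = []
    for expert in pool:
        box = expert.predict(frame)
        candidates.append((expert, box, expert.evaluate(frame, box)))
    return max(candidates, key=lambda c: c[2])
```

In a tracking loop under these assumptions, `add_expert` would be called whenever a new snapshot of the tracker is taken, and `select_best` would be called once per frame to decide which expert's box to report.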
