
Journal of University of Chinese Academy of Sciences ›› 2018, Vol. 35 ›› Issue (4): 544-549. DOI: 10.7523/j.issn.2095-6134.2018.04.018

• Computer Science •

A spatio-temporal-fused no-reference video quality assessment method based on convolutional neural network

WANG Chunfeng1, SU Li1,2, HUANG Qingming1,2

  1. Key Laboratory of Big Data Mining and Knowledge Management of CAS, University of Chinese Academy of Sciences, Beijing 100049, China;
    2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2017-03-31  Revised: 2017-04-25  Published: 2018-07-15
  • Corresponding author: SU Li
  • Supported by the National Natural Science Foundation of China (61650202, 61472389, 61332016, U1636214)

Spatio-temporal-fused no-reference video quality assessment based on convolutional neural network

WANG Chunfeng1, SU Li1,2, HUANG Qingming1,2   

  1. Key Laboratory of Big Data Mining and Knowledge Management of CAS, University of Chinese Academy of Sciences, Beijing 100049, China;
    2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2017-03-31  Revised: 2017-04-25  Published: 2018-07-15

Abstract: No-reference video quality assessment directly estimates the quality of any given video without access to the original distortion-free reference. Most traditional no-reference methods are based on statistical analysis, are designed for specific types of video distortion, and pay little attention to temporal information, so existing statistics-based methods have a limited range of application and poor real-time performance. We propose a no-reference video quality assessment method based on a convolutional neural network that fuses the spatial and temporal information of a video and is not designed for specific distortion types. The method handles the spatial and temporal domains separately: in the spatial domain, a convolutional neural network learns spatial distortion features; in the temporal domain, a set of features based on the structural similarity of corresponding blocks in adjacent frames characterizes temporal distortion. Finally, the spatio-temporal features are fused and fed into a linear regression model to predict video quality. Experiments show that the proposed method matches mainstream video quality assessment methods on multiple metrics while running much faster, indicating good prospects for real-time application.
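To make the temporal part of the pipeline concrete, the following is a minimal sketch of features built from the structural similarity of corresponding blocks in adjacent frames. The block size, SSIM constants, and the pooled statistics are illustrative assumptions, not the authors' exact design.

```python
# Illustrative sketch (not the paper's exact feature set): temporal descriptor
# from block-wise SSIM between adjacent frames of a gray-scale video.
import numpy as np

def block_ssim(a, b, c1=6.5025, c2=58.5225):
    """SSIM of two equal-sized gray blocks (constants follow the usual 8-bit defaults)."""
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def temporal_features(frames, block=32):
    """frames: array of gray frames (T, H, W). Returns pooled statistics of
    block-wise SSIM between every pair of adjacent frames."""
    per_pair = []
    for prev, curr in zip(frames[:-1], frames[1:]):
        h, w = prev.shape
        scores = [block_ssim(prev[i:i + block, j:j + block].astype(np.float64),
                             curr[i:i + block, j:j + block].astype(np.float64))
                  for i in range(0, h - block + 1, block)
                  for j in range(0, w - block + 1, block)]
        per_pair.append(scores)
    per_pair = np.asarray(per_pair)                      # (num_pairs, num_blocks)
    return np.array([per_pair.mean(), per_pair.std(),    # pool over the whole video
                     per_pair.min(), per_pair.mean(axis=1).std()])

video = np.random.randint(0, 256, size=(10, 128, 128))   # placeholder gray video
print(temporal_features(video))                           # fixed-length temporal descriptor
```

Pooling the block-wise scores over time keeps the descriptor length fixed regardless of video length, so it can later be concatenated with the spatial CNN features.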

Key words: video quality assessment, convolutional neural network, no-reference, spatio-temporal information

Abstract: No-reference video quality assessment (NR-VQA) quantitatively measures the quality of distorted videos without access to the original distortion-free references. Most conventional NR-VQA methods are based on statistical analysis; the majority are designed for specific types of distortions or largely ignore temporal information, which limits both their application scenarios and their speed. In this paper, we propose a spatio-temporal no-reference video quality assessment method based on a convolutional neural network that is not designed for specific types of distortions. The method is divided into spatial and temporal processes: in the spatial domain, a convolutional neural network learns the distortion features of frames; in the temporal domain, a group of SSIM-like features computed on blocks of adjacent frames is exploited. Finally, a linear regression model is trained on the fused spatio-temporal features to predict video quality. Experiments demonstrate that the proposed method matches other state-of-the-art no-reference VQA methods in performance. Furthermore, it runs much faster than other VQA methods, which gives it better prospects for practical application.
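As a rough illustration of the fusion and regression stage described above, the sketch below concatenates per-video spatial (CNN-derived) and temporal feature vectors and fits a linear regression to subjective scores. The feature dimensions and the random placeholder data are assumptions for demonstration only; in practice the features would come from the spatial CNN and a temporal descriptor such as the one sketched earlier.

```python
# Minimal sketch of spatio-temporal fusion followed by linear regression.
# Placeholder sizes and random data stand in for the real features and MOS labels.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_videos, d_spatial, d_temporal = 80, 40, 4           # assumed feature dimensions
spatial = rng.normal(size=(n_videos, d_spatial))      # stand-in for CNN spatial features
temporal = rng.normal(size=(n_videos, d_temporal))    # stand-in for temporal SSIM-like features
mos = rng.uniform(0, 100, size=n_videos)              # stand-in subjective quality scores

X = np.concatenate([spatial, temporal], axis=1)       # spatio-temporal fusion
reg = LinearRegression().fit(X, mos)                  # linear quality predictor
print(reg.predict(X[:3]))                             # predicted quality scores
```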

Key words: video quality assessment, convolutional neural network, no-reference, spatio-temporal information
