Point matching algorithm based on machine learning method

Journal of University of Chinese Academy of Sciences, 2020, Vol. 37, Issue (4): 450-457. DOI: 10.7523/j.issn.2095-6134.2020.04.003

TANG Siqi, HAN Congying, GUO Tiande

School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Big Data Mining and Knowledge Management of Chinese Academy of Sciences, Beijing 100190, China

Received: 2018-11-24 Revised: 2019-01-15 Published: 2020-07-15
Corresponding author: HAN Congying
Supported by: National Natural Science Foundation of China (11731013, 11331012, 11571014)

Abstract

Point matching is an important problem in computer vision and pattern recognition, with wide application in target recognition, medical image registration, pose estimation, and related tasks. In this study, we propose a novel end-to-end model based on machine learning, the multi-pointer network (MPN), to solve it. MPN draws on the idea of multi-label classification to improve the pointer network: whereas the earlier model outputs a single member of the input sequence, MPN selects a set of input elements as its output. We first cast point matching as a sequence problem, so that the network takes the sequence of point coordinates as input and directly outputs the correspondences between point pairs. In this way, the method can handle translations relative to the whole space as well as other large rigid transformations. Furthermore, experimental results show that the model can be generalized to other structured combinatorial optimization problems in which the output is a subset of the input, such as Delaunay triangulation.

Key words: multi-pointer network, point matching, recurrent neural network (RNN), long short-term memory (LSTM) network, multi-label classification
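
The contrast the abstract draws between selecting one input element and selecting a set of input elements can be illustrated with a minimal sketch. It is not the authors' architecture: the dot-product scoring, the use of raw coordinates as embeddings, the `query` vector, and the 0.5 threshold are all assumptions made for illustration. The single-pointer variant normalizes scores with a softmax and returns one index, while the multi-label variant scores each input position independently with a sigmoid and returns every index that passes the threshold.

```python
import math


def pointer_scores(query, keys):
    """Dot-product score between a query vector and each input embedding.

    Assumption: a plain dot product stands in for a learned attention score.
    """
    return [sum(q * k for q, k in zip(query, key)) for key in keys]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def single_pointer(query, keys):
    """Single-choice output: the one input index with the highest softmax probability."""
    probs = softmax(pointer_scores(query, keys))
    return max(range(len(probs)), key=probs.__getitem__)


def multi_pointer(query, keys, threshold=0.5):
    """Multi-label output: all input indices whose independent sigmoid score passes the threshold."""
    scores = pointer_scores(query, keys)
    return [i for i, s in enumerate(scores) if sigmoid(s) >= threshold]


# Toy input sequence: four 2-D point coordinates used directly as embeddings.
points = [(1.0, 0.0), (0.9, 0.1), (-1.0, 0.0), (0.0, -1.0)]
query = (2.0, 0.5)  # hypothetical decoder state, chosen by hand

print(single_pointer(query, points))  # one index
print(multi_pointer(query, points))   # a list of indices
```

In a trained model the scores would come from learned encoder and decoder states rather than hand-picked vectors. Only the output rule changes between the two variants, which is the point of the sketch.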

CLC number: TP391.41

TANG Siqi, HAN Congying, GUO Tiande. Point matching algorithm based on machine learning method[J]. Journal of University of Chinese Academy of Sciences, 2020, 37(4): 450-457.
