[1] Brendel W, Todorovic S. Learning spatiotemporal graphs of human activities[C]//International Conference on Computer Vision. Barcelona:IEEE, 2011:778-785. [2] Zheng D, Xiong H, Zheng Y F, et al. A structured learning-based graph matching for dynamic multiple object tracking[C]//International Conference on Image Processing. Brussels:IEEE, 2011:2333-2336. [3] Chu L, Jiang S, Wang S, et al. Robust spatial consistency graph model for partial duplicate image retrieval[J]. IEEE Transactions on Multimedia, 2013, 15(8):1982-1996. [4] Fischler M A, Bolles R C. Random sample consensus:a paradigm for model fitting with applications to image analysis and automated cartography[J]. Communications of the ACM, 1981, 24(6):381-395 [5] Bolles R C. Robust feature matching through maximal cliques[J]. Proceedings of SPIE-The International Society for Optical Engineering, 1979, 182:140-149. [6] Rumelhart D E, Hinton G E, Williams R J, et al. Learning representations by back-propagating errors[J]. Nature, 1988, 323(6088):696-699. [7] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8):1735-1780. [8] Sutskever I, Vinyals O, Le Q V, et al. Sequence to sequence learning with neural networks[C]//Neural Information Processing Systems. Montreal:2014:3104-3112. [9] Vinyals O, Kaiser L, Koo T, et al. Grammar as a foreign language[C]//Neural Information Processing Systems. Montreal:2015:2773-2781. [10] Vinyals O, Toshev A, Bengio S, et al. Show and tell:a neural image caption generator[C]//Computer Vision and Pattern Recognition. Boston:IEEE, 2015:3156-3164. [11] Donahue J, Hendricks L A, Guadarrama S, et al. Long-term recurrent convolutional networks for visual recognition and description[C]//Computer Vision and Pattern Recognition. Boston:IEEE, 2015:2625-2634. [12] Vinyals O, Fortunato M, Jaitly N, et al. Pointer networks[C]//Neural Information Processing Systems. Montreal:2015:2692-2700. [13] Milan A, Rezatofighi S H, Garg R, et al. Data-driven approximations to NP-hard problems[C]//Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence. San Francisco:AAAI Press, 2017:1453-1459. [14] Bello I, Pham H, Le Q V, et al. Neural combinatorial optimization with reinforcement learning[J]. arXiv preprint arXiv:1611.09940, 2016. [15] Cho K, Van Merrienboer B, Gulcehre C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Empirical Methods in Natural Language Processing. Doha:Association for Computational Linguistics, 2014:1724-1734. [16] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014. [17] Zhang M, Zhou Z. A review on multi-label learning algorithms[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 26(8):1819-1837. [18] Tsoumakas G, Katakis I. Multi-label classification:an overview[J]. International Journal of Data Warehousing and Mining, 2007, 3(3):1-13. [19] Liu I, Ramakrishnan B. Bach in 2014:music composition with recurrent neural network[J]. arXiv preprint arXiv:1412.3191, 2014. [20] Yeung S, Russakovsky O, Jin N, et al. Every moment counts:dense detailed labeling of actions in complex videos[J]. International Journal of Computer Vision, 2018, 126(2-4):375-389. [21] Lipton Z C, Kale D C, Elkan C, et al. Learning to diagnose with LSTM recurrent neural networks[J]. arXiv preprint arXiv:1511.03677, 2015. [22] Schuster M, Paliwal K K. Bidirectional recurrent neural networks[J]. IEEE Transactions on Signal Processing, 1997, 45(11):2673-2681. [23] Vinyals O, Bengio S, Kudlur M. Order matters:sequence to sequence for sets[J]. arXiv preprint arXiv:1511.06391, 2015. [24] Abadi M, Barham P, Chen J, et al. Tensorflow:a system for large-scale machine learning[C]//Operating Systems Design and Implementation. Savannah:USENIX, 2016, 16:265-283. |