
Journal of University of Chinese Academy of Sciences ›› 2022, Vol. 39 ›› Issue (3): 360-368. DOI: 10.7523/j.ucas.2020.0019

• Research Articles •

Name disambiguation based on encoding attributes and graph topology

MA Yingying1,2,3, WU Youlong1, TANG Hua1,2,3   

  1 School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China;
  2 Shanghai Institute of Microsystem & Information Technology, Chinese Academy of Sciences, Shanghai 200050, China;
  3 University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-02-17 Revised: 2020-04-03

Abstract: To address the problem of author name ambiguity, we propose a novel name disambiguation method based on encoding attributes and graph topology. A word2vec model encodes the attributes of each document to construct a document representation vector. A graph auto-encoder then encodes the relationships among documents into the document embedding vectors, so that similar documents are aggregated. To further improve the accuracy of the clustering results, a graph embedding model afterward introduces the topology of the document-document network and the author-author network into the document vectors, which moves related papers closer together. The method uses document attributes and relationship networks at the same time, learns document representation vectors with an unsupervised model, and improves the performance of name disambiguation. Experimental results on the real-world author dataset AMiner show that our method outperforms several state-of-the-art graph-based solutions.

Key words: name disambiguation, graph neural network, clustering method, feature extraction, graph embedding
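
The pipeline described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' model and rests on several assumptions: the toy `WORD_VECS` table stands in for a trained word2vec model, a single neighbour-averaging step (`smooth`) stands in for the graph auto-encoder and graph embedding stages, and a cosine threshold with union-find stands in for the clustering step. All names, vectors, and parameter values are hypothetical.

```python
import math

# Hypothetical toy word vectors; a real system would train word2vec instead.
WORD_VECS = {
    "graph": [1.0, 0.0, 0.0],
    "embedding": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.0],
    "protein": [0.0, 1.0, 0.0],
    "genome": [0.0, 0.9, 0.1],
    "cell": [0.1, 0.8, 0.1],
}
DIM = 3


def doc_vector(tokens):
    """Encode document attributes as the mean of the known token vectors."""
    vecs = [WORD_VECS[t] for t in tokens if t in WORD_VECS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def smooth(vectors, edges, alpha=0.5):
    """Mix each vector with the mean of its graph neighbours.

    This is a simple stand-in (an assumption) for injecting graph
    topology into the embeddings; it is not a graph auto-encoder.
    """
    neighbours = {i: [] for i in range(len(vectors))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    out = []
    for i, v in enumerate(vectors):
        nbrs = neighbours[i]
        if not nbrs:
            out.append(list(v))
            continue
        mean = [sum(vectors[j][k] for j in nbrs) / len(nbrs) for k in range(DIM)]
        out.append([(1 - alpha) * v[k] + alpha * mean[k] for k in range(DIM)])
    return out


def cluster(vectors, threshold=0.8):
    """Single-link clustering: join documents with cosine >= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(vectors))]


# Four hypothetical papers that all carry the same ambiguous author name.
docs = [
    {"tokens": ["graph", "embedding"], "coauthors": {"A"}},
    {"tokens": ["network", "graph"], "coauthors": {"A", "B"}},
    {"tokens": ["protein", "genome"], "coauthors": {"C"}},
    {"tokens": ["cell", "protein"], "coauthors": {"C", "D"}},
]

# Document-document graph: link two papers if they share a coauthor.
edges = [
    (i, j)
    for i in range(len(docs))
    for j in range(i + 1, len(docs))
    if docs[i]["coauthors"] & docs[j]["coauthors"]
]

vectors = smooth([doc_vector(d["tokens"]) for d in docs], edges)
labels = cluster(vectors)  # equal labels mean "same real-world author"
```

In this toy run, papers 0 and 1 share a coauthor and similar vocabulary, as do papers 2 and 3, so the sketch groups them into two clusters. The threshold of 0.8 and the mixing weight `alpha` are arbitrary illustrative choices and would need tuning on real data.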
