Journal of University of Chinese Academy of Sciences, 2016, Vol. 33, Issue (5): 711-719. DOI: 10.7523/j.issn.2095-6134.2016.05.020

• Brief Report •

LSI-based semantic retrieval model for scientific data in solar-terrestrial space field

LIU Chunwei1,2, ZOU Ziming1, TONG Jizhou1   

  1. National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China;
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2016-01-07  Revised: 2016-04-01  Online: 2016-09-15

Abstract:

Scientific data in solar-terrestrial space science are large in volume, varied in type, and complex in structure. The correlations among domain concepts and astro-events place high demands on scientific data retrieval in this field. However, the retrieval modules of the mainstream data sharing and publishing systems in this field still rely on conventional keyword-based retrieval. We present a semantic retrieval approach for scientific data of the solar-terrestrial space system. From the semantic information extracted from the scientific metadata of each dataset, we build a TF-IDF matrix using traditional text processing methods. Latent semantic indexing (LSI) then further analyzes this matrix, and a similarity value is computed to rank each result by its relevance to the search request. The experimental results show that the approach achieves a higher recall rate than conventional methods while maintaining high precision. The approach can be applied in other disciplines as well.
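
The pipeline outlined in the abstract (metadata text, TF-IDF matrix, LSI, similarity ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the sample metadata strings, the function names `build_tfidf` and `lsi_scores`, the smoothed IDF weighting, the whitespace tokenizer, and the choice of `k` are all assumptions made for illustration.

```python
import math
from collections import Counter

import numpy as np

# Hypothetical metadata snippets standing in for real dataset descriptions.
SAMPLE_METADATA = [
    "geomagnetic storm magnetosphere data",
    "magnetosphere substorm aurora data",
    "aurora substorm ionosphere",
    "solar flare xray flux",
    "solar flare coronal mass ejection",
]


def build_tfidf(docs):
    """Build a term-by-document TF-IDF matrix from whitespace-tokenized text."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    row = {term: i for i, term in enumerate(vocab)}
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption): the +1 keeps weight on very common terms.
    idf = np.array([math.log(n_docs / doc_freq[t]) + 1.0 for t in vocab])
    matrix = np.zeros((len(vocab), n_docs))
    for col, tokens in enumerate(tokenized):
        for term, count in Counter(tokens).items():
            matrix[row[term], col] = count * idf[row[term]]
    return row, idf, matrix


def lsi_scores(docs, query, k=2):
    """Return the cosine similarity of the query to each document in a rank-k latent space."""
    row, idf, matrix = build_tfidf(docs)
    # Truncated SVD: keep the k left singular vectors with the largest singular values.
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    u_k = u[:, :k]
    doc_vecs = u_k.T @ matrix  # shape (k, n_docs)
    # Weight the query like a document; terms outside the vocabulary are ignored.
    q = np.zeros(len(row))
    for term, count in Counter(query.lower().split()).items():
        if term in row:
            q[row[term]] = count * idf[row[term]]
    q_vec = u_k.T @ q
    norms = np.linalg.norm(doc_vecs, axis=0) * np.linalg.norm(q_vec)
    # Guard against division by zero when a projection vanishes.
    return (q_vec @ doc_vecs) / np.maximum(norms, 1e-12)


if __name__ == "__main__":
    scores = lsi_scores(SAMPLE_METADATA, "aurora", k=2)
    for i in np.argsort(-scores):
        print(f"{scores[i]:.3f}  {SAMPLE_METADATA[i]}")
```

In this toy corpus the first record never mentions "aurora", yet it shares terms with records that do, so the latent space places it near the query. A plain keyword match would miss it, which is the kind of recall gain the abstract reports. With so few documents the scores are coarse; a realistic setting would use far more metadata records and a larger `k`.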

Key words: solar-terrestrial space, scientific data, semantic retrieval, LSI, metadata
