Journal of University of Chinese Academy of Sciences ›› 2014, Vol. 31 ›› Issue (1): 124-129. DOI: 10.7523/j.issn.2095-6134.2014.01.018

• Research Articles •

An XBRL dimensional data parsing algorithm based on the Map/Reduce parallel programming model

ZHU Jianpeng, WANG Ying, YANG Cheng   

  1. College of Engineering and Information Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2013-04-26 Revised: 2013-05-20 Online: 2014-01-15

Abstract:

This article studies the processing of massive semi-structured data from the perspective of XBRL dimensional data. It proposes a new XBRL dimensional data parsing algorithm based on the Map/Reduce parallel programming model and the StAX stream parsing technique. The algorithm specifically targets the complex data reference relationships among the XML files that make up an XBRL financial report. To parse complex XBRL dimensional data, it takes a single XBRL financial report as the minimum processing unit: it first extracts the data from the dimensional fact items and then processes the business semantic data. In experimental tests, the proposed algorithm shows a clear advantage in large-scale XBRL data processing.
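The two stages described in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: it runs in a single process without Hadoop, and the class name `XbrlDimensionSketch`, the helper `sampleReport`, the key format `concept|dimension=member`, and the choice to sum numeric facts in the reduce step are all assumptions made for illustration. Only the StAX calls (`XMLInputFactory`, `XMLStreamReader`) and the XBRL element and attribute names (`context`, `explicitMember`, `dimension`, `contextRef`) are standard. The map step streams one whole report, so the report stays the minimum processing unit, and resolves each fact's `contextRef` against the dimensional contexts found in that report.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class XbrlDimensionSketch {

    /** One (key, value) pair emitted by the map step (hypothetical type). */
    public static final class Pair {
        public final String key;
        public final double value;
        public Pair(String key, double value) { this.key = key; this.value = value; }
    }

    /** Map step: stream ONE whole report with StAX, emit a pair per dimensional numeric fact. */
    public static List<Pair> map(String reportXml) {
        Map<String, String> contextMembers = new HashMap<>(); // context id -> "dimension=member"
        List<String[]> facts = new ArrayList<>();             // {concept, contextRef, text}
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(reportXml));
            String currentContext = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if ("context".equals(name)) {
                        currentContext = r.getAttributeValue(null, "id");
                    } else if ("explicitMember".equals(name) && currentContext != null) {
                        String entry = r.getAttributeValue(null, "dimension")
                                + "=" + r.getElementText().trim();
                        // A context may carry several dimensions; join them with ';'.
                        contextMembers.merge(currentContext, entry, (a, b) -> a + ";" + b);
                    } else {
                        String ref = r.getAttributeValue(null, "contextRef");
                        if (ref != null) { // an element with contextRef is a fact item
                            facts.add(new String[] {name, ref, r.getElementText().trim()});
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "context".equals(r.getLocalName())) {
                    currentContext = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed report", e);
        }
        // Resolve references after streaming, since facts may precede their contexts.
        List<Pair> out = new ArrayList<>();
        for (String[] f : facts) {
            String members = contextMembers.get(f[1]);
            if (members == null) continue; // fact without dimensions: ignored here
            try {
                out.add(new Pair(f[0] + "|" + members, Double.parseDouble(f[2])));
            } catch (NumberFormatException e) {
                // Non-numeric fact: skipped in this sketch.
            }
        }
        return out;
    }

    /** Reduce step (assumed semantics): sum the values that share a key. */
    public static Map<String, Double> reduce(List<Pair> pairs) {
        Map<String, Double> totals = new TreeMap<>();
        for (Pair p : pairs) totals.merge(p.key, p.value, Double::sum);
        return totals;
    }

    /** Hypothetical one-fact report; the 'ex' taxonomy is invented for the example. */
    public static String sampleReport(String member, String revenue) {
        return "<xbrl xmlns='http://www.xbrl.org/2003/instance'"
             + " xmlns:xbrldi='http://xbrl.org/2006/xbrldi'"
             + " xmlns:ex='http://example.com/taxonomy'>"
             + "<context id='c1'><entity>"
             + "<identifier scheme='http://example.com'>X</identifier>"
             + "<segment><xbrldi:explicitMember dimension='ex:RegionAxis'>" + member
             + "</xbrldi:explicitMember></segment></entity></context>"
             + "<ex:Revenue contextRef='c1' decimals='0'>" + revenue + "</ex:Revenue>"
             + "</xbrl>";
    }

    public static void main(String[] args) {
        List<Pair> emitted = new ArrayList<>();
        emitted.addAll(map(sampleReport("ex:Asia", "100")));
        emitted.addAll(map(sampleReport("ex:Asia", "250")));
        emitted.addAll(map(sampleReport("ex:Europe", "40")));
        System.out.println(reduce(emitted));
    }
}
```

In a real Map/Reduce deployment each call to `map` would run on a separate worker with one report as its input record, and the framework would group the emitted keys before `reduce`. Here the grouping is done by a `TreeMap` purely to keep the sketch self-contained.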

Key words: XBRL, semi-structured data processing, big data processing, Map/Reduce, XBRL dimension
