Journal of University of Chinese Academy of Sciences ›› 2008, Vol. 25 ›› Issue (4): 445-451. DOI: 10.7523/j.issn.2095-6134.2008.4.003

• Article •

Optimization of data collecting solution for Web usage mining

Wang Zheng (王铮), Zhang Jun-yu (张君玉)

  1. School of Mathematical Sciences, Graduate University, Chinese Academy of Sciences, Beijing 100049, China
  • Published: 2008-07-15

Abstract: This paper optimizes the data collection scheme for Web usage mining with the aim of improving the overall operating efficiency of Web usage mining systems. The collection work is refined through several strategies: simplifying the set of information elements to be collected in order to reduce redundancy, extending the identification function of those elements, submitting and storing information by class on the basis of information abstraction, and distributing data preprocessing by moving it forward to the client side. With these strategies, data collection is still completed with high quality, while the system's storage efficiency, performance balance, and data parsing and transfer efficiency are markedly improved.

Key words: Web usage mining, data collection, optimization, online advertising