Journal of University of Chinese Academy of Sciences ›› 2009, Vol. 26 ›› Issue (4): 539-548. DOI: 10.7523/j.issn.2095-6134.2009.4.016

• Articles

A clustering algorithm based on maximum density

王晶1,2, 夏鲁宁2, 荆继武2   

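  1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027;
  2. State Key Laboratory of Information Security, Graduate University of the Chinese Academy of Sciences, Beijing 100049
  • Received: 2008-10-08 Revised: 2009-01-09 Published: 2009-07-15
  • Corresponding author: WANG Jing (王晶)
  • Supported by: the National 863 Program (2006AA01Z454) and the Electronic Information Industry Development Fund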

Maximum density clustering algorithm

WANG Jing1,2, XIA Lu-Ning2, JING Ji-Wu2   

  1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China;
  2. State Key Lab of Information Security, Graduate University of the Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2008-10-08 Revised: 2009-01-09 Published: 2009-07-15

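Abstract:

This paper proposes a partitional clustering method that incorporates the idea of density-based clustering, the maximum density clustering algorithm (MDCA). Starting from the object of maximum density, MDCA partitions the data into basic clusters by examining the density distribution of the spatial region around that object, and then merges the basic clusters to obtain the final clustering. Experiments show that MDCA determines the number of clusters automatically and effectively discovers clusters of arbitrary shape, and that both its ability to handle unknown datasets and its clustering accuracy are better than those of traditional partition-based clustering algorithms.

Key words: data mining, clustering, maximum density object, k-means, DBSCAN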

Abstract:

This paper proposes a new clustering algorithm named the maximum density clustering algorithm (MDCA). MDCA introduces the concept of density to determine the number of clusters automatically. Taking the densest object as the starting point, it examines the densities of the objects around the densest object to decide the partition into basic clusters. The basic clusters are then merged to form clusters of arbitrary shape. Experiments show that MDCA outperforms traditional partition-based clustering algorithms in both its ability to handle unknown datasets and its clustering accuracy.

Key words: data mining, clustering algorithm, densest object, k-means, DBSCAN
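
The abstract gives only the outline of MDCA (start from the densest object, form basic clusters from the density of its surroundings, then merge them), not its exact rules. The Python sketch below illustrates that outline under assumptions that are not taken from the paper: density is the number of neighbours within `radius`, a basic cluster is the set of still-unassigned, sufficiently dense objects within `radius` of the current densest object, and two basic clusters merge when their closest members are at most `merge_dist` apart. The function name `mdca_sketch` and all of its parameters are hypothetical.

```python
import numpy as np


def mdca_sketch(points, radius, density_floor, merge_dist):
    """Illustrative sketch only; every rule here is an assumption, not the paper's."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances between all objects.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Assumed density: number of other objects within `radius`.
    density = (dist <= radius).sum(axis=1) - 1

    # Step 1: repeatedly take the densest unassigned object as a starting
    # point and group the unassigned dense objects around it.
    basic = np.full(n, -1)
    n_basic = 0
    while True:
        free = np.flatnonzero((basic == -1) & (density >= density_floor))
        if free.size == 0:
            break
        seed = free[np.argmax(density[free])]
        members = free[dist[seed, free] <= radius]  # always includes the seed
        basic[members] = n_basic
        n_basic += 1

    # Step 2: merge basic clusters whose closest members are near each other,
    # tracked with a small union-find structure.
    parent = list(range(n_basic))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(n_basic):
        for b in range(a + 1, n_basic):
            gap = dist[np.ix_(basic == a, basic == b)].min()
            if gap <= merge_dist:
                parent[find(a)] = find(b)

    # Relabel merged clusters as 0..k-1; objects below the density floor stay -1.
    roots = sorted({find(i) for i in range(n_basic)})
    relabel = {root: k for k, root in enumerate(roots)}
    return np.array([relabel[find(int(b))] if b >= 0 else -1 for b in basic])
```

In this sketch the number of clusters is never passed in; it falls out of how many merged groups remain, which is the property the abstract attributes to MDCA. Because merging chains neighbouring basic clusters together, elongated or curved groups can end up as a single cluster.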
