
Journal of University of Chinese Academy of Sciences ›› 2009, Vol. 26 ›› Issue (4): 530-538. DOI: 10.7523/j.issn.2095-6134.2009.4.015

• Article

SA-DBSCAN: A self-adaptive density-based clustering algorithm

XIA Lu-Ning, JING Ji-Wu

  1. State Key Laboratory of Information Security, Graduate University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2008-06-26 Revised: 2008-12-25 Published: 2009-07-15
  • Supported by:

    National High Technology Research and Development Program of China (863 Program), project 2003AA144050

Abstract:

DBSCAN is a classic density-based clustering algorithm. It determines the number of clusters automatically and handles clusters of arbitrary shape effectively. However, DBSCAN requires the user to specify two parameters, Eps and minPts, so clustering cannot proceed without manual intervention. This paper proposes SA-DBSCAN, a clustering algorithm built on DBSCAN that determines Eps and minPts automatically by analyzing the statistical characteristics of the dataset, which removes the manual intervention and makes the clustering process fully automatic. Experimental results indicate that SA-DBSCAN selects reasonable values for Eps and minPts and produces clustering results of relatively high accuracy.

Key words: data mining, clustering, DBSCAN, SA-DBSCAN
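
The abstract does not say how SA-DBSCAN derives Eps and minPts, so the sketch below covers only baseline DBSCAN and shows where the two user-supplied parameters enter. It is a minimal illustration, not the authors' implementation: the function names `region_query` and `dbscan`, the brute-force neighbor search, and the sample points are all assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Baseline DBSCAN. Returns one label per point; NOISE marks noise."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it
        cluster += 1
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # only core points extend the cluster
                seeds.extend(j_neighbors)
    return labels


# Illustrative data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

With `eps=0.5` and the same `min_pts`, every point in this sample is labeled noise, which shows how sensitive the result is to the hand-picked values that SA-DBSCAN aims to choose automatically.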
