Grid-based subspace clustering over data streams

Nam Hun Park, Won Suk Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Citations (Scopus)


A real-life data stream usually contains many dimensions, and some dimensional values of its data elements may be missing. To effectively extract the ongoing change of a data stream with respect to all subsets of its dimensions, this paper proposes a grid-based subspace clustering algorithm. Given an n-dimensional data stream, the ongoing distribution statistics of data elements in each one-dimensional data space are first monitored by a list of grid-cells called a sibling list. Once a dense grid-cell of a first-level sibling list becomes a dense unit grid-cell, new second-level sibling lists are created as its child nodes in order to trace any cluster in all possible two-dimensional rectangular subspaces. In this way, a sibling tree grows up to the nth level at most, and a k-dimensional subcluster can be found in the kth level of the sibling tree. The proposed method is comparatively analyzed through a series of experiments to identify its various characteristics.
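The sibling-tree idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it simplifies heavily: grid-cells have a fixed width and are never split down to unit cells, density is a plain count threshold with no decay of old elements, and a missing value (`None`) is skipped in that dimension. The names `SiblingList`, `SiblingTree`, `cell_width`, and `min_count` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


class SiblingList:
    """Grid-cell counts for one dimension, below a path of dense ancestor cells."""

    def __init__(self, dim):
        self.dim = dim
        self.counts = defaultdict(int)  # cell index -> elements seen in that cell
        self.children = {}              # dense cell index -> child SiblingLists


class SiblingTree:
    """Simplified sibling tree (assumed: fixed cell width, count-based density)."""

    def __init__(self, n_dims, cell_width=1.0, min_count=3):
        self.n_dims = n_dims
        self.cell_width = cell_width
        self.min_count = min_count
        # First level: one sibling list per one-dimensional data space.
        self.roots = [SiblingList(d) for d in range(n_dims)]

    def insert(self, point):
        """Update the tree with one stream element (a sequence of n_dims values)."""
        for root in self.roots:
            self._update(root, point)

    def _update(self, node, point):
        value = point[node.dim]
        if value is None:  # assumption: a missing value is ignored in this dimension
            return
        cell = int(value // self.cell_width)
        node.counts[cell] += 1
        if node.counts[cell] < self.min_count:
            return
        # The cell is dense: create next-level sibling lists on first use.
        # Only higher-numbered dimensions are used, so each subspace appears once.
        if cell not in node.children:
            node.children[cell] = [
                SiblingList(d) for d in range(node.dim + 1, self.n_dims)
            ]
        for child in node.children[cell]:
            self._update(child, point)

    def dense_cells(self):
        """Return dense cells as tuples of (dimension, cell index) pairs.

        A tuple of length k comes from the kth level of the tree and
        describes a dense cell in a k-dimensional subspace.
        """
        found = []

        def walk(node, path):
            for cell, count in node.counts.items():
                if count >= self.min_count:
                    here = path + ((node.dim, cell),)
                    found.append(here)
                    for child in node.children.get(cell, []):
                        walk(child, here)

        for root in self.roots:
            walk(root, ())
        return found
```

Calling `insert` once per arriving element and `dense_cells` on demand gives a one-pass view of the dense subspaces. Child lists start counting only after their parent cell becomes dense, so in this sketch the counts at deeper levels lag behind the first-level counts.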

Original language: English
Title of host publication: CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Number of pages: 10
Publication status: Published - 2007
Event: 16th ACM Conference on Information and Knowledge Management, CIKM 2007 - Lisboa, Portugal
Duration: 2007 Nov 6 → 2007 Nov 9

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings


Other: 16th ACM Conference on Information and Knowledge Management, CIKM 2007

All Science Journal Classification (ASJC) codes

  • General Decision Sciences
  • General Business, Management and Accounting

