Clustering of XML schemas for information integration

Tae Woo Rhim, Kyong H.O. Lee

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


As a prerequisite for information integration, this paper presents an efficient method for clustering XML schemas. The proposed method first computes pairwise similarities among schemas. Similarity is defined by the size of the structure that two schemas have in common, under the assumption that schemas that cost less to integrate are more similar. Specifically, we extract the one-to-one matchings between paths that have the largest number of corresponding elements. Finally, a hierarchical clustering method is applied to the similarity values. Experimental results with many XML schemas show that the method outperforms previous work in terms of clustering accuracy, clustering rate, clustering quality, and time complexity.
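The pipeline described in the abstract (path matching, similarity, hierarchical clustering) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the exact similarity formula, the matching procedure, or the linkage criterion, so the sketch assumes (a) each schema is already flattened into root-to-leaf paths of element names, (b) corresponding elements are counted with a longest common subsequence, (c) one-to-one path matching is done greedily, (d) the score is normalised by the larger schema, and (e) clusters are merged by average linkage until a hypothetical `threshold` is no longer met. All function names are illustrative.

```python
from itertools import combinations

def common_elements(p, q):
    """Longest-common-subsequence length of two paths (tuples of element names)."""
    prev = [0] * (len(q) + 1)
    for a in p:
        cur = [0]
        for j, b in enumerate(q, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def similarity(s1, s2):
    """Greedy one-to-one path matching (an assumption, not the paper's procedure)."""
    pairs = sorted(
        ((common_elements(p, q), i, j)
         for i, p in enumerate(s1) for j, q in enumerate(s2)),
        key=lambda t: -t[0],
    )
    used1, used2, total = set(), set(), 0
    for score, i, j in pairs:
        # Each path may take part in at most one matching.
        if score and i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            total += score
    # Assumed normalisation: matched elements over the larger schema's element count.
    size = max(sum(map(len, s1)), sum(map(len, s2)))
    return total / size if size else 0.0

def cluster(schemas, threshold):
    """Average-linkage agglomerative clustering; stops when the best merge < threshold."""
    names = list(schemas)
    sim = {frozenset((a, b)): similarity(schemas[a], schemas[b])
           for a, b in combinations(names, 2)}
    clusters = [[n] for n in names]

    def link(c1, c2):
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, x, y = max((link(c1, c2), i, j)
                         for (i, c1), (j, c2) in combinations(enumerate(clusters), 2))
        if best < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x stays valid
    return clusters

# Toy input: schemas given as lists of root-to-leaf element paths (made-up data).
schemas = {
    "book_a": [("book", "title"), ("book", "author", "name")],
    "book_b": [("book", "title"), ("book", "author", "name"), ("book", "isbn")],
    "order": [("order", "item", "price"), ("order", "customer")],
}
print(cluster(schemas, threshold=0.5))
```

With this toy input the two book schemas share five of at most seven elements and end up in one cluster, while the order schema shares nothing with them and stays alone. A greedy matching is used only to keep the sketch short; an optimal assignment could give a different score on larger schemas.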

Original language: English
Pages (from-to): 3-13
Number of pages: 11
Journal: Journal of Computer Information Systems
Issue number: 2
Publication status: Published - 2005 Dec

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Education
  • Computer Networks and Communications


