Segment-based approach for subsequence searches in sequence databases

Sanghyun Park, Sang Wook Kim, Wesley W. Chu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

79 Citations (Scopus)


This paper investigates the subsequence searching problem under time warping in sequence databases. Time warping makes it possible to find sequences with similar patterns of change even when they are of different lengths. Our work is motivated by the observation that subsequence searches slow down quadratically as the total length of the data sequences increases. To resolve this problem, we propose the Segment-Based Approach for Subsequence Searches (SBASS), which changes the similarity measure from time warping to piece-wise time warping and limits the number of candidate subsequences that are compared with a query sequence. For efficient retrieval of similar subsequences, we extract feature vectors from all data segments, exploiting their monotonically changing properties, and build a multi-dimensional index such as an R-tree or R∗-tree. Using this index, queries are processed in four steps: 1) index filtering, 2) feature filtering, 3) successor filtering, and 4) post-processing. The effectiveness of our approach is verified through experiments on synthetic data sets.
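As a rough illustration of two ideas named in the abstract, the sketch below computes a classic dynamic-programming time warping distance and splits a sequence into monotonically changing segments. It is not the authors' implementation: the function names `time_warping_distance` and `monotonic_segments` are hypothetical, the base distance (sum of absolute differences) is an assumption, and the rule that adjacent segments do not share a boundary element is also an assumption. Feature extraction, the R-tree index, and the four filtering steps are not sketched.

```python
from typing import List, Sequence


def time_warping_distance(s: Sequence[float], q: Sequence[float]) -> float:
    """Time warping distance via dynamic programming.

    Assumption: the per-element cost is |a - b| and costs are summed.
    Runs in O(len(s) * len(q)) time, which is what makes exhaustive
    subsequence comparison expensive on long data sequences.
    """
    if len(s) == 0 or len(q) == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # prev[j] is the cost of aligning the prefix of s seen so far with q[:j].
    prev = [0.0] + [inf] * len(q)
    for a in s:
        curr = [inf] * (len(q) + 1)
        for j, b in enumerate(q, start=1):
            # Either element may be repeated (stretched) or both advance.
            curr[j] = abs(a - b) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(q)]


def monotonic_segments(seq: Sequence[float]) -> List[List[float]]:
    """Split seq into consecutive runs that only rise or only fall.

    Assumption: a new segment starts at the first element that reverses
    the current direction; equal neighbours never trigger a split.
    """
    segments: List[List[float]] = []
    start = 0
    direction = 0  # +1 rising, -1 falling, 0 not yet determined
    for i in range(1, len(seq)):
        diff = seq[i] - seq[i - 1]
        step = (diff > 0) - (diff < 0)
        if step != 0 and direction != 0 and step != direction:
            segments.append(list(seq[start:i]))
            start = i
            direction = 0
        elif step != 0:
            direction = step
    if len(seq) > 0:
        segments.append(list(seq[start:]))
    return segments


if __name__ == "__main__":
    # Same shape, different lengths: warping absorbs the repeated value.
    print(time_warping_distance([1, 2, 3], [1, 2, 2, 3]))
    print(monotonic_segments([1, 3, 5, 4, 2, 6]))
```

Comparing a query against segments rather than against every possible subsequence reduces the number of candidates, which is the motivation the abstract gives for the segment-based approach.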

Original language: English
Title of host publication: Proceedings of the 2001 ACM Symposium on Applied Computing, SAC 2001
Publisher: Association for Computing Machinery
Number of pages: 5
ISBN (Print): 1581132875, 9781581132878
Publication status: Published - 2001 Mar 1
Event: 2001 ACM Symposium on Applied Computing, SAC 2001 - Las Vegas, United States
Duration: 2001 Mar 11 to 2001 Mar 14

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing


Other: 2001 ACM Symposium on Applied Computing, SAC 2001
Country/Territory: United States
City: Las Vegas

Bibliographical note

Publisher Copyright:
© 2001 ACM.

All Science Journal Classification (ASJC) codes

  • Software

