Efficient Similarity Search for Time Series Data Based on the Minimum Distance

Lee, Sangjun; Kwon, Dongseop; Lee, Sukho

doi:10.1007/3-540-47961-9_27

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2348)

Included in the following conference series:

International Conference on Advanced Information Systems Engineering

Abstract

We address the problem of efficient similarity search based on the minimum distance in large time series databases. Most previous work focuses on similarity matching and retrieval of time series based on the Euclidean distance. However, as we demonstrate in this paper, the Euclidean distance has limitations as a similarity measure. It is sensitive to the absolute offsets of time sequences, so two time sequences that have similar shapes but different vertical positions may be classified as dissimilar. The minimum distance is a more suitable similarity measure than the Euclidean distance in the many applications where the shape of a time series is the major consideration. To support minimum distance queries, most previous work relies on a preprocessing step of vertical shifting, which normalizes each time sequence by its mean before indexing. In this paper, we propose a novel and fast indexing scheme called segmented mean variation indexing (SMV-indexing). Our indexing scheme can match time series of similar shapes without vertical shifting and guarantees no false dismissals. Several experiments are performed on real data (stock price movements) to measure the performance of our indexing scheme. The experiments show that SMV-indexing is more efficient than sequential scanning.
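The contrast the abstract draws between the two measures can be illustrated with a small sketch. The abstract does not give the formula, so the sketch assumes that the minimum distance between two equal-length sequences is the smallest Euclidean distance obtainable by shifting one of them vertically by a constant. Under that assumption the best shift is the mean of the pointwise differences. The function names and sample values are hypothetical, and the sketch does not implement SMV-indexing itself.

```python
import math


def euclidean(x, y):
    """Plain Euclidean distance between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def minimum_distance(x, y):
    """Smallest Euclidean distance over all constant vertical shifts.

    Assumed definition (not taken from the paper's text):
        min over c of sqrt(sum((x[i] - y[i] - c) ** 2)).
    The minimizing c is the mean of the pointwise differences.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    shift = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - shift) ** 2 for d in diffs))


# Hypothetical sample sequences: b has the same shape as a but sits
# 10 units higher; c has the same offset as a but a different shape.
a = [1.0, 2.0, 3.0, 2.0]
b = [11.0, 12.0, 13.0, 12.0]
c = [1.0, 3.0, 2.0, 2.0]

print(euclidean(a, b), minimum_distance(a, b))
print(euclidean(a, c), minimum_distance(a, c))
```

With these values the Euclidean distance ranks the differently shaped sequence c as closer to a than the identically shaped but offset sequence b, while the shift-invariant distance ranks b as an exact match. This is the behavior the abstract attributes to vertical-shift normalization; the proposed index is described as reaching the same matches without that preprocessing step.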

This work was supported by the Brain Korea 21 Project in 2001.

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science, Seoul National University, Seoul, 151-742, Korea
Sangjun Lee, Dongseop Kwon & Sukho Lee

Editor information

Editors and Affiliations

School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, N2L 3G1, Canada
Anne Banks Pidduck & M. Tamer Ozsu
University of Toronto, Pratt Building 6 King’s College Road, Toronto, Ontario, M5S 3H5
John Mylopoulos
Faculty of Commerce and Business Administration, University of British Columbia, 2053 Main Mall, Vancouver, B.C., V6T 1Z2, Canada
Carson C. Woo

About this paper

Cite this paper

Lee, S., Kwon, D., Lee, S. (2002). Efficient Similarity Search for Time Series Data Based on the Minimum Distance. In: Pidduck, A.B., Ozsu, M.T., Mylopoulos, J., Woo, C.C. (eds) Advanced Information Systems Engineering. CAiSE 2002. Lecture Notes in Computer Science, vol 2348. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47961-9_27

DOI: https://doi.org/10.1007/3-540-47961-9_27
Published: 29 May 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43738-3
Online ISBN: 978-3-540-47961-1
eBook Packages: Springer Book Archive
