A Hierarchical Index Structure for Region-Aware Spatial Keyword Search with Edit Distance Constraint

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11447)

Abstract

Location-based services have become widely available on a variety of devices. Because both user input and geo-textual databases contain errors, supporting error-tolerant spatio-textual search has become an important problem in the field of spatial keyword search. Edit distance is the most widely used metric for capturing typographical errors. However, existing techniques for spatio-textual similarity queries mainly focus on set-based textual relevance; they do not work well for edit distance because they lack filter power, which leads to a larger overhead of computing edit distance. In this paper, we propose a novel framework to solve the region-aware top-\(k\) similarity search problem with an edit distance constraint. We first propose a hierarchical index structure to capture signatures of both spatial and textual relevance. We then utilize prefix filter techniques to support top-\(k\) similarity search on the index. We further propose an estimation-based method and a greedy search algorithm to make full use of the filter power of the hierarchical index. Experimental results on real-world POI datasets show that our method outperforms state-of-the-art methods by up to two orders of magnitude.
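To make the edit distance constraint concrete, the following is a minimal sketch of the textbook Levenshtein dynamic program, with a cheap length check applied before the full computation. It is not the paper's index or its prefix filter; the function names `edit_distance` and `within_threshold` and the threshold parameter `tau` are assumptions introduced here for illustration only.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_threshold(a: str, b: str, tau: int) -> bool:
    """Hypothetical helper: True if the edit distance is at most tau."""
    # Length filter: strings whose lengths differ by more than tau
    # cannot be within edit distance tau, so skip the quadratic DP.
    if abs(len(a) - len(b)) > tau:
        return False
    return edit_distance(a, b) <= tau
```

The length check illustrates the general idea of filtering: a cheap test discards candidates before the expensive verification step, which is the kind of overhead the abstract refers to.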


Notes

  1. The uniform distribution used the \(i^{th}\) time is \(U^i\), where \(i \in [1, K]\).

  2. http://www.openstreetmap.org/


Acknowledgement

This work was supported by NSFC (91646202), National Key R&D Program of China (SQ2018YFB140235), and the 1000-Talent program.


Corresponding author

Correspondence to Yong Zhang .


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Yang, J., Zhang, Y., Hu, H., Xing, C. (2019). A Hierarchical Index Structure for Region-Aware Spatial Keyword Search with Edit Distance Constraint. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_35


  • DOI: https://doi.org/10.1007/978-3-030-18579-4_35


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18578-7

  • Online ISBN: 978-3-030-18579-4

  • eBook Packages: Computer Science, Computer Science (R0)
