A Hierarchical Index Structure for Region-Aware Spatial Keyword Search with Edit Distance Constraint

Yang, Junye; Zhang, Yong; Hu, Huiqi; Xing, Chunxiao

doi:10.1007/978-3-030-18579-4_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11447)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Location-based services have become widely available on a variety of devices. Because both user input and geo-textual databases contain errors, supporting error-tolerant spatio-textual search has become an important problem in spatial keyword search. Edit distance is the most widely used metric for capturing typographical errors. However, existing techniques for spatio-textual similarity queries mainly focus on set-based textual relevance; they do not work well for edit distance because they lack filtering power, which leads to a large overhead in computing edit distances. In this paper, we propose a novel framework to solve the region-aware top-\(k\) similarity search problem with an edit distance constraint. We first propose a hierarchical index structure that captures signatures of both spatial and textual relevance. We then use prefix filter techniques to support top-\(k\) similarity search on the index. We further propose an estimation-based method and a greedy search algorithm to make full use of the filtering power of the hierarchical index. Experimental results on real-world POI datasets show that our method outperforms state-of-the-art methods by up to two orders of magnitude.
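
To make the cost argument in the abstract concrete, the following is a minimal Python sketch of edit distance together with two classical, cheap filters (a length filter and a q-gram count filter) that discard non-matching strings before the quadratic distance computation runs. It is not the hierarchical index or the prefix filter of the paper; the function names, the gram length `q`, and the linear-scan `search` helper are assumptions made purely for illustration.

```python
from collections import Counter


def edit_distance(s, t):
    """Levenshtein distance via the standard dynamic program, O(|s| * |t|)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def qgrams(s, q):
    """Multiset of the overlapping substrings of length q (no padding)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def may_match(s, t, tau, q=2):
    """Necessary (not sufficient) conditions for edit_distance(s, t) <= tau."""
    # Length filter: each edit changes the length by at most one.
    if abs(len(s) - len(t)) > tau:
        return False
    # Count filter: each edit destroys at most q grams, so strings within
    # distance tau must share at least this many grams.
    needed = max(len(s), len(t)) - q + 1 - q * tau
    if needed <= 0:
        return True  # the bound is vacuous, so the pair cannot be pruned
    shared = sum((qgrams(s, q) & qgrams(t, q)).values())
    return shared >= needed


def search(query, names, tau, q=2):
    """Hypothetical linear scan: filter first, verify survivors exactly."""
    hits = []
    for name in names:
        if may_match(query, name, tau, q):
            d = edit_distance(query, name)
            if d <= tau:
                hits.append((d, name))
    return sorted(hits)
```

For example, `search("starbucks", ["starbuks", "subway", "starbcks"], 1)` verifies only the two near-duplicates, because `"subway"` is rejected by the length filter alone. An index-based method aims to get this kind of pruning without scanning every string.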

Notes

1. The uniform distribution used the \(i^{th}\) time is \(U^i\), where \(i \in [1, K]\).
2. http://www.openstreetmap.org/

Acknowledgement

This work was supported by NSFC (91646202), National Key R&D Program of China (SQ2018YFB140235), and the 1000-Talent program.

Author information

Authors and Affiliations

RIIT, TNList, Department of Computer Science and Technology, Tsinghua University, Beijing, China
Junye Yang, Yong Zhang & Chunxiao Xing
Institute for Data Science and Engineering, East China Normal University, Shanghai, China
Huiqi Hu

Corresponding author

Correspondence to Yong Zhang.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Guoliang Li
Duke University, Durham, NC, USA
Jun Yang
University of Porto, Porto, Portugal
Joao Gama
Chiang Mai University, Chiang Mai, Thailand
Juggapong Natwichai
Beihang University, Beijing, China
Yongxin Tong

About this paper

Cite this paper

Yang, J., Zhang, Y., Hu, H., Xing, C. (2019). A Hierarchical Index Structure for Region-Aware Spatial Keyword Search with Edit Distance Constraint. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science, vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_35

DOI: https://doi.org/10.1007/978-3-030-18579-4_35
Published: 24 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18578-7
Online ISBN: 978-3-030-18579-4
eBook Packages: Computer Science, Computer Science (R0)
