Tree-Based Graph Indexing for Fast kNN Queries

  • Conference paper
Information Integration and Web Intelligence (iiWAS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13635)

Abstract

The k nearest neighbor (kNN) query on a graph is the problem of finding the k nodes with the smallest shortest-path distances from a user-specified query node. Graph indexing methods have the potential to answer kNN queries quickly and are therefore promising approaches for handling large-scale graphs. However, existing indexing approaches struggle to answer kNN queries on large-scale complex networks. This is because complex networks generally have multiple shortest paths between a given pair of nodes, which incur redundant search costs in these approaches. In this paper, we propose a novel graph indexing algorithm for fast kNN queries on complex networks. To overcome this limitation, our algorithm generates a tree-based index from a graph so that it avoids computing redundant paths during kNN queries. Our extensive experimental analysis on real-world graphs shows that our algorithm answers kNN queries up to 146 times faster than the state-of-the-art methods.
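
To make the query definition above concrete, the following is a minimal, index-free sketch of a kNN query that uses Dijkstra's algorithm with early termination. It is not the tree-based index proposed in the paper, whose construction is not described in this abstract; it only illustrates the kind of baseline search that an index aims to speed up. The function name `knn_query`, the adjacency-list format, and the toy graph are assumptions made for illustration.

```python
import heapq


def knn_query(graph, query, k):
    """Return up to k (node, distance) pairs nearest to `query`.

    Assumed input format: `graph` maps each node to a list of
    (neighbor, weight) pairs with non-negative weights. The query node
    itself is excluded from the result.
    """
    dist = {query: 0}       # best distance found so far for each node
    heap = [(0, query)]     # min-heap of (tentative distance, node)
    settled = set()         # nodes whose shortest distance is final
    result = []

    while heap and len(result) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue        # stale heap entry, skip it
        settled.add(u)
        if u != query:
            result.append((u, d))
        for v, w in graph.get(u, []):
            nd = d + w
            # Push only strict improvements; an equal-length alternative
            # path is still examined here but adds no new information.
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return result


# Hypothetical toy graph: node "d" is reachable from "a" by two
# equal-length shortest paths (a-b-d and a-c-d).
graph = {
    "a": [("b", 1), ("c", 1)],
    "b": [("a", 1), ("d", 1)],
    "c": [("a", 1), ("d", 1)],
    "d": [("b", 1), ("c", 1), ("e", 3)],
    "e": [("d", 3)],
}

print(knn_query(graph, "a", 3))  # [('b', 1), ('c', 1), ('d', 2)]
```

In the toy graph, the search examines node "d" once from "b" and once from "c" even though both paths have the same length. This is a small-scale instance of the redundant work that the abstract attributes to multiple shortest paths between a pair of nodes.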

This work is partly supported by JST Presto JPMJPR2033, Japan.

Author information

Corresponding author

Correspondence to Suomi Kobayashi.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kobayashi, S., Matsugu, S., Shiokawa, H. (2022). Tree-Based Graph Indexing for Fast kNN Queries. In: Pardede, E., Delir Haghighi, P., Khalil, I., Kotsis, G. (eds) Information Integration and Web Intelligence. iiWAS 2022. Lecture Notes in Computer Science, vol 13635. Springer, Cham. https://doi.org/10.1007/978-3-031-21047-1_18

  • DOI: https://doi.org/10.1007/978-3-031-21047-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21046-4

  • Online ISBN: 978-3-031-21047-1

  • eBook Packages: Computer Science, Computer Science (R0)
