Efficient Graph Mining on Heterogeneous Platforms in the Cloud

  • Conference paper
Cloud Computing, Security, Privacy in New Computing Environments (CloudComp 2016, SPNCE 2016)

Abstract

In the Big Data era, the rapid growth of novel Internet applications and new methods for collecting experimental data in biology and chemistry have produced many large-scale, complex graphs. As the scale and complexity of graph data increase explosively, it has become both urgent and challenging to develop graph processing frameworks that can execute general graph algorithms efficiently. In this paper, we propose leveraging GPUs to accelerate large-scale graph mining in the cloud. To achieve good performance and scalability, we propose a graph summary method together with runtime system optimization techniques for load balancing and message handling. Experimental results show that the prototype framework outperforms two state-of-the-art distributed frameworks, GPS and GraphLab, in terms of performance and scalability.

Acknowledgment

This research is supported by the Young Teachers Program of Shanghai Colleges and Universities under grant No. ZZSD15072, the Natural Science Foundation of Shanghai under grant No. 16ZR1411200, and the Shanghai Innovation Action Plan Project under grant No. 16511101200.

Author information

Corresponding author

Correspondence to Tao Zhang.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Zhang, T., Tong, W., Shen, W., Peng, J., Niu, Z. (2018). Efficient Graph Mining on Heterogeneous Platforms in the Cloud. In: Wan, J., et al. Cloud Computing, Security, Privacy in New Computing Environments. CloudComp 2016, SPNCE 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 197. Springer, Cham. https://doi.org/10.1007/978-3-319-69605-8_2

  • DOI: https://doi.org/10.1007/978-3-319-69605-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69604-1

  • Online ISBN: 978-3-319-69605-8

  • eBook Packages: Computer Science, Computer Science (R0)
