Improved Algorithm for Finding the Minimum Cost of Storing and Regenerating Datasets in Multiple Clouds

  • Conference paper
Computing and Combinatorics (COCOON 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10976)

Abstract

This paper studies the intermediate dataset storage problem for linear dataflows in multiple clouds. The proliferation of cloud computing lets users flexibly store, recompute, or transfer large generated datasets across multiple cloud service providers. Under the pay-as-you-go model, however, the total cost of using cloud services depends on the storage, computation, and bandwidth resources consumed. Given cloud service providers with different pricing models for their resources, a user can store a generated dataset in a chosen cloud service, delete it and regenerate it when needed, or transfer it to another cloud service, in order to reduce the total cost of dataset storage and recomputation. The current best algorithm for finding an optimal strategy for a linear dataflow in multiple clouds takes \(O\left( m^4n^3\right) \) time, where m is the number of clouds and n is the number of datasets in the dataflow. In this paper, we present an improved algorithm for linear dataflows with time complexity \(O\left( m^3n^3\right) \).
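To make the store, delete, or transfer trade-off concrete, the following is a minimal sketch of an exhaustive search over storage strategies for a tiny linear dataflow. It is not the paper's algorithm, and the cost model is an assumption made only for illustration: the names `sizes`, `hours`, `freqs`, `storage`, `compute`, and `egress` are hypothetical, a deleted dataset is regenerated on every use from its nearest stored predecessor, and the original input is assumed to be available in every cloud at no cost.

```python
from itertools import product


def strategy_cost(strategy, sizes, hours, freqs, storage, compute, egress):
    """Cost per unit time of one strategy under the assumed toy model.

    strategy[i] is the index of the cloud that stores dataset i, or None
    if dataset i is deleted and regenerated each time it is used.
    """
    clouds = range(len(storage))
    total = 0.0
    for i, where in enumerate(strategy):
        if where is not None:
            # Stored dataset: pay the storage price of the chosen cloud.
            total += sizes[i] * storage[where]
            continue
        # Deleted dataset: locate the nearest stored predecessor, if any.
        p = i - 1
        while p >= 0 and strategy[p] is None:
            p -= 1
        # Computation needed to rebuild datasets p+1 .. i along the chain.
        work = sum(hours[p + 1:i + 1])
        if p < 0:
            # Assumption: the original input is free in every cloud.
            regen = work * min(compute)
        else:
            src = strategy[p]
            # Recompute where the predecessor lives, or pay to move it first.
            regen = min(
                work * compute[c] + (0.0 if c == src else sizes[p] * egress[src])
                for c in clouds
            )
        total += freqs[i] * regen
    return total


def brute_force(sizes, hours, freqs, storage, compute, egress):
    """Try every strategy: each dataset is deleted or stored in one cloud."""
    choices = [None] + list(range(len(storage)))
    args = (sizes, hours, freqs, storage, compute, egress)
    best = min(
        product(choices, repeat=len(sizes)),
        key=lambda s: strategy_cost(s, *args),
    )
    return strategy_cost(best, *args), best
```

This search enumerates all (m + 1)^n strategies, so it is only usable for very small inputs; that exponential growth is the reason polynomial-time algorithms such as the ones discussed in the abstract are of interest.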

Acknowledgement

The author thanks reviewers for their constructive suggestions.

Author information

Corresponding author

Correspondence to Zimao Li.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Wang, Y., Cheng, K., Li, Z. (2018). Improved Algorithm for Finding the Minimum Cost of Storing and Regenerating Datasets in Multiple Clouds. In: Wang, L., Zhu, D. (eds) Computing and Combinatorics. COCOON 2018. Lecture Notes in Computer Science, vol 10976. Springer, Cham. https://doi.org/10.1007/978-3-319-94776-1_35

  • DOI: https://doi.org/10.1007/978-3-319-94776-1_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94775-4

  • Online ISBN: 978-3-319-94776-1

  • eBook Packages: Computer Science, Computer Science (R0)
