The Grid Backfilling: a Multi-Site Scheduling Architecture with Data Mining Prediction Techniques

Grid Middleware and Services

In large Grids, such as the National Grid Service (NGS), and in other large distributed architectures, several scheduling entities are involved. Although a global scheduling approach would achieve higher performance and could increase the utilization of the global system, in these scenarios independent schedulers usually make their own scheduling decisions. In this paper we present how coordinated scheduling among the different centers, using data mining prediction techniques, can substantially improve the performance of the global distributed infrastructure and give users uniform access to all the heterogeneous Grid resources. We present the Grid Backfilling meta-scheduling policy, which optimizes the global utilization of system resources and substantially improves the response time of jobs. We also show how data mining techniques applied to historical information can provide suitable inputs for the Grid Backfilling meta-scheduling decisions.
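The abstract does not give the policy's details, so the following is only a minimal illustrative sketch of the general idea of prediction-driven site selection, not the chapter's Grid Backfilling algorithm. It assumes a deliberately simple predictor (the mean runtime of past jobs from the same user and application, standing in for a real data mining model) and a toy site model; the names `Site`, `predict_runtime`, `earliest_start`, and `select_site` are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Site:
    """Toy model of one computing center (hypothetical, for illustration)."""
    name: str
    free_cpus: int
    # (predicted_finish_time, cpus) for each job currently running at the site
    running: list = field(default_factory=list)


def predict_runtime(history, user, app, default=3600.0):
    """Predict a runtime as the mean of past runtimes for the same user and app.

    `history` is a list of (user, app, runtime_seconds) tuples. This simple
    average is an assumption standing in for a trained prediction model.
    """
    past = [rt for (u, a, rt) in history if u == user and a == app]
    return mean(past) if past else default


def earliest_start(site, cpus, now=0.0):
    """Earliest time at which `cpus` processors are free at `site`,
    assuming running jobs end at their predicted finish times."""
    free = site.free_cpus
    if free >= cpus:
        return now
    for finish, n in sorted(site.running):
        free += n
        if free >= cpus:
            return max(now, finish)
    return float("inf")  # the site can never host this job


def select_site(sites, cpus, runtime, now=0.0):
    """Pick the site with the smallest predicted completion time."""
    return min(sites, key=lambda s: earliest_start(s, cpus, now) + runtime)


history = [("alice", "blast", 100.0), ("alice", "blast", 200.0),
           ("bob", "blast", 50.0)]
sites = [Site("A", free_cpus=2, running=[(300.0, 4)]),
         Site("B", free_cpus=0, running=[(100.0, 8)])]

runtime = predict_runtime(history, "alice", "blast")
best = select_site(sites, cpus=4, runtime=runtime)
print(best.name, runtime)
```

In this toy setup, site B is chosen even though it has no free processors right now, because its running job is predicted to finish sooner than the one blocking site A. A real meta-scheduler would also need to account for queued jobs, reservations, and prediction error, which this sketch ignores.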

Copyright information

© 2008 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Rodero, I., Guim, F., Corbalan, J., Goyeneche, A. (2008). The Grid Backfilling: a Multi-Site Scheduling Architecture with Data Mining Prediction Techniques. In: Grid Middleware and Services. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78446-5_10

  • DOI: https://doi.org/10.1007/978-0-387-78446-5_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-78445-8

  • Online ISBN: 978-0-387-78446-5

  • eBook Packages: Computer Science, Computer Science (R0)
