The Grid Backfilling: a Multi-Site Scheduling Architecture with Data Mining Prediction Techniques

Rodero, Ivan; Guim, Francesc; Corbalan, Julita; Goyeneche, A.

doi:10.1007/978-0-387-78446-5_10

In large Grids, such as the National Grid Service (NGS), and in other large distributed architectures, several different scheduling entities are involved. Although a global scheduling approach would achieve higher performance and could increase the utilization of the global system, in these scenarios independent schedulers usually make their own scheduling decisions. In this paper we show how coordinated scheduling among all the different centers, using data mining prediction techniques, can substantially improve the performance of the global distributed infrastructure and can give users uniform access to all the heterogeneous Grid resources. We present the Grid Backfilling meta-scheduling policy, which optimizes the global utilization of the system resources and substantially improves the response time of jobs. We also show how data mining techniques applied to historical information can provide suitable inputs for the Grid Backfilling meta-scheduling decisions.

Author information

Authors and Affiliations

Barcelona SuperComputing Center, 08034, Barcelona, Spain
Ivan Rodero, Francesc Guim & Julita Corbalan
University of Westminster, W1W 6UW, London, UK
A. Goyeneche

About this chapter

Cite this chapter

Rodero, I., Guim, F., Corbalan, J., Goyeneche, A. (2008). The Grid Backfilling: a Multi-Site Scheduling Architecture with Data Mining Prediction Techniques. In: Grid Middleware and Services. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78446-5_10

DOI: https://doi.org/10.1007/978-0-387-78446-5_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-78445-8
Online ISBN: 978-0-387-78446-5
eBook Packages: Computer Science, Computer Science (R0)
