A Power-Aware Autonomic Approach for Performance Management of Scientific Applications in a Data Center Environment

Handbook on Data Centers

Abstract

In recent years, the computer servers and data center facilities that provide high performance computing (HPC) for scientific applications have grown substantially in number and have become major consumers of electrical power. Supercomputers often run at peak performance to execute scientific applications efficiently, and therefore consume an enormous amount of power, which increases operational cost. Furthermore, higher power consumption raises the temperature of the physical HPC systems, which in turn translates into increased failure rates and decreased reliability. Slowing down these HPC systems by reducing the speed of individual processors degrades the execution performance of the scientific application because of the resulting variation in processing speed. Another cause of degraded execution performance is variation in the availability of computational resources, which arises when other applications executing on the same computing node use those resources in a space-shared manner. Variations in processor availability can lead to severe performance degradation through load imbalance and to violations of performance objectives, such as meeting a deadline, and may therefore result in high penalties in terms of revenue loss to the service providers. This chapter presents a utility-based, power-aware approach that uses a model-based control-theoretic framework for executing scientific applications. The approach and related simulations indicate that the performance and the power requirements of the system can be adjusted dynamically while maintaining the predefined quality of service (QoS) goals, in terms of execution deadline and power consumption of the HPC system, even in the presence of perturbations in computational resources. The approach is autonomic, performance directed, dynamically controlled, and independent of (does not interfere with) the execution of the application.
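
To make the trade-off described in the abstract concrete, the following is a minimal sketch of utility-based frequency selection. It is not the chapter's controller: the frequency levels, the cubic power model, the penalty weight, and the function names are all hypothetical, and the selection is a single greedy step rather than a model-based predictive search. It only illustrates how a cost that combines energy and deadline lateness can push the processor speed up when processor availability drops.

```python
# Hypothetical discrete frequency levels (GHz); real hardware exposes its own set.
FREQS_GHZ = (1.0, 1.5, 2.0, 2.5)


def power_watts(freq_ghz: float) -> float:
    """Toy power model (an assumption): static part plus a dynamic part ~ f^3."""
    return 10.0 + 4.0 * freq_ghz ** 3


def pick_frequency(remaining_work: float,
                   remaining_time: float,
                   availability: float,
                   miss_penalty: float = 1e6) -> float:
    """Pick the frequency with the lowest cost = energy + penalty * lateness.

    remaining_work : work left, in units completed per second at 1 GHz
    remaining_time : seconds left until the deadline
    availability   : fraction of the processor available to this application
    miss_penalty   : hypothetical weight on each second of lateness
    """
    if availability <= 0.0:
        raise ValueError("availability must be positive")

    best_freq, best_cost = FREQS_GHZ[0], float("inf")
    for freq in FREQS_GHZ:
        rate = freq * availability            # effective work rate
        finish_time = remaining_work / rate   # predicted time to completion
        energy = power_watts(freq) * finish_time
        lateness = max(0.0, finish_time - remaining_time)
        cost = energy + miss_penalty * lateness
        if cost < best_cost:
            best_freq, best_cost = freq, cost
    return best_freq


if __name__ == "__main__":
    # Full availability: the slowest level already meets the deadline.
    print(pick_frequency(100.0, 100.0, availability=1.0))
    # Half the processor is taken by another application: speed must rise.
    print(pick_frequency(100.0, 100.0, availability=0.5))
```

Under this toy power model the energy to finish a fixed amount of work grows with frequency, so the sketch settles on the slowest level that still meets the deadline and falls back to the fastest level when no level can meet it.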

Acknowledgment

The authors would like to thank the National Science Foundation (NSF) for its support of this work through the grant NSF IIP-1034897.

Author information

Corresponding author

Correspondence to Rajat Mehrotra.

Copyright information

© 2015 Springer Science+Business Media New York

About this chapter

Cite this chapter

Mehrotra, R., Banicescu, I., Srivastava, S., Abdelwahed, S. (2015). A Power-Aware Autonomic Approach for Performance Management of Scientific Applications in a Data Center Environment. In: Khan, S., Zomaya, A. (eds) Handbook on Data Centers. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2092-1_5

  • DOI: https://doi.org/10.1007/978-1-4939-2092-1_5

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-2091-4

  • Online ISBN: 978-1-4939-2092-1

  • eBook Packages: Computer Science, Computer Science (R0)