Improving Server Performance on Multi-cores via Selective Off-Loading of OS Functionality

  • Conference paper
Computer Architecture (ISCA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6161)

Abstract

Modern and future server-class processors will incorporate many cores. Some studies have suggested that it may be worthwhile to dedicate some of these cores to specific tasks such as operating system execution. OS off-loading has two main benefits: improved performance due to better cache utilization and improved power efficiency due to smarter use of heterogeneous cores. However, OS off-loading is a complex process that involves balancing the overheads of off-loading against the potential benefit, which is unknown when the off-loading decision is made. In prior work, OS off-loading has been implemented by first profiling system call behavior and then manually instrumenting some OS routines (out of hundreds) to support off-loading. We propose a hardware-based mechanism that helps automate the off-load decision-making process and provides high-quality dynamic decisions via performance feedback. Our mechanism dynamically estimates the off-load requirements of the application and relies on a run-length predictor for the upcoming OS system call invocation. The resulting hardware-based off-loading policy yields a throughput improvement of up to 18% over a baseline without off-loading, 13% over a static software-based policy, and 23% over a dynamic software-based policy.
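The abstract does not describe the internal structure of the run-length predictor or the feedback loop, so the following is only a minimal software sketch of the general idea, not the paper's hardware design. It assumes a last-value table indexed by a hypothetical key (for example, a system call number), an instruction-count threshold above which off-loading is taken to be worthwhile, and a caller-supplied throughput measurement. All names (`RunLengthPredictor`, `should_offload`, `pick_threshold`) are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class RunLengthPredictor:
    """Last-value predictor of OS run length (hypothetical structure)."""
    default_length: int = 0                      # assumed: unseen keys count as short runs
    table: dict = field(default_factory=dict)    # key -> last observed run length

    def predict(self, key: int) -> int:
        # Return the most recently observed length for this key, if any.
        return self.table.get(key, self.default_length)

    def update(self, key: int, observed_length: int) -> None:
        # Record the actual run length once the OS invocation completes.
        self.table[key] = observed_length


def should_offload(predicted_length: int, threshold: int) -> bool:
    """Off-load only if the predicted run is long enough to amortize the overhead."""
    return predicted_length >= threshold


def pick_threshold(candidates, measure_throughput):
    """Performance feedback: try each candidate threshold and keep the best one.

    `measure_throughput` is a caller-supplied function (an assumption of this
    sketch) that returns the throughput observed while running with a threshold.
    """
    return max(candidates, key=measure_throughput)


# Usage: one key has been seen with a long run, another has never been seen.
predictor = RunLengthPredictor()
predictor.update(key=3, observed_length=5000)
print(should_offload(predictor.predict(3), threshold=1000))  # True
print(should_offload(predictor.predict(7), threshold=1000))  # False
```

The split between prediction and threshold selection mirrors the two parts the abstract names: a per-invocation run-length estimate, and a dynamically tuned off-load requirement driven by measured performance.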

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nellans, D., Sudan, K., Brunvand, E., Balasubramonian, R. (2011). Improving Server Performance on Multi-cores via Selective Off-Loading of OS Functionality. In: Varbanescu, A.L., Molnos, A., van Nieuwpoort, R. (eds) Computer Architecture. ISCA 2010. Lecture Notes in Computer Science, vol 6161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24322-6_23

  • DOI: https://doi.org/10.1007/978-3-642-24322-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24321-9

  • Online ISBN: 978-3-642-24322-6

  • eBook Packages: Computer Science, Computer Science (R0)
