Node Performance and Energy Analysis with the Sniper Multi-core Simulator

  • Conference paper
Tools for High Performance Computing 2013

Abstract

Two major trends in high-performance computing, namely larger numbers of cores and the growing size of on-chip cache memory, are creating significant challenges for evaluating the design space of future processor architectures. Fast and scalable simulations are therefore needed to allow sufficient exploration of large multi-core systems within a limited simulation time budget. By bringing together accurate high-abstraction analytical models with fast parallel simulation, architects can trade off accuracy against simulation speed to allow for longer application runs, covering a larger portion of the hardware design space. Sniper provides this balance, allowing long-running simulations to be modeled much faster than with detailed cycle-accurate simulation while still providing the detail necessary to observe core-uncore interactions across the entire system. With advanced per-function visualization and coupled power and energy simulations, the Sniper multi-core simulator provides a fast and accurate way both to understand and to optimize software for current and future hardware systems.

Acknowledgements

We thank Mathijs Rogiers for his invaluable work on the visualization features of Sniper and the anonymous reviewers for their valuable feedback. This work is supported by Intel and the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT). Additional support is provided by the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007–2013) / ERC Grant agreement no. 259295. Experiments were run on computing infrastructure at the ExaScience Lab, Leuven, Belgium; the Intel HPC Lab, Swindon, UK; and the VSC Flemish Supercomputer Center.

Author information

Corresponding author

Correspondence to Trevor E. Carlson.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Carlson, T.E., Heirman, W., Van Craeynest, K., Eeckhout, L. (2014). Node Performance and Energy Analysis with the Sniper Multi-core Simulator. In: Knüpfer, A., Gracia, J., Nagel, W., Resch, M. (eds) Tools for High Performance Computing 2013. Springer, Cham. https://doi.org/10.1007/978-3-319-08144-1_7
