Loop Selection for Thread-Level Speculation

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4339)

Abstract

Thread-level speculation (TLS) allows potentially dependent threads to speculatively execute in parallel, thus making it easier for the compiler to extract parallel threads. However, the high cost associated with unbalanced load, failed speculation, and inter-thread value communication makes it difficult to obtain the desired performance unless the speculative threads are carefully chosen.

In this paper, we focus on extracting parallel threads from loops in general-purpose applications because loops, with their regular structure and significant coverage of execution time, are ideal candidates for parallel threads. General-purpose applications, however, usually contain a large number of nested loops whose parallel performance is unpredictable and whose behavior is dynamic, which makes it difficult to decide which set of loops should be parallelized to improve overall program performance. Our proposed loop selection algorithm addresses these difficulties. We have found that (i) with the aid of profiling information, compiler analyses can estimate the performance of parallel execution with reasonable accuracy, and that (ii) different invocations of a loop may behave differently, and exploiting this dynamic behavior can further improve performance. With a judicious choice of loops, we can improve the overall program performance of SPEC2000 integer benchmarks by as much as 20%.
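
To make the selection problem concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes that the loop nest forms a tree, that at most one loop on any nesting path may be parallelized, and that each loop carries a profile-based estimate of the cycles saved by parallelizing it. The names `Loop`, `saved_cycles`, and `select_loops` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Loop:
    name: str
    # Hypothetical profile-based estimate of cycles saved if this loop is
    # parallelized; a value <= 0 means speculation is expected not to pay off.
    saved_cycles: float
    children: List["Loop"] = field(default_factory=list)


def select_loops(loop: Loop) -> Tuple[float, List[str]]:
    """Return (total estimated savings, names of chosen loops) for a loop nest.

    Assumption: a chosen loop excludes every loop nested inside it, so each
    node either takes its own estimate or the best combined choice of its
    inner loops.
    """
    inner_total = 0.0
    inner_chosen: List[str] = []
    for child in loop.children:
        child_total, child_chosen = select_loops(child)
        inner_total += child_total
        inner_chosen.extend(child_chosen)

    # Parallelize this loop only if that beats the best inner selection.
    if loop.saved_cycles > inner_total:
        return loop.saved_cycles, [loop.name]
    return inner_total, inner_chosen


# Made-up estimates for a small nest: outer contains a and b; b contains c.
nest = Loop("outer", 100.0, [Loop("a", 70.0), Loop("b", 50.0, [Loop("c", 60.0)])])
total, chosen = select_loops(nest)
print(total, chosen)  # prints: 130.0 ['a', 'c']
```

In this made-up nest, parallelizing the two inner loops `a` and `c` is estimated to save more than parallelizing `outer` alone, so the sketch prefers them. Real programs also nest loops across procedure calls, so the nesting structure may not be a tree, and the sketch ignores the per-invocation behavior mentioned in the abstract.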

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, S., Dai, X., Yellajyosula, K.S., Zhai, A., Yew, PC. (2006). Loop Selection for Thread-Level Speculation. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_20

  • DOI: https://doi.org/10.1007/978-3-540-69330-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69329-1

  • Online ISBN: 978-3-540-69330-7

  • eBook Packages: Computer Science, Computer Science (R0)
