Reducing Misspeculation Penalty in Trace-Level Speculative Multithreaded Architectures

  • Conference paper
High-Performance Computing (ISHPC 2005, ALPS 2006)

Abstract

Trace-Level Speculative Multithreaded Processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating on the result of several traces. The other thread executes the speculated traces and verifies the speculation made by the first thread. Speculated traces are validated by verifying their live-output values. Every time a trace misspeculation is detected, a thread synchronization is triggered. This recovery action involves flushing the pipeline and reverting to a safe point in the program, which incurs a performance penalty. This paper proposes a new thread synchronization scheme based on the observation that a significant number of instructions are control- and data-independent of the mispredicted instruction. This scheme significantly increases the performance potential of the architecture at a lower cost. Our experimental results show that the mechanism cuts the number of executed instructions by 8% and achieves an average speed-up of almost 9% for a collection of SPEC2000 benchmarks.
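
The idea in the abstract can be illustrated with a small software model. The sketch below is only a toy under stated assumptions, not the hardware mechanism proposed in the paper: it assumes traces expose their live outputs as register/value pairs, it treats any instruction after a dependent branch as control-dependent (a deliberately conservative choice), and all names (`Instr`, `mispredicted_live_outputs`, `split_after_misspeculation`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: dest <- f(srcs). Branches use dest=""."""
    dest: str
    srcs: Tuple[str, ...]
    is_branch: bool = False


def mispredicted_live_outputs(predicted: Dict[str, int],
                              actual: Dict[str, int]) -> Set[str]:
    """Return the live-output registers whose predicted value was wrong."""
    return {reg for reg, value in predicted.items() if actual.get(reg) != value}


def split_after_misspeculation(instrs: List[Instr],
                               bad_regs: Set[str]) -> Tuple[List[int], List[int]]:
    """Partition instructions executed after the trace into (keep, redo).

    An instruction must be redone if it reads a register tainted by a
    mispredicted live output (data dependence), or if it follows a branch
    that itself had to be redone (assumed control dependence).
    """
    tainted = set(bad_regs)
    keep: List[int] = []
    redo: List[int] = []
    squash_rest = False
    for i, ins in enumerate(instrs):
        if squash_rest or any(src in tainted for src in ins.srcs):
            redo.append(i)
            if ins.dest:
                tainted.add(ins.dest)      # wrong value propagates
            if ins.is_branch:
                squash_rest = True         # later path may be wrong
        else:
            keep.append(i)
            if ins.dest:
                tainted.discard(ins.dest)  # overwritten with a correct value
    return keep, redo


# Example: the trace predicted r1 correctly and r2 incorrectly.
bad = mispredicted_live_outputs({"r1": 10, "r2": 7}, {"r1": 10, "r2": 8})
program = [
    Instr("r3", ("r1",)),        # independent of r2
    Instr("r4", ("r2",)),        # reads the mispredicted value
    Instr("r5", ("r4", "r3")),   # transitively dependent
    Instr("r2", ("r1",)),        # overwrites r2 with a correct value
    Instr("r6", ("r2",)),        # reads the corrected r2
]
keep, redo = split_after_misspeculation(program, bad)
print(keep, redo)
```

In this toy model a full flush would redo all five instructions, whereas the selective split redoes only the two that depend on the wrong value. Real hardware would also have to track memory dependences and precise control-flow reconvergence, which this sketch omits.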


Editor information

Jesús Labarta, Kazuki Joe, Toshinori Sato

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Molina, C., Tubella, J., González, A. (2008). Reducing Misspeculation Penalty in Trace-Level Speculative Multithreaded Architectures. In: Labarta, J., Joe, K., Sato, T. (eds) High-Performance Computing. ISHPC ALPS 2005 2006. Lecture Notes in Computer Science, vol 4759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77704-5_4

  • DOI: https://doi.org/10.1007/978-3-540-77704-5_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77703-8

  • Online ISBN: 978-3-540-77704-5

  • eBook Packages: Computer Science, Computer Science (R0)
