Reducing Misspeculation Penalty in Trace-Level Speculative Multithreaded Architectures

Molina, Carlos; Tubella, Jordi; González, Antonio

doi:10.1007/978-3-540-77704-5_4

Carlos Molina¹,
Jordi Tubella² &
Antonio González²,³

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4759)

Abstract

Trace-Level Speculative Multithreaded Processors exploit trace-level speculation by means of two threads working cooperatively. One thread, called the speculative thread, executes instructions ahead of the other by speculating on the result of several traces. The other thread executes the speculated traces and verifies the speculation made by the first thread. Speculated traces are validated by verifying their live-output values. Every time a trace misspeculation is detected, a thread synchronization is fired. This recovery action involves flushing the pipeline and reverting to a safe point in the program, which incurs a performance penalty. This paper proposes a new thread synchronization scheme based on the observation that a significant number of instructions are control- and data-independent of the mispredicted instruction. This scheme significantly increases the performance potential of the architecture at lower cost. Our experimental results show that the mechanism cuts the number of executed instructions by 8% and achieves an average speed-up of almost 9% for a collection of SPEC2000 benchmarks.
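
The two ideas in the abstract, validating a speculated trace through its live-output values and recovering only the work that depends on a wrong value, can be illustrated with a small software model. This is a toy sketch under stated assumptions, not the hardware mechanism of the paper: the names `Instr`, `validate_trace`, and `dependent_instructions` are hypothetical, registers are plain strings, and only data dependences are tracked (control dependences are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: one destination, any number of sources."""
    dest: str
    srcs: tuple


def validate_trace(predicted: dict, actual: dict) -> set:
    """Return the live-output locations whose predicted value was wrong."""
    return {loc for loc, val in predicted.items() if actual.get(loc) != val}


def dependent_instructions(window: list, bad_locations: set) -> list:
    """Indices of instructions that (transitively) read a mispredicted value.

    Assumption: a full flush would re-execute every instruction in the
    window, whereas a selective recovery only needs the indices returned here.
    """
    tainted = set(bad_locations)
    dependent = []
    for i, ins in enumerate(window):
        if any(src in tainted for src in ins.srcs):
            dependent.append(i)
            tainted.add(ins.dest)      # result is wrong as well
        else:
            tainted.discard(ins.dest)  # overwritten with a correct value
    return dependent


# Live outputs predicted by the speculative thread vs. the verified values.
predicted = {"r1": 10, "r2": 20}
actual = {"r1": 10, "r2": 99}
bad = validate_trace(predicted, actual)

# Instructions executed after the speculated trace.
window = [
    Instr("r3", ("r1",)),
    Instr("r4", ("r2",)),
    Instr("r5", ("r4", "r3")),
    Instr("r6", ("r1",)),
]
redo = dependent_instructions(window, bad)
print(f"mispredicted: {sorted(bad)}, re-execute {len(redo)} of {len(window)}")
```

In this example only `r2` is mispredicted, so two of the four later instructions depend on it; the other two are the kind of independent work that a full pipeline flush would discard unnecessarily.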

Author information

Authors and Affiliations

Dept. Eng. Informàtica i Matemàtiques, Universitat Rovira i Virgili, Tarragona, Spain
Carlos Molina
Dept. d’Arquitectura de Computadors, Universitat Politècnica de Catalunya, Barcelona, Spain
Jordi Tubella & Antonio González
Intel Barcelona Research Center, Intel Labs-UPC, Barcelona, Spain
Antonio González

Editor information

Jesús Labarta, Kazuki Joe, Toshinori Sato

About this paper

Cite this paper

Molina, C., Tubella, J., González, A. (2008). Reducing Misspeculation Penalty in Trace-Level Speculative Multithreaded Architectures. In: Labarta, J., Joe, K., Sato, T. (eds) High-Performance Computing. ISHPC ALPS 2005 2006. Lecture Notes in Computer Science, vol 4759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77704-5_4

DOI: https://doi.org/10.1007/978-3-540-77704-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77703-8
Online ISBN: 978-3-540-77704-5
eBook Packages: Computer Science, Computer Science (R0)