Optimizing Local Memory Allocation and Assignment through a Decoupled Approach

Diouf, Boubacar; Ozturk, Ozcan; Cohen, Albert

doi:10.1007/978-3-642-13374-9_29

Boubacar Diouf¹⁸,
Ozcan Ozturk¹⁹ &
Albert Cohen¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5898))

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

822 Accesses
2 Citations

Abstract

Software-controlled local memories (LMs) are widely used to provide fast, scalable, power efficient and predictable access to critical data. While many studies addressed LM management, keeping hot data in the LM continues to cause major headache. This paper revisits LM management of arrays in light of recent progresses in register allocation, supporting multiple live-range splitting schemes through a generic integer linear program. These schemes differ in the grain of decision points. The model can also be extended to address fragmentation, assigning live ranges to precise offsets. We show that the links between LM management and register allocation have been underexploited, leaving much fundamental questions open and effective applications to be explored.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Fabri, J.: Automatic storage optimization. In: ACM Symp. on Compiler Construction, pp. 83–91 (1979)
Google Scholar
Appel, A.W., George, L.: Optimal spilling for CISC machines with few registers. In: PLDI 2001, Snowbird, Utah, USA, June 2001, pp. 243–253 (2001)
Google Scholar
Bouchez, F., Darte, A., Guillon, C., Rastello, F.: Register allocation: What does the NP-completeness proof of Chaitin et al. really prove? In: WDDD 2006, Boston, MA (2006)
Google Scholar
Hack, S., Grund, D., Goos, G.: Register allocation for programs in SSA-form. In: Mycroft, A., Zeller, A. (eds.) CC 2006. LNCS, vol. 3923, pp. 247–262. Springer, Heidelberg (2006)
Chapter Google Scholar
Bouchez, F., Darte, A., Guillon, C., Rastello, F.: Register allocation: What does the NP-completeness proof of Chaitin et al. really prove? or revisiting register allocation: Why and how. In: Almási, G.S., Caşcaval, C., Wu, P. (eds.) LCPC 2006. LNCS, vol. 4382, pp. 283–298. Springer, Heidelberg (2007)
Chapter Google Scholar
Quintáo Pereira, F.M., Palsberg, J.: Register allocation by puzzle solving. SIGPLAN Not. 43(6), 216–226 (2008)
Article Google Scholar
ARM: Document No. ARM DDI 0084D, ARM Ltd. ARM7TDMI-S data sheet (1998)
Google Scholar
Motorola: M-CORE – MMC 2001 reference manual, Motorola Corporation (1998)
Google Scholar
Instruments, T.: TMS370Cx7x 8-bit microcontroller, Texas Instruments (1997)
Google Scholar
NVIDIA: NVIDIA unified architecture GeForce 8800 GT (2008)
Google Scholar
Burns, M., Prier, G., Mirkovic, J., Reiher, P.: Implementing address assurance in the Intel IXP (2003)
Google Scholar
Kahle, J.A., Day, M.N., Hofstee, H.P., Johns, C.R., Maeurer, T.R., Shippy, D.: Introduction to the Cell multiprocessor. IBM Journal of Research and Development 49(4/5) (2005)
Google Scholar
Bouchez, F., Darte, A., Rastello, F.: On the complexity of register coalescing. In: CGO 2007 (2007)
Google Scholar
Bouchez, F., Darte, A., Rastello, F.: Advanced conservative and optimistic register coalescing. In: CASES 2008, pp. 147–156 (2008)
Google Scholar
Boissinot, B., Darte, A., de Dinechin, B.D., Guillon, C., Rastello, F.: Revisiting out-of-SSA translation for correctness, code quality and efficiency. In: CGO 2009, pp. 114–125 (2009)
Google Scholar
Udayakumaran, S., Barua, R.: Compiler-decided dynamic memory allocation for scratch-pad based embedded systems. In: CASES 2003, pp. 276–286 (2003)
Google Scholar
Udayakumaran, S., Dominguez, A., Barua, R.: Dynamic allocation for scratch-pad memory using compile-time decisions. ACM Trans. Embed. Comput. Syst. 5(2), 472–511 (2006)
Article Google Scholar
Kandemir, M., Ramanujam, J., Irwin, J., Vijaykrishnan, N., Kadayif, I., Parikh, A.: Dynamic management of scratch-pad memory space. In: DAC 2001, pp. 690–695 (2001)
Google Scholar
Li, L., Gao, L., Xue, J.: Memory coloring: A compiler approach for scratchpad memory management. In: PACT 2005, pp. 329–338 (2005)
Google Scholar
Lee, C.G.: UTDSP benchmarks (1998)
Google Scholar
Wolfe, M.J.: High Performance Compilers for Parallel Computing. Addison-Wesley Longman Publishing Co., Inc., Boston (1995)
Google Scholar
Dominguez, A., Nguyen, N., Barua, R.K.: Recursive function data allocation to scratch-pad memory. In: CASES 2007, pp. 65–74 (2007)
Google Scholar
Issenin, I., Brockmeyer, E., Miranda, M., Dutt, N.: DRDU: A data reuse analysis technique for efficient scratch-pad memory management. ACM Trans. Des. Autom. Electron. Syst. 12(2), 15 (2007)
Article Google Scholar
Avissar, O., Barua, R., Stewart, D.: An optimal memory allocation scheme for scratch-pad-based embedded systems. ACM Trans. Embed. Comput. Syst. 1(1), 6–26 (2002)
Article Google Scholar
De La Luz, V., Kandemir, M.: Array regrouping and its use in compiling data-intensive embedded applications. IEEE Trans. Comput. 53(1), 1–19 (2004)
Article Google Scholar
Kodukula, I., Ahmed, N., Pingali, K.: Data-centric multi-level blocking. In: PLDI 1997, Las Vegas, Nevada, June 1997, pp. 346–357 (1997)
Google Scholar
Knobe, K., Sarkar, V.: Array SSA form and its use in parallelization. In: ACM Conf. on Principles of Programming Languages (PoPL 2008), San Diego, CA, January 1998, pp. 107–120 (1998)
Google Scholar
Rus, S., He, G., Alias, C., Rauchwerger, L.: Region array SSA. In: PACT 2006, pp. 43–52 (2006)
Google Scholar
Belady, L.A.: A study of replacement algorithms for virtual storage computers. In: 9th Annual ACM-SIAM Symposium on Discrete Algorithms,
Google Scholar
Berry, M., et al.: The perfect club benchmarks: Effective performance evaluation of supercomputers. International Journal of Supercomputer Applications 3, 5–40 (1988)
Article Google Scholar

Download references

Author information

Authors and Affiliations

INRIA Saclay and Paris-Sud 11 University,
Boubacar Diouf & Albert Cohen
Bilkent University,
Ozcan Ozturk

Authors

Boubacar Diouf
View author publications
You can also search for this author in PubMed Google Scholar
Ozcan Ozturk
View author publications
You can also search for this author in PubMed Google Scholar
Albert Cohen
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, University of Delaware, 19716, Newark, DE, USA
Guang R. Gao & Xiaoming Li &
Department of Computer and Information Sciences, University of Delaware, 19716, Newark, DE, USA
Lori L. Pollock & John Cavazos &

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Diouf, B., Ozturk, O., Cohen, A. (2010). Optimizing Local Memory Allocation and Assignment through a Decoupled Approach. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_29

Download citation

DOI: https://doi.org/10.1007/978-3-642-13374-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13373-2
Online ISBN: 978-3-642-13374-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics