Skip to main content

Piecewise execution of nested data-parallel programs

  • Conference paper
  • First Online:
Languages and Compilers for Parallel Computing (LCPC 1995)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1033))

Abstract

The technique of flattening nested data parallelism combines all the independent operations in nested apply-to-all constructs and generates large amounts of potential parallelism for both regular and irregular expressions. However, the resulting data-parallel programs can have enormous memory requirements, limiting their utility. In this paper, we present piecewise execution, an automatic method of partially serializing data-parallel programs so that they achieve maximum parallelism within storage limitations. By computing large intermediate sequences in pieces, our approach requires asymptotically less memory to perform the same amount of work. By using characteristics of the underlying parallel architecture to drive the computation size, we retain effective use of a parallel machine at each step. This dramatically expands the class of nested data-parallel programs that can be executed using the flattening technique. With the addition of piecewise I/O operations, these techniques can be applied to generate out-of-core execution on large datasets.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. P. Abrams. An APL Machine. PhD thesis, Stanford University, 1970.

    Google Scholar 

  2. J. Backus. Can programming be liberated from the von Neumann style? A functional style and its algebra of programs. Commun. ACM, 21(8):613–41, Aug. 1978.

    Article  Google Scholar 

  3. G. Blelloch and G. Narlikar. A comparison of two n-body algorithms. In Proceedings of DIMACS Parallel Implementation Challenge Workshop III, Oct. 1994.

    Google Scholar 

  4. G. Blelloch and G. Sabot. Compiling collection-oriented languages onto massively parallel computers. Journal of Parallel and Distributed Computing, 8(2), Feb. 1990.

    Google Scholar 

  5. G. E. Blelloch. Nesl: A nested data-parallel language. Technical Report CMU-CS-92-129, Carnegie Mellon University, 1992.

    Google Scholar 

  6. G. E. Blelloch, S. Chatterjee, J. Hardwick, M. Reid-Miller, J. Sipelstein, and M. Zagha. Cvl: a c vector library manual, version 2. Technical Report CMU-CS-93-114, Carnegie Mellon University, 1993.

    Google Scholar 

  7. T. Budd. An APL Compiler. Springer-Verlag, 1988.

    Google Scholar 

  8. S. Chatterjee. Compiling nested data-parallel programs for shared-memory multiprocessors. ACM Trans. Prog. Lang. Syst., 15(3):400–462, July 1993.

    Article  Google Scholar 

  9. H. P. F. Forum. High Performance Fortran language specification. Scientific Programming, 2(1–2): 1–170, 1993.

    Google Scholar 

  10. L. J. Guibas and D. K. Wyatt. Compilation and delayed evaluation in APL. In Conf. Record of the Fifth Annual ACM Symp. on Princ. of Prog. Lang. (Tucson, Arizona), pages 1–8. ACM, Jan. 1978.

    Google Scholar 

  11. P. Hatcher and M. Quinn. Data-Parallel Programming on MIMD Computers. MIT Press, 1991.

    Google Scholar 

  12. K. Iverson. A Programming Language. Wiley, 1962.

    Google Scholar 

  13. G. Levin and L. Nyland. An introduction to Proteus, version 0.9. Technical report, University of North Carolina at Chapel Hill, Aug. 1993.

    Google Scholar 

  14. D. W. Palmer. Dpl: Data-parallel library manual. Technical Report UNC-CS-93-064, University of North Carolina at Chapel Hill, Nov. 1993.

    Google Scholar 

  15. D. W. Palmer, J. F. Prins, and S. Westfold. Work-efficient nested data-parallelism. In Proc. Fifth Symp. on the Frontiers of Massively Parallel Processing (Frontiers 95). IEEE., 1995.

    Google Scholar 

  16. K. Pingali and Arvind. Efficient demand-driven evaluation. Part 1. ACM Trans. Prog. Lang. Syst., 7(2):311–33, Apr. 1985.

    Article  Google Scholar 

  17. K. Pingali and Arvind. Efficient demand-driven evaluation. Part 2. ACM Trans. Prog. Lang. Syst., 8(1):109–39, Jan. 1986.

    Article  Google Scholar 

  18. J. F. Prins and D. W. Palmer. Transforming high-level data-parallel programs into vector operations. In Proc. 4th PPOPP. (San Diego, CA, 19–22 May 1993). ACM., 1993. Published in SIGPLAN Notices, 28(7): 119–28.

    Google Scholar 

  19. J. Schwartz. Set theory as a language for program specification and programming. Technical report, Computer Science Department, Courant Institute of Mathematical Sciences, New York University, 1970.

    Google Scholar 

  20. R. Sethi. Complete register allocation problems. SIAM Journal of Computing, 4(3), 1975.

    Google Scholar 

  21. R. C. Waters. Automatic transformation of series expressions into loops. ACM Trans. Prog. Lang. Syst., 13(1):52–98, Jan. 1991.

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Chua-Huang Huang Ponnuswamy Sadayappan Utpal Banerjee David Gelernter Alex Nicolau David Padua

Rights and permissions

Reprints and permissions

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Palmer, D.W., Prins, J.F., Chatterjee, S., Faith, R.E. (1996). Piecewise execution of nested data-parallel programs. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014210

Download citation

  • DOI: https://doi.org/10.1007/BFb0014210

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60765-6

  • Online ISBN: 978-3-540-49446-1

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics