ÆminiumGPU: An Intelligent Framework for GPU Programming

Facing the Multicore-Challenge III

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7686)

Abstract

Because GPUs offer immense computational power, they are increasingly used to run data-intensive general-purpose programs. Since the memory and processor architectures of CPUs and GPUs differ substantially, programs designed for each platform also differ and often rely on distinct sets of algorithms and data structures. Choosing between the CPU and the GPU for a given program is not easy, because the best choice depends on the GPU hardware, the amount of data, and several other performance factors.

ÆminiumGPU is a new data-parallel framework for developing and running parallel programs on CPUs and GPUs. ÆminiumGPU programs are written in Java using Map-Reduce primitives and are compiled into hybrid executables that can run on either platform. The decision of which platform executes a program is thus delayed until run-time and made automatically by the system using Machine-Learning techniques.

Our tests show that ÆminiumGPU achieves speedups of up to 65x and that the platform selection algorithm chooses the best platform for executing a program with an average accuracy above 92%.
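The idea of deferring the platform choice to run-time can be illustrated with a small Java sketch. This is not the ÆminiumGPU API, which the abstract does not show: the class `PlatformSelectionSketch`, the methods `choosePlatform` and `mapReduce`, and the size threshold are all hypothetical names and values chosen for illustration. A single hand-picked threshold stands in for the learned classifier, and the "GPU" branch is a stub that runs on the CPU, so the sketch only shows where such a decision would sit around a map-reduce call.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.DoubleStream;

public class PlatformSelectionSketch {

    // The two execution targets named in the abstract.
    enum Platform { CPU, GPU }

    // ASSUMPTION: a fixed size threshold stands in for the machine-learning
    // model; the real system would use a trained classifier and more features.
    static final int SIZE_THRESHOLD = 100_000;

    // Decide at run-time where the operation should execute.
    static Platform choosePlatform(int dataSize, boolean gpuAvailable) {
        return (gpuAvailable && dataSize >= SIZE_THRESHOLD) ? Platform.GPU : Platform.CPU;
    }

    // A map-reduce over doubles whose execution platform is chosen per call.
    // The reducer must be associative and `identity` must be its identity.
    static double mapReduce(double[] data,
                            DoubleUnaryOperator mapper,
                            double identity,
                            DoubleBinaryOperator reducer,
                            boolean gpuAvailable) {
        switch (choosePlatform(data.length, gpuAvailable)) {
            case GPU:
                return runOnGpuStub(data, mapper, identity, reducer);
            default:
                return runOnCpu(data, mapper, identity, reducer);
        }
    }

    // CPU path: a parallel stream spreads the work over the available cores.
    static double runOnCpu(double[] data, DoubleUnaryOperator mapper,
                           double identity, DoubleBinaryOperator reducer) {
        return DoubleStream.of(data).parallel().map(mapper).reduce(identity, reducer);
    }

    // HYPOTHETICAL placeholder: a real hybrid executable would launch a GPU
    // kernel here. This stub just evaluates the same pipeline sequentially.
    static double runOnGpuStub(double[] data, DoubleUnaryOperator mapper,
                               double identity, DoubleBinaryOperator reducer) {
        return DoubleStream.of(data).map(mapper).reduce(identity, reducer);
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4};
        // Sum of squares: 1 + 4 + 9 + 16
        double sumOfSquares = mapReduce(data, x -> x * x, 0.0, Double::sum, false);
        System.out.println(sumOfSquares); // prints 30.0
    }
}
```

Keeping the decision inside `mapReduce` means the calling code is identical for both platforms, which is the property the abstract attributes to hybrid executables.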



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fonseca, A., Cabral, B. (2013). ÆminiumGPU: An Intelligent Framework for GPU Programming. In: Keller, R., Kramer, D., Weiss, JP. (eds) Facing the Multicore-Challenge III. Lecture Notes in Computer Science, vol 7686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35893-7_9

  • DOI: https://doi.org/10.1007/978-3-642-35893-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35892-0

  • Online ISBN: 978-3-642-35893-7

  • eBook Packages: Computer Science, Computer Science (R0)
