Abstract
Clusters connected by fast interconnection networks have been applied successfully in many research fields to run large applications in parallel. Alongside MPI, the conventional programming model, OpenMP is increasingly used to parallelize sequential codes. Because of its simple programming interface and semantics close to those of traditional programming languages, OpenMP is especially suitable for users who are not parallel-programming specialists.
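As a minimal illustration of the programming interface mentioned above (a sketch, not code from the paper), a sequential C loop can be parallelized with a single OpenMP directive. The function name `dot_product` and its parameters are assumptions chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dot product of two vectors of length n.
 * The pragma asks the OpenMP runtime to divide the loop iterations
 * among threads and to combine the per-thread partial sums into `sum`.
 * A compiler without OpenMP support ignores the pragma and runs the
 * loop sequentially; the result is the same up to floating-point
 * rounding. */
double dot_product(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);

    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The only change from the sequential version is the directive line, which is what makes the model approachable for non-specialists.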
To exploit scalable parallel computation, we have built a PC cluster using InfiniBand, a high-performance, de facto standard interconnection technology. To give users a simple parallel programming model, we have implemented an OpenMP execution environment on top of this cluster. Because shared data requires a global memory abstraction, we first built a software distributed shared memory (DSM) that implements a variant of the Home-based Lazy Release Consistency protocol. We then modified an existing OpenMP source-to-source compiler to map shared data onto this DSM and to handle process/thread activities and task distribution. Experimental results for a set of OpenMP applications show a speedup of up to 5.22 on systems with 6 processor nodes.
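To put the reported speedup in context, it can be converted to parallel efficiency, defined in the usual way as speedup divided by the number of nodes. This definition is standard but is not quoted from the paper, and the helper name `parallel_efficiency` is an assumption made for this sketch.

```c
#include <assert.h>

/* Parallel efficiency = speedup / number of nodes.
 * A value of 1.0 would correspond to ideal linear speedup. */
double parallel_efficiency(double speedup, int nodes)
{
    assert(nodes > 0);
    return speedup / (double)nodes;
}
```

For the best case reported above, `parallel_efficiency(5.22, 6)` evaluates to 0.87, that is, about 87% of ideal linear speedup on 6 nodes.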
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tao, J., Karl, W., Trinitis, C. (2008). Implementing an OpenMP Execution Environment on InfiniBand Clusters. In: Mueller, M.S., Chapman, B.M., de Supinski, B.R., Malony, A.D., Voss, M. (eds) OpenMP Shared Memory Parallel Programming. IWOMP 2005. Lecture Notes in Computer Science, vol 4315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68555-5_6
DOI: https://doi.org/10.1007/978-3-540-68555-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68554-8
Online ISBN: 978-3-540-68555-5
eBook Packages: Computer Science (R0)