Abstract
This work presents a GPU-based backtracking algorithm for permutation combinatorial problems built on the Integer-Vector-Matrix (IVM) data structure, which is dedicated to permutation combinatorial optimization problems. In this algorithm, load balancing is performed without CPU intervention, in a work stealing phase invoked after each node expansion phase. The proposed work stealing approach uses a virtual n-dimensional hypercube topology and a triggering mechanism to reduce the overhead incurred by dynamic load balancing. We have implemented this new algorithm to solve instances of the Asymmetric Travelling Salesman Problem by implicit enumeration, a scenario in which the cost of node evaluation is low compared to the overall search procedure. Experimental results show that the dynamically load-balanced IVM algorithm reaches speed-ups of up to 17\(\times\) over a serial implementation using a bitset data structure and up to 2\(\times\) over its GPU counterpart.
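To make the virtual hypercube topology concrete, the following is a minimal CPU-side sketch in Python, not the GPU implementation described above. It relies only on the standard property that, in an n-dimensional hypercube with 2^n nodes, the neighbor of node i along dimension d is i XOR 2^d. The function names (`hypercube_neighbors`, `steal_round`), the rule that an idle worker takes half of its partner's pending nodes, and the use of plain integer work counters in place of IVM structures are all assumptions made for illustration; the abstract does not specify them, and the triggering mechanism is not modeled.

```python
def hypercube_neighbors(node: int, dims: int) -> list[int]:
    """Ids of the `dims` neighbors of `node` in a hypercube with 2**dims nodes.

    Flipping bit d of the node id gives the neighbor along dimension d.
    """
    if not 0 <= node < (1 << dims):
        raise ValueError("node id out of range for this hypercube")
    return [node ^ (1 << d) for d in range(dims)]


def steal_round(work: list[int], dim: int) -> list[int]:
    """One hypothetical stealing round along dimension `dim`.

    `work[i]` is the number of pending nodes held by worker i (an assumed
    stand-in for an IVM work interval). Workers are paired with their
    neighbor along `dim`; an idle worker takes half of its partner's work.
    Pairs are disjoint, so every transfer in a round is independent.
    """
    bit = 1 << dim
    if bit >= len(work):
        raise ValueError("dimension exceeds hypercube size")
    updated = list(work)
    for thief in range(len(work)):
        victim = thief ^ bit
        # Assumed policy: steal only when idle and the victim can be split.
        if work[thief] == 0 and work[victim] >= 2:
            stolen = work[victim] // 2
            updated[thief] += stolen
            updated[victim] -= stolen
    return updated


if __name__ == "__main__":
    dims = 3
    work = [8, 0, 0, 0, 0, 0, 0, 0]  # all work starts on worker 0
    for d in range(dims):
        work = steal_round(work, d)
        print(f"after dimension {d}: {work}")
```

Under these assumptions, work that starts on a single worker spreads to all 2^n workers after one round per dimension, that is, in n rounds, which illustrates why a hypercube is an attractive virtual topology for this kind of load balancing.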
T.C. Pessoa was partially supported by the Institutional Program of Overseas Sandwich Doctorate (PDSE-CAPES) grant 3376/2015-00.
© 2016 Springer International Publishing AG
Cite this paper
Pessoa, T.C., Gmys, J., Melab, N., de Carvalho Junior, F.H., Tuyttens, D. (2016). A GPU-Based Backtracking Algorithm for Permutation Combinatorial Problems. In: Carretero, J., Garcia-Blas, J., Ko, R., Mueller, P., Nakano, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science(), vol 10048. Springer, Cham. https://doi.org/10.1007/978-3-319-49583-5_24
DOI: https://doi.org/10.1007/978-3-319-49583-5_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49582-8
Online ISBN: 978-3-319-49583-5
eBook Packages: Computer Science (R0)