An Augmented k-ary Tree Multiprocessor with Real-Time Fault-Tolerant Capability

The Journal of Supercomputing

Abstract

We present a real-time fault-tolerant design for an l-level k-ary tree multiprocessor and examine its reconfigurability. The k-ary tree is augmented with spare nodes and spare links. The wave-switching communication modules of the spare nodes allow both faulty nodes and faulty links to be tolerated. We consider two modes of operation. In the strict mode, the multiprocessor is under heavy computation or a hard deadline, so we use a fast, local reconfiguration scheme to tolerate the faulty nodes. In the relaxed mode, where light computation or a soft deadline is encountered, a global reconfiguration scheme is used to maximize the utilization of spare nodes, both in this mode and in the next strict mode. Both theoretical and simulation results are examined. Our simulation results, in the relaxed mode of operation, reveal that our approach can tolerate significantly more faulty nodes than other approaches, with low overhead and no performance degradation.
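
To make the terminology concrete, the sketch below models an l-level k-ary tree in breadth-first numbering and a toy version of strict-mode (local) reconfiguration. The abstract does not say how spares are placed or assigned, so the rule used here (one spare per sibling group, usable only by that group) is purely a hypothetical assumption for illustration and is not the authors' scheme. The function names `node_count`, `children`, `parent`, and `local_reconfigure` are likewise invented for this example.

```python
from __future__ import annotations


def node_count(k: int, levels: int) -> int:
    """Nodes in a complete k-ary tree with `levels` levels (root is level 1)."""
    if k < 1 or levels < 0:
        raise ValueError("k must be >= 1 and levels must be >= 0")
    return sum(k ** i for i in range(levels))


def parent(node: int, k: int) -> int | None:
    """Parent of `node` in breadth-first numbering (root = 0, which has no parent)."""
    return None if node == 0 else (node - 1) // k


def children(node: int, k: int, total: int) -> list[int]:
    """Children of `node` in breadth-first numbering, limited to `total` nodes."""
    first = k * node + 1
    return [c for c in range(first, first + k) if c < total]


def local_reconfigure(faulty: set[int], k: int) -> dict[int, tuple[str, int | None] | None]:
    """Toy local reconfiguration.

    ASSUMPTION (not from the paper): each sibling group, identified by its
    parent's index, owns exactly one spare that only that group may use.
    Returns a map from faulty node to its spare, or None if the group's
    spare is already taken.
    """
    used_groups: set[int | None] = set()
    assignment: dict[int, tuple[str, int | None] | None] = {}
    for node in sorted(faulty):
        group = parent(node, k)  # None for the root
        if group in used_groups:
            assignment[node] = None  # local spare exhausted
        else:
            used_groups.add(group)
            assignment[node] = ("spare", group)
    return assignment


if __name__ == "__main__":
    k, levels = 3, 3
    total = node_count(k, levels)
    print(total, children(1, k, total))
    print(local_reconfigure({4, 5, 7}, k))
```

Under this assumed rule, a second fault in the same sibling group cannot be covered locally. That is the kind of case where a global scheme, such as the relaxed mode described in the abstract, could reassign spares across groups to cover more faults.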

Cite this article

Izadi, B.A., Özgüner, F. An Augmented k-ary Tree Multiprocessor with Real-Time Fault-Tolerant Capability. The Journal of Supercomputing 27, 5–17 (2004). https://doi.org/10.1023/A:1026235604866

  • DOI: https://doi.org/10.1023/A:1026235604866
