A Membership Protocol Based on Partial Order

  • Conference paper
Dependable Computing for Critical Applications 2

Part of the book series: Dependable Computing and Fault-Tolerant Systems (DEPENDABLECOMP, volume 6)

Abstract

Membership information is used to provide a consistent, system-wide view of which processes are currently functioning or failed in a distributed computation. This paper describes a membership protocol that is used to maintain this information. Our protocol is novel because it is based on a multicast facility that preserves only the partial order of messages exchanged among the communicating processes. Because it depends only on a partial ordering of messages rather than a total ordering, our protocol requires less synchronization overhead. The advantages of our approach are especially pronounced if multiple failures occur concurrently.

This work was supported in part by the National Science Foundation under grants CCR-8811923 and CCR-9003161, and by the Office of Naval Research under grant N00014-91-J-1015.

Copyright information

© 1992 Springer-Verlag/Wien

About this paper

Cite this paper

Mishra, S., Peterson, L.L., Schlichting, R.D. (1992). A Membership Protocol Based on Partial Order. In: Meyer, J.F., Schlichting, R.D. (eds) Dependable Computing for Critical Applications 2. Dependable Computing and Fault-Tolerant Systems, vol 6. Springer, Vienna. https://doi.org/10.1007/978-3-7091-9198-9_15

  • DOI: https://doi.org/10.1007/978-3-7091-9198-9_15

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-9200-9

  • Online ISBN: 978-3-7091-9198-9

  • eBook Packages: Springer Book Archive
