Abstract
Many systems evolve incrementally, so the need often arises to retrofit reliability into existing and frequently very complex systems. Here we discuss some of the major options for doing so without recoding the existing application from scratch.
Notes
1. This uses the term somewhat loosely: a VPN, on platforms like Windows and Linux, is a fairly specific technology package focused on providing secure remote access to a corporate network by tunneling through the firewall using a shared-key cryptographic scheme. In contrast, here we employ the same term to connote the more general idea of overlaying a network with “other properties” on a base network with “base properties.” Others might call this an overlay network, but overlay networks, like VPNs, have also come to have a fairly specific meaning, associated with end-to-end implementations of routing. Rather than invent some completely new term, the book uses VPN in a generalized way.
Copyright information
© 2012 Springer-Verlag London Limited
About this chapter
Cite this chapter
Birman, K.P. (2012). Retrofitting Reliability into Complex Systems. In: Guide to Reliable Distributed Systems. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-2416-0_16
DOI: https://doi.org/10.1007/978-1-4471-2416-0_16
Publisher Name: Springer, London
Print ISBN: 978-1-4471-2415-3
Online ISBN: 978-1-4471-2416-0
eBook Packages: Computer Science (R0)