Abstract
A distributed system that employs checkpointing and rollback-recovery as a fault-tolerance mechanism suffers from the overhead that the technique introduces. The authors of [4] propose a technique to automatically identify a suitable checkpoint and recovery protocol based on a pre-estimated database of overhead measures. The technique depends on computing the similarity between a pair of communication patterns. The computation first partitions both communication patterns into small pieces, or splices. A pair of splices, one taken from each of the two communication patterns in question, is then compared to compute a similarity measure. Splicing a communication pattern is an important step in the method because it strongly affects the later steps of the computation. This paper introduces a new method for splicing. Experimental results show that the technique yields better similarity measure values than those reported in [4].
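The two-stage computation described above (splice both patterns, then compare splices pairwise) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract specifies neither the splicing rule nor the similarity measure, so the sketch encodes a pattern as a list of hypothetical (sender, receiver) message pairs, uses fixed-size windows as a stand-in for splicing, and uses Jaccard similarity as a stand-in for the comparison. It is neither the causal-cycle-based splicing this paper introduces nor the method of [4].

```python
from typing import List, Tuple

# Hypothetical encoding: one message is a (sender, receiver) pair.
Message = Tuple[int, int]


def splice(pattern: List[Message], size: int) -> List[List[Message]]:
    """Cut a pattern into consecutive pieces of at most `size` messages.

    Fixed-size windows are an assumption for illustration only.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [pattern[i:i + size] for i in range(0, len(pattern), size)]


def splice_similarity(a: List[Message], b: List[Message]) -> float:
    """Jaccard similarity of the sets of messages in two splices (a stand-in measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def pattern_similarity(p: List[Message], q: List[Message], size: int) -> float:
    """Average, over the splices of p, of the best match among the splices of q."""
    ps, qs = splice(p, size), splice(q, size)
    if not ps or not qs:
        return 0.0
    best = [max(splice_similarity(s, t) for t in qs) for s in ps]
    return sum(best) / len(best)


# Two small made-up patterns over three processes.
p = [(0, 1), (1, 2), (2, 0), (0, 1)]
q = [(0, 1), (1, 2), (2, 0), (1, 0)]
score = pattern_similarity(p, q, size=2)
```

Even in this toy setting the score depends on where the cuts fall, which is consistent with the abstract's point that the splicing step strongly affects the later comparison.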
References
Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Communications of the ACM 21(7), 558–565 (1978)
Elnozahy, E.N., Alvisi, L., Wang, Y., Johnson, D.B.: A survey of rollback-recovery protocols in message-passing systems. ACM Computing Surveys 34(3), 375–408 (2002)
Netzer, R.H.B., Xu, J.: Necessary and sufficient conditions for consistent global snapshots. IEEE Transactions on Parallel and Distributed Systems 6(2), 165–169 (1995)
Paul, H.S., Gupta, A., Sharma, A.: Finding a suitable checkpoint and recovery protocol for a distributed application. J. Parallel and Distributed Computing 66(5), 732–749 (2006)
Paul, H.S., Gupta, A., Badrinath, R.: Performance comparison of checkpoint and recovery protocols. Concurrency and Computation: Practice and Experience 15(15), 1363–1386 (2003)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Paul, H.S. (2010). Causal Cycle Based Communication Pattern Matching. In: Kant, K., Pemmaraju, S.V., Sivalingam, K.M., Wu, J. (eds) Distributed Computing and Networking. ICDCN 2010. Lecture Notes in Computer Science, vol 5935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11322-2_28
DOI: https://doi.org/10.1007/978-3-642-11322-2_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11321-5
Online ISBN: 978-3-642-11322-2
eBook Packages: Computer Science, Computer Science (R0)