Abstract
This investigation deals with the reliability analysis of an embedded cluster system, incorporating the concept of software ageing, which is a significant cause of fault occurrence in the system. To develop the Markov model, we consider four levels of software rejuvenation policies. The system state probabilities are evaluated and then used to derive various indices such as availability, mean time to failure and downtime cost. A numerical illustration is provided to validate the computational tractability of the formulae established. The sensitivity of the reliability indices with respect to different parameters is also examined.
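To make the step from state probabilities to indices concrete, here is a minimal Python sketch. It is not the four-level model of the paper: the four states (healthy, aged, rejuvenating, failed), the transition structure, all rate and cost values, and the function names `build_generator`, `steady_state`, `availability` and `downtime_cost` are assumptions chosen only for illustration.

```python
def build_generator(age, fail, trig, rejuv, repair):
    """Generator matrix of a hypothetical 4-state ageing/rejuvenation chain.

    States (assumed for illustration only):
      0 healthy (up), 1 aged (up), 2 rejuvenating (down), 3 failed (down).
    """
    return [
        [-age, age, 0.0, 0.0],                # healthy -> aged
        [0.0, -(fail + trig), trig, fail],    # aged -> rejuvenating or failed
        [rejuv, 0.0, -rejuv, 0.0],            # rejuvenation completes
        [repair, 0.0, 0.0, -repair],          # repair completes
    ]


def steady_state(Q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(Q)
    # Transpose Q so the unknowns form a column vector, then replace the
    # last balance equation with the normalisation condition.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column up.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def availability(pi):
    """Long-run fraction of time spent in the up states 0 and 1."""
    return pi[0] + pi[1]


def downtime_cost(pi, cost_rejuv, cost_fail):
    """Expected cost per unit time from the two down states."""
    return cost_rejuv * pi[2] + cost_fail * pi[3]


if __name__ == "__main__":
    # All numbers below are made-up rates per hour, not values from the paper.
    Q = build_generator(age=0.01, fail=0.002, trig=0.05, rejuv=2.0, repair=0.1)
    pi = steady_state(Q)
    print("state probabilities:", pi)
    print("availability:", availability(pi))
    print("downtime cost rate:", downtime_cost(pi, cost_rejuv=10.0, cost_fail=100.0))
```

Re-running the sketch while varying one rate at a time, for example `trig`, gives a crude version of the sensitivity analysis mentioned in the abstract.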
Acknowledgments
MJ acknowledges the financial support from MHRD, Government of India.
Cite this article
Jain, M., Manjula, T. & Gulati, T.R. Software Rejuvenation Policies for Cluster System. Proc. Natl. Acad. Sci., India, Sect. A Phys. Sci. 86, 339–346 (2016). https://doi.org/10.1007/s40010-016-0273-1
DOI: https://doi.org/10.1007/s40010-016-0273-1