
Scheduling Fault Recovery Operations for Time-Critical Applications

  • Conference paper
Dependable Computing for Critical Applications 4

Part of the book series: Dependable Computing and Fault-Tolerant Systems (DEPENDABLECOMP, volume 9)

Abstract

This paper introduces algorithms for scheduling fault recovery operations on systems which must preserve the timing correctness of critical application tasks in the presence of faults. The algorithms are based on methods to reserve time for the processing of recovery tasks at the design stage. This allows recovery tasks to be scheduled with very low run-time overhead, complementing or reducing the need for hardware replication to support dependable operation. Although previous work has advocated the use of reservation methods, there exists no formal methodology for allocating such time. A methodology is developed which facilitates the difficult task of verifying the timing correctness of a desired reservation strategy. In addition, simulation results are presented which give insight into the effectiveness of different reservation strategies in averting timing failures under a variety of transient recovery loads.
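As a rough illustration of what reserving recovery time at the design stage can look like, the sketch below models the reservation as one extra periodic task and applies the classical Liu and Layland utilization bound for rate-monotonic scheduling. This is not the paper's algorithm, which the abstract does not detail; the task set, the reservation sizes, and all names are hypothetical, and deadlines are assumed equal to periods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodicTask:
    name: str
    wcet: float    # worst-case execution time per period
    period: float  # period; the deadline is assumed equal to the period


def liu_layland_bound(n: int) -> float:
    """Sufficient utilization bound for n tasks under rate-monotonic scheduling."""
    return n * (2 ** (1.0 / n) - 1)


def utilization(tasks) -> float:
    """Total processor utilization of a task set."""
    return sum(t.wcet / t.period for t in tasks)


def passes_utilization_test(tasks) -> bool:
    """True if the set meets the sufficient bound.

    False does not prove the set is unschedulable; it only means this
    simple test cannot guarantee the deadlines.
    """
    return utilization(tasks) <= liu_layland_bound(len(tasks))


# Hypothetical application task set (assumption, not from the paper).
app = [
    PeriodicTask("control", 2, 10),
    PeriodicTask("sensor", 4, 20),
    PeriodicTask("logging", 5, 50),
]

# Hypothetical reservations: time set aside at design time for recovery work.
small_reserve = PeriodicTask("recovery_reserve", 3, 15)
large_reserve = PeriodicTask("recovery_reserve", 3, 10)

print(passes_utilization_test(app))                    # application alone
print(passes_utilization_test(app + [small_reserve]))  # with a modest reservation
print(passes_utilization_test(app + [large_reserve]))  # with a larger reservation
```

Treating the reservation as an ordinary periodic task keeps the check simple, at the cost of pessimism: the bound is only sufficient, so a reservation that fails it may still be feasible under an exact analysis.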




Copyright information

© 1995 Springer-Verlag/Wien

About this paper

Cite this paper

Ramos-Thuel, S., Strosnider, J.K. (1995). Scheduling Fault Recovery Operations for Time-Critical Applications. In: Cristian, F., Le Lann, G., Lunt, T. (eds) Dependable Computing for Critical Applications 4. Dependable Computing and Fault-Tolerant Systems, vol 9. Springer, Vienna. https://doi.org/10.1007/978-3-7091-9396-9_35


  • DOI: https://doi.org/10.1007/978-3-7091-9396-9_35

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-9398-3

  • Online ISBN: 978-3-7091-9396-9

  • eBook Packages: Springer Book Archive
