Enhancing Fault Tolerance of Real-Time Systems through Time Redundancy

Thuel, Sandra R.; Strosnider, Jay K.

doi:10.1007/978-0-585-28002-8_9

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 285)

Abstract

Fault-tolerant, real-time systems require correct, time-constrained results in the presence of faults. Missed deadlines in many high dependability systems can result in significant property damage or loss of human life. Historically, designers relied almost exclusively upon massive hardware replication to achieve their dependability goals. Research suggests that not only is this approach inadequate for dealing with certain fault classes, but also that it is inappropriate for many applications with strict space, weight, and cost constraints. Alternatively, time redundancy can be used to complement replication as a means to improve fault coverage and reduce the required level of replication for fault-tolerant system design. Although previous work has advocated the use of time redundancy to provide protection against hardware and software faults, there exists no formal methodology for allocating and managing such time. This chapter provides an overview of recent work in developing a comprehensive analytical framework for allocating and managing time redundancy to preserve the timing correctness of priority-driven, real-time systems in the presence of faults.

Research supported in part by Office of Naval Research under Contract N00014-92-J-1524 and by AT&T Bell Laboratories under the Cooperative Research Fellowship Program.

This work was done while the author was at Carnegie Mellon University, Pittsburgh, PA.

Author information

Authors and Affiliations

AT&T Bell Laboratories, Holmdel, NJ, 07733
Sandra R. Thuel
Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, 15213
Jay K. Strosnider

Editor information

Editors and Affiliations

Office of Naval Research, USA
Gary M. Koob & Clifford G. Lau

About this chapter

Cite this chapter

Thuel, S.R., Strosnider, J.K. (1994). Enhancing Fault Tolerance of Real-Time Systems through Time Redundancy. In: Koob, G.M., Lau, C.G. (eds) Foundations of Dependable Computing. The Kluwer International Series in Engineering and Computer Science, vol 285. Springer, Boston, MA. https://doi.org/10.1007/978-0-585-28002-8_9

DOI: https://doi.org/10.1007/978-0-585-28002-8_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-9486-0
Online ISBN: 978-0-585-28002-8
eBook Packages: Springer Book Archive
