Hybrid scheduling to enhance reliability of real-time tasks running on reconfigurable devices

The Journal of Supercomputing

Abstract

Reconfigurable devices (RDs) offer significant advantages in real-time embedded systems, but they are susceptible to soft errors. This paper addresses the reliability of independent periodic real-time hardware tasks on RDs through hybrid fault-tolerant scheduling, which combines static and dynamic real-time scheduling techniques. In the static phase, the proposed algorithm schedules the primary tasks and reserves area and time on the RD for possible backup tasks. Passive backup tasks may overlap in these reserved areas (backup overloading); reliability, task deadlines, and RD area limitations determine how backups are overloaded. In the dynamic phase, at run time, an event-triggered dispatcher determines which candidate backup copy should be configured on an overloaded reserved area. The deciding factor is the execution result of the primary tasks (success or failure), based on which the dispatcher configures the appropriate backup task on the reserved area. Experimental results show that the hybrid scheduling technique improves the mean time to failure (MTTF) of the system by an average factor of 1.22 compared with a similar state-of-the-art study.
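The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `Task`, `ReservedSlot`, `reserve_overloaded_slot`, and `dispatch` are hypothetical, deadlines are treated as absolute times, the primaries' finish times are assumed to come from some static schedule, and the reliability criterion that the paper uses when deciding on overloading is omitted. The sketch only shows the idea of one reserved area shared by several passive backups, with the choice of backup deferred until the primaries' results are known.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    area: int        # RD area units the task occupies (hypothetical unit)
    exec_time: int   # execution time, assumed to include reconfiguration
    deadline: int    # absolute deadline (simplifying assumption)


@dataclass(frozen=True)
class ReservedSlot:
    area: int
    start: int
    end: int
    candidates: Tuple[str, ...]  # primaries whose passive backups share the slot


def reserve_overloaded_slot(
    primaries: List[Task],
    finish_times: Dict[str, int],
    area_budget: int,
) -> Optional[ReservedSlot]:
    """Static phase (sketch): reserve one area/time slot shared by all backups.

    The slot starts once every primary has finished, is as large as the
    largest candidate, and is accepted only if each backup would still meet
    its deadline and the area fits the remaining budget.
    Assumes `primaries` is non-empty.
    """
    start = max(finish_times[t.name] for t in primaries)
    area = max(t.area for t in primaries)
    if area > area_budget:
        return None
    if any(start + t.exec_time > t.deadline for t in primaries):
        return None
    end = start + max(t.exec_time for t in primaries)
    return ReservedSlot(area, start, end, tuple(t.name for t in primaries))


def dispatch(
    slot: ReservedSlot, succeeded: Dict[str, bool]
) -> Tuple[Optional[str], List[str]]:
    """Dynamic phase (sketch): pick the backup to configure on the slot.

    Returns the name of the failed primary whose backup gets the slot
    (or None if every primary succeeded) and the list of failed primaries
    that the single shared slot cannot cover.
    """
    failed = [name for name in slot.candidates if not succeeded[name]]
    if not failed:
        return None, []
    return failed[0], failed[1:]
```

For example, with two primaries `T1` and `T2` sharing one slot, `dispatch(slot, {"T1": True, "T2": False})` selects the backup of `T2`. The second return value makes the cost of overloading explicit: if more than one candidate fails, only one backup can use the shared area.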

Notes

  1. Throughout the remainder of the present paper, both transient and intermittent faults are referred to as soft errors (SEs).

Acknowledgements

The authors sincerely thank Dr. Reza Ramezani, assistant professor at the University of Isfahan, who generously provided many constructive comments that greatly assisted the research.

Author information

Corresponding author

Correspondence to Yasser Sedaghat.

About this article

Cite this article

Ghavidel, A., Sedaghat, Y. & Naghibzadeh, M. Hybrid scheduling to enhance reliability of real-time tasks running on reconfigurable devices. J Supercomput 76, 4701–4730 (2020). https://doi.org/10.1007/s11227-019-02976-6

  • DOI: https://doi.org/10.1007/s11227-019-02976-6
