CISUC

Reset-Driven Fault Tolerance

Authors

Abstract

A common approach to fault tolerance in embedded systems is to reboot the computer whenever a non-permanent error is detected. All system code and data are recreated from scratch, and a previously established checkpoint, hopefully not corrupted, is used to restore the application data. Confidence in the computer's operation is thus restored.
The idea explored in this paper is to reset the computer unconditionally in each control frame (the classic read sensors -> calculate control action -> update actuators cycle). A RAM-based stable storage preserves the system's state between consecutive cleanups, and a standard watchdog timer guarantees that a reset is forced whenever an error crashes the system.
We have evaluated this approach by injecting faults into the controller of a standard temperature control system. The experimental observations show that Reset-Driven Fault Tolerance is a simple yet effective technique for improving reliability at an extremely low cost: it is a conceptually simple, software-only solution that is also application independent.
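The frame-by-frame cycle described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's implementation: the two-bank CRC-protected layout of `StableStorage`, the PI control law and its gains, the constant sensor reading, and the use of an exception to stand in for a watchdog-forced reset are all hypothetical choices made for illustration.

```python
import struct
import zlib


class StableStorage:
    """Hypothetical stable storage: two checksummed records in reset-surviving RAM."""

    def __init__(self):
        self.banks = [b"", b""]

    def commit(self, seq, integral):
        payload = struct.pack("<Id", seq, integral)
        record = payload + struct.pack("<I", zlib.crc32(payload))
        # Alternate banks so the previous checkpoint survives a failed write.
        self.banks[seq % 2] = record

    def restore(self):
        best = (0, 0.0)  # default state when no valid checkpoint exists
        for record in self.banks:
            if len(record) != 16:
                continue
            payload = record[:-4]
            (crc,) = struct.unpack("<I", record[-4:])
            if zlib.crc32(payload) != crc:
                continue  # corrupted checkpoint, ignore it
            seq, integral = struct.unpack("<Id", payload)
            if seq > best[0]:
                best = (seq, integral)
        return best


def control_frame(storage, read_sensor, setpoint=50.0, kp=2.0, ki=0.5):
    """One frame run from a clean start: restore, read, compute, commit."""
    seq, integral = storage.restore()   # the only state carried across the reset
    error = setpoint - read_sensor()    # read sensors
    integral += error
    output = kp * error + ki * integral  # calculate control action (assumed PI law)
    storage.commit(seq + 1, integral)   # checkpoint before the next reset
    return output                       # value that would drive the actuator


def run(frames, crash_at=None):
    """Simulate consecutive frames; an exception stands in for a watchdog reset."""
    storage = StableStorage()
    outputs = []
    for n in range(frames):
        try:
            if n == crash_at:
                raise RuntimeError("simulated crash before commit")
            outputs.append(control_frame(storage, lambda: 40.0))
        except RuntimeError:
            outputs.append(None)  # frame lost; the next one resumes from the checkpoint
    return storage, outputs
```

In this sketch every frame rebuilds its working state from the last valid checkpoint, so a crashed frame or a corrupted record costs at most one step of controller history rather than the whole run.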

Keywords

reset-driven fault tolerance, fault removal, dependability, embedded real-time systems

Subject

Fault-Tolerance in Control Systems

Conference

4th European Dependable Computing Conference (EDCC-4), October 2002


Cited by

Year 2013 : 1 citation

 1. Trawczynski, D. and Zalewski, J., "Application of Accelerated Processing Units in Safety-Critical Systems," SAE Int. J. Passeng. Cars – Electron. Electr. Syst. 6(1):93-101, 2013, doi:10.4271/2013-01-0179

Year 2010 : 1 citation

 1. Daniel Skarin, Raul Barbosa, Johan Karlsson, "Comparing and Validating Measurements of Dependability Attributes," 2010 European Dependable Computing Conference (EDCC), pp. 3-12, 28-30 April 2010

Year 2009 : 1 citation

 1. Daniel Skarin, Johan Karlsson, "Evaluation of low-cost detection and recovery of soft errors in an ABS controller," Proceedings of the 2009 IEEE Workshop on Silicon Errors in Logic - System Effects (SELSE 5), 2009

Year 2004 : 1 citation

 1. Yuste Pérez Pedro, "Contribución a la Validación de la Confiabilidad en los Sistemas Empotrados Tolerantes a Fallos," PhD Thesis, Universidad Politécnica de Valencia, 2004