Xception: Software Fault Injection and Monitoring in Processor Functional Units
This paper presents Xception, a software fault injection and monitoring environment. Xception uses the advanced debugging and performance monitoring features existing in most modern processors to inject more realistic faults by software, and to monitor the activation of the faults and their impact on the target system behaviour in detail. Faults are injected with minimum interference with the target application. The target application is not modified, no software traps are inserted, and it is not necessary to execute it in a special trace mode (the application is executed at full speed). Xception provides a comprehensive set of fault triggers, including spatial and temporal fault triggers, and triggers related to the manipulation of data in memory. Faults injected by Xception can affect any process running on the target system, including the operating system. Sets of faults can be defined by the user according to several criteria, including the emulation of faults in specific target processor functional units. Presently, Xception has been implemented on a parallel machine built around the PowerPC 601 processor running the PARIX operating system. Experimental results are presented showing the impact of faults on several parallel applications running on a commercial parallel system. It is shown that up to 73% of the faults, depending on the processor functional unit affected, can cause the application to produce wrong results. The results show that the impact of faults heavily depends on the application and the specific processor functional unit affected by the fault.
Fifth IFIP Working Conference on Dependable Computing for Critical Applications (DCCA-5), September 1995
