This paper evaluates the impact of transient errors in the operating system of a COTS-based system (a CETIA board with two PowerPC 750 processors running LynxOS) and quantifies their effects at both the OS and application levels. The study was conducted with a Software-Implemented Fault Injection tool (Xception), using both realistic programs and synthetic workloads, the latter to focus on specific OS features. The results provide a comprehensive picture of the impact of faults on key LynxOS features (process scheduling and the most frequent system calls), data integrity, error propagation, application termination, and correctness of application results.
Keywords
COTS, Experimental Evaluation, Space Applications
Subject
Experimental COTS Evaluation
Conference
The International Conference on Dependable Systems and Networks, DSN-2002, June 2002
Cited by
Year 2009 : 1 citation
Navid Aghdaie and Yuval Tamir, "CoRAL: A transparent fault-tolerant web service", Journal of Systems and Software, Volume 82, Issue 1, January 2009, Pages 131-143
Year 2008 : 2 citations
Andreas Johansson, "Robustness Evaluation of Operating Systems", PhD thesis, Darmstadt University, 2008
Mahdi Fazeli, Reza Farivar, and Seyed Ghassem Miremadi, "Error Detection Enhancement in PowerPC Architecture-based Embedded Processors", Journal of Electronic Testing, Volume 24, Numbers 1-3, June 2008
Year 2006 : 4 citations
J. Sosnowski, P. Gawkowski, P. Zygulski, A. Tymoczko, "Enhancing Fault Injection Testbench," depcos-relcomex, pp. 76-83, International Conference on Dependability of Computer Systems (DEPCOS-RELCOMEX'06), 2006.
Ana Maria Ambrosio, Eliane Martins, Nandamudi L. Vijaykumar, Solon V. de Carvalho, "A Conformance Testing Process for Space Applications Software Services", Journal of Aerospace Computing, Information and Communication, 3(4), pp. 146-158, 2006.
André Fidalgo, Manuel Gericota, Gustavo Alves, José Ferreira, "Test and verification: Using NEXUS compliant debuggers for real time fault injection on microprocessors", Proceedings of the 19th annual symposium on Integrated circuits and systems design, SBCCI '06, August 2006
Yen-Jen Chang, "An Energy-Efficient BTB Lookup Scheme for Embedded Processors," Circuits and Systems II: Express Briefs, IEEE Transactions on , vol.53, no.9, pp.817-821, Sept. 2006
Year 2005 : 5 citations
Amir Rajabzadeh, Seyed Ghassem Miremadi, "A Hardware Approach to Concurrent Error Detection Capability Enhancement in COTS Processors", in Proc. of 11th IEEE International Symposium Pacific Rim Dependable Computing, PRDC2005, Changsha, Hunan, China, December 2005.
N. Aghdaie and Y. Tamir, "Efficient Client-Transparent Fault Tolerance for Video Conferencing," Proceedings of the 3rd IASTED International Conference on Communications and Computer Networks, Marina del Rey, CA, October 2005.
Navid Aghdaie, "Transparent Fault-Tolerant Network Services Using Off-the-Shelf Components", PhD thesis, University of California, Los Angeles, 2005.
Mahdi Fazeli, Reza Farivar, Ghassem Miremadi, "A Software-Based Concurrent Error Detection Technique for PowerPC Processor-based Embedded Systems", 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05), Monterey, CA, USA, October 2005.
P. Gawkowski, J. Sosnowski, "Analysing system susceptibility to faults with simulation tools", XXI Autumn Meeting of Polish Information Processing Society, ISBN 83-922646-0-6, pp. 87-94, PIPS, 2005.
Year 2004 : 6 citations
P Bernadat, DD Mannaru, "Susceptibility of Commodity Systems and Software to Memory Soft Errors", IEEE Transactions on Computers, December 2004 (Vol. 53, No. 12), 2004.
K. Whisnant et al., "The Effects of an Armor-Based SIFT Environment on the Performance and Dependability of User Applications," IEEE Trans. Software Eng., vol. 30, no. 4, 2004.
Alan Messer et al., "Susceptibility of commodity systems and software to memory soft errors," IEEE Transactions on Computers, December 2004 (Vol. 53, No. 12), 2004.
Amir Rajabzadeh, Seyed Ghassem Miremadi, and Mirzad Mohandespour, "Error Detection Enhancement in COTS Superscalar Processors with Performance Monitoring Features", Journal of Electronic Testing, Springer Science+Business Media B.V., Formerly Kluwer Academic Publishers B.V., ISSN: 0923-8174, Volume 20, Number 5, October 2004.
Arnaud Albinet, Jean Arlat, Jean-Charles Fabre, "Characterization of the Impact of Faulty Drivers on the Robustness of the Linux Kernel", IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2004, Florence, Italy, June 28 - July 1, 2004.
Joakim Aidemark, Peter Folkesson and Johan Karlsson, "Experimental Dependability Evaluation of the Artk68-FT Real-time Kernel", International Conference on Real-Time and Embedded Computing Systems and Applications, Göteborg, Sweden, 2004.
Year 2003 : 3 citations
P. Gawkowski, J. Sosnowski, "Assessing Software Implemented Fault Detection and Fault Tolerance Mechanisms", Proceedings of the IEEE 12th Asian Test Symposium (ATS'03), November 16-19, 2003.
Weining Gu, Z. Kalbarczyk, R. Iyer, Z. Yang, "Characterization of linux kernel behavior under errors", IEEE/IFIP International Conference on Dependable Systems and Networks, International Performance and Dependability Symposium, DSN-IPDS 2003, San Francisco, CA, USA, pp. 459-468, June 22-25, 2003.
Katz D. S., Some R. R., "NASA advances robotic space exploration", IEEE Computer, 36 (1): 52-+, Jan. 2003.
Year 2002 : 2 citations
D. Wilson, B. Murphy, and L. Spainhower, "Progress on defining standardized classes for comparing the dependability of computer systems", supplemental volume of the IEEE/IFIP International Conference on Dependable Systems and Networks, DSN-2002, Bethesda, Maryland, USA, pp. F1-F5, June 23-26, 2002.
Iyer R. K., Kalbarczyk Z., "Measurement-based analysis of system dependability using fault injection and field failure data", Performance Evaluation of Complex Systems: Techniques and Tools , Lecture Notes in Computer Science, 2459: pp. 290-317, 2002.