CISUC

Recovery and Performance Balance of a COTS DBMS in the Presence of Operator Faults

Authors

Abstract

A major cause of failures in large database management systems (DBMS) is operator faults. Although most of the complex DBMS have comprehensive recovery mechanisms, the effectiveness of these mechanisms is difficult to characterize. On the other hand, the tuning of a large database is very complex and database administrators tend to concentrate on performance tuning and disregard the recovery mechanisms. Above all, database administrators seldom have feedback on how good a given configuration is concerning recovery. This paper proposes an experimental approach to characterize both the performance and the recoverability in DBMS. Our approach is presented through a concrete example of benchmarking the performance and recovery of an Oracle DBMS running the standard TPC-C benchmark, extended to include two new elements: a faultload based on operator faults and measures related to recoverability. A classification of operator faults in DBMS is proposed. The paper ends with the discussion of the results and the proposal of guidelines to help database administrators in finding the balance between performance and recovery tuning.

Subject

Experimental COTS Evaluation

Related Project

DBench - Dependability Benchmarking

Conference

The International Symposium on Dependable Systems and Networks, DSN-IPDS 2002, June 2002

Cited by

Year 2012 : 1 citation

 1. Bo Yang, Ji Wu, Chao Liu, “Mining Data Chain Graph for Fault Localization”, IEEE 36th Annual Computer Software and Applications Conference Workshops, COMPSACW 2012, Izmir, Turkey, July 16-20, 2012.

Year 2011 : 2 citations

 1. Gong Zhang, Ling Liu, “Why do migrations fail and what can we do about it?”, 25th International Conference on Large Installation System Administration, LISA'11, Boston, MA, USA, December 4-9, 2011.

 2. Gong Zhang, “Data and Application Migration in Cloud Based Data Centers: Architectures and Techniques”, PhD Thesis, Georgia Institute of Technology, USA, August 2011.

Year 2010 : 1 citation

 1. Nikola Milanovic, Bratislav Milic, “Automatic Generation of Service Availability Models”, IEEE Transactions on Services Computing, no. 1, ISSN: 1939-1374, March 2010.

Year 2008 : 3 citations

 1. Lakshmi Narayanan Bairavasundaram, “Characteristics, Impact, and Tolerance of Partial Disk Failures”, PhD Thesis, University of Wisconsin–Madison, USA, 2008.

 2. Lorenzo Keller, Prasang Upadhyaya, George Candea, “ConfErr: A Tool for Assessing Resilience to Human Configuration Errors”, IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2008, Anchorage, Alaska, USA, June 24-27, 2008.

 3. Bogdan Tomoyuki Nassu, Kiyonobu Uehara, Takashi Nanya, “Injecting Inconsistent Values Caused by Interaction Faults for Experimental Dependability Evaluation”, Seventh European Dependable Computing Conference, EDCC 2008, Kaunas, Lithuania, May 7-9, 2008.

Year 2007 : 2 citations

 1. Andréas Johansson, “Robustness Evaluation of Operating Systems”, PhD Thesis, Fachbereich Informatik, Technische Universität Darmstadt, Germany, 2007.

 2. Eliane Martins, Regina Moraes, “Research in Software Testing at State University of Campinas”, Technical Report - REVVIS Project, 2007.

Year 2005 : 2 citations

 1. Ali Kalakech, “Étalonnage de la sûreté de fonctionnement des systèmes d’exploitation – Spécifications et mise en œuvre”, PhD Thesis, LAAS-CNRS, Toulouse, France, August 2005.

 2. Regina Moraes, Eliane Martins, Naaliel Mendes, “Fault injection approach based on dependence analysis”, 29th Annual International Conference on Computer Software and Applications, COMPSAC 2005, Edinburgh, Scotland, UK, July 26-28, 2005.

Year 2004 : 1 citation

 1. Aaron B. Brown, Leonard Chung, William Kakes, Calvin Ling, David A. Patterson, “Experience with Evaluating Human-Assisted Recovery Processes”, IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2004, Florence, Italy, June 28-July 1, 2004.

Year 2003 : 2 citations

 1. Aaron Brown, “A Recovery-Oriented Approach to Dependable Services: Repairing Past Errors with System-Wide Undo”, PhD Thesis, EECS Computer Science Division - University of California, Berkeley, California, USA, December 2003.

 2. Regina Moraes, Eliane Martins, “A strategy for validating an ODBMS Component Using a High-Level Software Fault Injection Tool”, First Latin-American Symposium on Dependable Computing, LADC 2003, São Paulo, Brazil, October 21-24, 2003.

Year 2002 : 1 citation

 1. David Oppenheimer, Aaron B. Brown, Jonathan Traupman, Pete Broadwell, David A. Patterson, “Practical Issues in Dependability Benchmarking”, Second Workshop on Evaluating and Architecting System Dependability (EASY), San Jose, California, USA, October 6, 2002.