Adaptive Profiling for Root-cause Analysis of Performance Anomalies in Web-based Applications

Authors

Abstract

The most important factor in assessing the availability of a system is the mean time to repair (MTTR): the lower the MTTR, the higher the availability. A significant portion of the MTTR is spent detecting and localizing the cause of a failure. One method that may give good results in the root-cause analysis of application failures is run-time profiling. Its major drawback is the performance impact.
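
The relation between MTTR and availability can be made explicit with the standard steady-state availability formula. The abstract does not state this formula; the symbol MTTF (mean time to failure) and the numbers in the example are assumptions introduced here for illustration only.

```latex
A = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
% Illustrative values (assumed, not from the paper):
% MTTF = 1000 h, MTTR = 1 h    =>  A = 1000 / 1001   \approx 0.99900
% MTTF = 1000 h, MTTR = 0.5 h  =>  A = 1000 / 1000.5 \approx 0.99950
```

Under these assumed values, halving the MTTR roughly halves the unavailability (1 - A), which is why shortening detection and localization time matters.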

In this paper we describe two algorithms for selective and adaptive profiling of web-based applications. The algorithms use a dynamic profiling interval and are triggered mainly when some transactions start to show symptoms of a performance anomaly. The algorithms were tested under different types of degradation scenarios and compared with static sampling strategies. Our experiments show that, with the data collected by the adaptive profiling algorithms, performance anomalies are pinpointed as quickly as with full profiling, while the response time overhead is reduced by almost 60%. Compared with a non-profiled version, the response time overhead is less than 1.5%. These results show the viability of using run-time profiling to support quick detection and pinpointing of performance anomalies and to enable timely recovery.
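
The idea of a dynamic profiling interval can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not specify. It assumes a known baseline response time per transaction, a relative tolerance that defines a "symptom", and a halve/double rule for the interval. The class `AdaptiveSampler` and all of its parameters are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSampler:
    """Hypothetical controller deciding how often one transaction is profiled."""

    baseline_ms: float        # expected response time (assumed to be known)
    tolerance: float = 0.2    # relative deviation treated as an anomaly symptom
    min_interval: int = 1     # 1 = profile every request
    max_interval: int = 64    # 64 = profile one request in 64
    interval: int = 64        # current profiling interval, in requests
    _since_last: int = 0      # requests seen since the last profiled one

    def observe(self, response_ms: float) -> None:
        """Adjust the profiling interval from one observed response time."""
        if response_ms > self.baseline_ms * (1 + self.tolerance):
            # Symptom of an anomaly: profile more often (assumed halving rule).
            self.interval = max(self.min_interval, self.interval // 2)
        else:
            # Behaviour looks normal: back off to reduce overhead.
            self.interval = min(self.max_interval, self.interval * 2)

    def should_profile(self) -> bool:
        """Return True when the current request should be profiled."""
        self._since_last += 1
        if self._since_last >= self.interval:
            self._since_last = 0
            return True
        return False
```

In this sketch, a caller would invoke `observe` with each measured response time and `should_profile` before handling each request. Sustained slow responses drive the interval toward `min_interval`, which approximates full profiling, while normal responses let it grow back toward `max_interval`, which keeps overhead low.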

Keywords

application profiling; monitoring; root-cause analysis; performance anomalies; dependability

Subject

Dependability Analysis

Conference

10th IEEE International Symposium on Network Computing and Applications, August 2011
