Anomaly Detection in Complex Real-World Application Systems
The ability to understand application performance and proactively manage system state is becoming increasingly important as infrastructure services move toward commoditisation models such as cloud computing. The complexity of the systems monitored in large corporations means that detailed component-by-component analysis or simulation of behaviour can be impractical. The key problem is to find practical systems management approaches that enable behavioural profiling and the detection of anomalous events in complex real-world systems. This paper details a system performance measurement method that enables slowdown event detection and the characterisation of application behaviour. These measurement techniques can underpin IT service management frameworks such as the IT Infrastructure Library, enabling businesses to manage the stability and performance of the end-user experience and thereby support the productivity and effectiveness of business processes. This paper examines anomaly detection efficiency using six detection models in two case studies at a large Australian financial services organisation. The whole-of-service anomaly detection methodology proposed in this paper was found to be more efficient than the alternative individual-transaction and whole-of-service models.