Cross-dataset time series anomaly detection for cloud systems
- Xu Zhang
- Qingwei Lin 林庆维
- Yong Xu
- Si Qin
- Hongyu Zhang
- Bo Qiao
- Yingnong Dang
- Xinsheng Yang
- Qian Cheng
- Murali Chintalapati
- Youjiang Wu
- Ken Hsieh
- Kaixin Sui
- Xin Meng
- Yaohai Xu
- Wenchi Zhang
- Furao Shen
- Dongmei Zhang
USENIX ATC'19
In recent years, software applications have increasingly been deployed as online services on cloud computing platforms. Detecting anomalies in cloud systems is important for maintaining high service availability. However, given the velocity, volume, and diversity of cloud monitoring data, it is difficult to obtain enough labelled data to build an accurate anomaly detection model. In this paper, we propose cross-dataset anomaly detection: detecting anomalies in a new unlabelled dataset (the target) by training an anomaly detection model on existing labelled datasets (the source). Our approach, called ATAD (Active Transfer Anomaly Detection), integrates transfer learning and active learning. Transfer learning transfers knowledge from the source dataset to the target dataset, and active learning selects a small number of informative samples from the unlabelled dataset for labelling. Through experiments, we show that ATAD is effective in cross-dataset time series anomaly detection. Furthermore, labelling only about 1%-5% of the unlabelled data is enough to achieve a significant performance improvement.
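To make the combination of the two ideas concrete, here is a minimal toy sketch, not the paper's implementation. The abstract does not describe ATAD's features, classifier, or sampling strategy, so everything here is an assumption chosen for illustration: each point is reduced to a single hypothetical anomaly score, the "model" is a one-dimensional threshold, transfer is approximated by starting from a threshold fitted on the labelled source data, and active learning is approximated by uncertainty sampling (querying the target point closest to the current threshold). The names `fit_threshold`, `active_transfer`, `oracle`, and `tgt_weight` are all hypothetical.

```python
def fit_threshold(scores, labels, weights):
    """Pick the cut on a 1-D score that minimises the weighted error.
    A point is predicted anomalous when score > threshold."""
    values = sorted(set(scores))
    # Candidate cuts are midpoints between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values

    def cost(t):
        return sum(w for s, y, w in zip(scores, labels, weights)
                   if (s > t) != bool(y))

    return min(candidates, key=cost)


def count_errors(scores, labels, threshold):
    """Number of points the threshold classifies incorrectly."""
    return sum((s > threshold) != bool(y) for s, y in zip(scores, labels))


def active_transfer(src_scores, src_labels, tgt_scores, oracle, budget,
                    tgt_weight=2.0):
    """Start from a source-trained threshold, then spend `budget` label
    queries on the most uncertain target points, refitting after each."""
    scores, labels = list(src_scores), list(src_labels)
    weights = [1.0] * len(scores)
    threshold = fit_threshold(scores, labels, weights)
    queried = []
    for _ in range(budget):
        unlabelled = [i for i in range(len(tgt_scores)) if i not in queried]
        if not unlabelled:
            break
        # Uncertainty sampling: the point nearest the decision boundary.
        pick = min(unlabelled, key=lambda i: abs(tgt_scores[i] - threshold))
        queried.append(pick)
        scores.append(tgt_scores[pick])
        labels.append(oracle(pick))   # ask a human (here: a lookup)
        weights.append(tgt_weight)    # assumption: target labels count more
        threshold = fit_threshold(scores, labels, weights)
    return threshold, queried


# Toy data (made up): the target's normal range extends higher than the source's.
source_scores = [0.2, 0.5, 0.8, 1.0, 3.0, 3.5]
source_labels = [0, 0, 0, 0, 1, 1]
target_scores = [0.3, 0.9, 1.4, 1.9, 2.3, 2.4, 1.7, 2.8, 3.1, 0.6]
target_truth = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

source_only = fit_threshold(source_scores, source_labels, [1.0] * 6)
adapted, queried = active_transfer(
    source_scores, source_labels, target_scores,
    oracle=lambda i: target_truth[i], budget=3)

print("errors, source-only threshold:", count_errors(target_scores, target_truth, source_only))
print("errors, after 3 queried labels:", count_errors(target_scores, target_truth, adapted))
```

On this toy data the source-only threshold misclassifies some target points, and a few targeted label queries move the boundary to fit the target. The example labels 3 of 10 points only because the dataset is tiny; it says nothing about the 1%-5% labelling budget reported in the abstract.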