Time series outlier detection has been studied extensively, and many advanced algorithms have been proposed in the past decade. Despite these efforts, very few studies have investigated how existing algorithms should be benchmarked. In particular, evaluating on synthetic datasets has become common practice in the literature, so it is crucial to have a general synthetic criterion for benchmarking algorithms. This is a non-trivial task because existing synthesis methods differ widely across applications and outlier definitions are often ambiguous. To bridge this gap, we propose a behavior-driven taxonomy for time series outliers that categorizes them into point-wise and pattern-wise outliers with clear context definitions. Following the new taxonomy, we present a general synthetic criterion and generate 35 synthetic datasets accordingly. We further identify 4 multivariate real-world datasets from different domains and benchmark 9 algorithms on both the synthetic and the real-world datasets. Surprisingly, we observe that some classical algorithms can outperform many recent deep learning approaches. The datasets, pre-processing and synthesis scripts, and algorithm implementations are publicly available at https://github.com/datamllab/tods/tree/benchmark
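To make the point-wise versus pattern-wise distinction concrete, the following is a minimal sketch of how labeled outliers might be injected into a synthetic series. It is not the authors' generator: the function names, the sinusoidal base signal, the spike magnitude, and the square-wave replacement are all assumptions chosen for illustration. It reads "point-wise" as individual values that deviate from the normal range and "pattern-wise" as a subsequence whose shape deviates even though each value stays in range.

```python
import math
import random


def make_series(length=400, period=50):
    """Clean sinusoid standing in for 'normal' behavior (illustrative choice)."""
    return [math.sin(2 * math.pi * t / period) for t in range(length)]


def inject_point_outliers(series, count, magnitude, rng):
    """Overwrite `count` random positions with spikes of size `magnitude`."""
    out = list(series)
    positions = sorted(rng.sample(range(len(out)), count))
    for t in positions:
        out[t] = rng.choice((-1.0, 1.0)) * magnitude
    return out, positions


def inject_pattern_outlier(series, start, width):
    """Replace a window with a square wave: values stay in range, shape does not."""
    out = list(series)
    positions = list(range(start, start + width))
    for t in positions:
        out[t] = 1.0 if out[t] >= 0 else -1.0
    return out, positions


rng = random.Random(0)  # fixed seed so the sketch is reproducible
clean = make_series()
with_points, point_idx = inject_point_outliers(clean, count=5, magnitude=4.0, rng=rng)
with_both, pattern_idx = inject_pattern_outlier(with_points, start=200, width=50)

# Ground-truth labels: 1 where any outlier was injected, 0 elsewhere.
outlier_positions = set(point_idx) | set(pattern_idx)
labels = [1 if t in outlier_positions else 0 for t in range(len(clean))]
```

A detector can then be scored against `labels`. Keeping the injected positions alongside the series is the part that makes synthetic data useful for benchmarking, since the ground truth is known by construction.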
Author Information
Kwei-Herng Lai (Rice University)
Daochen Zha (Texas A&M University)
Junjie Xu (Pennsylvania State University)
Yue Zhao (Carnegie Mellon University)
I am pursuing a Ph.D. in Information Systems at Carnegie Mellon University, advised by Prof. Leman Akoglu. Unlike most IS researchers, I focus on data mining algorithms, systems, and applications. Research keywords: Outlier & Anomaly Detection; Ensemble Learning; Scalable Machine Learning; Machine Learning Systems.
Guanchu Wang (Rice University)
Xia Hu (Texas A&M University)
More from the Same Authors
- 2021: Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development
  Kexin Huang · Tianfan Fu · Wenhao Gao · Yue Zhao · Yusuf Roohani · Jure Leskovec · Connor Coley · Cao Xiao · Jimeng Sun · Marinka Zitnik
- 2022 Poster: ADBench: Anomaly Detection Benchmark
  Songqiao Han · Xiyang Hu · Hailiang Huang · Minqi Jiang · Yue Zhao
- 2022 Poster: BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs
  Kay Liu · Yingtong Dou · Yue Zhao · Xueying Ding · Xiyang Hu · Ruitong Zhang · Kaize Ding · Canyu Chen · Hao Peng · Kai Shu · Lichao Sun · Jundong Li · George H Chen · Zhihao Jia · Philip S Yu
- 2021 Poster: Dirichlet Energy Constrained Learning for Deep Graph Neural Networks
  Kaixiong Zhou · Xiao Huang · Daochen Zha · Rui Chen · Li Li · Soo-Hyun Choi · Xia Hu
- 2021 Poster: Fairness via Representation Neutralization
  Mengnan Du · Subhabrata Mukherjee · Guanchu Wang · Ruixiang Tang · Ahmed Awadallah · Xia Hu
- 2021 Poster: Automatic Unsupervised Outlier Model Selection
  Yue Zhao · Ryan Rossi · Leman Akoglu
- 2020 Poster: Towards Deeper Graph Neural Networks with Differentiable Group Normalization
  Kaixiong Zhou · Xiao Huang · Yuening Li · Daochen Zha · Rui Chen · Xia Hu
- 2020 Poster: Detecting Interactions from Neural Networks via Topological Analysis
  Zirui Liu · Qingquan Song · Kaixiong Zhou · Ting-Hsiang Wang · Ying Shan · Xia Hu