A Multi-Criteria Complexity Evaluation of Cyberattack Detection Datasets in Industrial Control Systems
Abstract
The complexity of Industrial Control Systems (ICS) datasets plays a crucial role in determining effective detection strategies. This paper introduces a formal, multi-dimensional technique for measuring dataset complexity, based on twelve complexity measures spanning four categories: feature-based, neighborhood-based, linearity-based, and topological. These measures allow class separability, local ambiguity, and decision-boundary complexity to be evaluated in a classifier-independent manner, offering insight into the structural and statistical challenges of ICS data. We further employ Evaluation based on Distance from the Average Solution (EDAS), a Multi-Criteria Decision-Making (MCDM) technique that uses both positive and negative deviations from the average solution, to rank and compare datasets in terms of their intrinsic complexity. The findings show substantial differences in complexity between the benchmark datasets and CISS2019.A1: the most separable datasets are Dataset 8.3 and Dataset 7.3, and the most difficult to classify are Dataset 2.1 and CISS2019.A1(4).
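To make the EDAS ranking step concrete, the following is a minimal sketch of the standard EDAS procedure in plain Python. It is not the authors' implementation. The function name `edas_scores`, the toy matrix, the equal weights, and the choice of which criteria count as beneficial are all illustrative assumptions, and the sketch assumes every criterion has a strictly positive average.

```python
from typing import List, Sequence


def edas_scores(
    matrix: Sequence[Sequence[float]],
    weights: Sequence[float],
    beneficial: Sequence[bool],
) -> List[float]:
    """Return the EDAS appraisal score of each alternative (row).

    matrix[i][j] is the value of criterion j for alternative i.
    beneficial[j] is True when larger values of criterion j are preferred.
    Assumption: the average of every criterion is strictly positive.
    """
    n, m = len(matrix), len(weights)

    # Step 1: average solution, one value per criterion.
    avg = [sum(row[j] for row in matrix) / n for j in range(m)]

    # Step 2: weighted sums of positive (SP) and negative (SN)
    # distances from the average solution.
    sp = [0.0] * n
    sn = [0.0] * n
    for i, row in enumerate(matrix):
        for j in range(m):
            # Orient the deviation so that diff > 0 always means "better than average".
            diff = (row[j] - avg[j]) if beneficial[j] else (avg[j] - row[j])
            sp[i] += weights[j] * max(0.0, diff) / avg[j]
            sn[i] += weights[j] * max(0.0, -diff) / avg[j]

    # Step 3: normalise; guard against the degenerate case where all rows are equal.
    max_sp, max_sn = max(sp), max(sn)
    nsp = [s / max_sp if max_sp > 0 else 0.0 for s in sp]
    nsn = [1.0 - s / max_sn if max_sn > 0 else 1.0 for s in sn]

    # Step 4: appraisal score is the mean of the two normalised values.
    return [(p + q) / 2.0 for p, q in zip(nsp, nsn)]


# Toy example (hypothetical values, not taken from the paper):
# two alternatives, two beneficial criteria, equal weights.
toy_scores = edas_scores([[1.0, 1.0], [3.0, 3.0]], [0.5, 0.5], [True, True])
```

In this toy case the second alternative is above average on both criteria and receives the higher appraisal score. How the twelve complexity measures are oriented (beneficial or cost) and weighted in the actual study is not stated in the abstract, so those inputs would need to be taken from the paper's method section.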