
Workshop: Data Centric AI

Homogenization of Existing Inertial-Based Datasets to Support Human Activity Recognition


Several techniques have been proposed to recognize activities of daily living from signals. Deep learning techniques applied to inertial signals have proven effective, achieving significant classification accuracy. In recent years, research on human activity recognition (HAR) has been almost entirely model-centric. However, experiments have shown that feeding deep learning models high-quality training data allows them both to perform well regardless of their architecture and to be more robust to intraclass variability and interclass similarity. Unfortunately, the data in the available datasets are not always of high quality. Moreover, the performance of a deep learning model generally improves with the size of the dataset used to train it, yet most publicly available datasets are small in terms of the number of subjects and/or the types of activities performed. In addition, these datasets are heterogeneous with respect to one another and therefore cannot be trivially combined to obtain a larger set.

The ultimate aim of our work is to define and implement a platform that integrates datasets of inertial signals, making large datasets of homogeneous signals available to the scientific community, enriched where possible with context information (e.g., characteristics of the subjects and device position). The platform's main focus is data quality, which is essential for training efficient models.
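
As an illustration only, the following minimal sketch shows what homogenizing one recording might involve. The abstract does not specify which properties differ across datasets, so the sketch assumes two common sources of heterogeneity: sampling rate and acceleration unit. The function name `homogenize`, its parameters, and the choice of linear interpolation are hypothetical and are not the platform's actual method.

```python
import numpy as np

G = 9.80665  # standard gravity in m/s^2, used to convert from g units


def homogenize(signal, source_hz, target_hz, unit="m/s^2"):
    """Hypothetical helper: resample a (n_samples, n_axes) inertial signal
    to target_hz and express it in m/s^2.

    Assumptions (not from the source): datasets differ in sampling rate and
    in acceleration unit ("g" or "m/s^2"); linear interpolation is an
    acceptable resampling method.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim != 2:
        raise ValueError("signal must have shape (n_samples, n_axes)")

    # Unit conversion: bring every dataset to m/s^2.
    if unit == "g":
        signal = signal * G
    elif unit != "m/s^2":
        raise ValueError(f"unsupported unit: {unit!r}")

    # Time stamps of the original samples, in seconds.
    n_samples = signal.shape[0]
    t_src = np.arange(n_samples) / source_hz

    # Time stamps at the target rate, covering the same duration.
    duration = (n_samples - 1) / source_hz
    n_out = int(np.floor(duration * target_hz + 1e-9)) + 1
    t_dst = np.arange(n_out) / target_hz

    # Interpolate each axis independently onto the new time grid.
    axes = [np.interp(t_dst, t_src, signal[:, k]) for k in range(signal.shape[1])]
    return np.column_stack(axes)


# Usage: a 1-second, 100 Hz recording in g units, resampled to 50 Hz.
raw = np.zeros((100, 3))
raw[:, 2] = 1.0  # device at rest, gravity on the z axis
unified = homogenize(raw, source_hz=100, target_hz=50, unit="g")
```

Linear interpolation applies no anti-aliasing filter, so a real pipeline that downsamples would likely low-pass filter first; context information such as subject characteristics or device position would need to be carried alongside the signal as separate metadata.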