Timezone: »

Ensemble Distillation for Robust Model Fusion in Federated Learning
Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi

Thu Dec 10 09:00 AM -- 11:00 AM (PST) @ Poster Session 5 #1653

Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model while keeping the training data decentralized. In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side. However, directly averaging model parameters is only possible if all models have the same structure and size, which could be a restrictive constraint in many scenarios.

In this work we investigate more powerful and more flexible aggregation schemes for FL. Specifically, we propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients. This knowledge distillation technique mitigates privacy risk and cost to the same extent as the baseline FL algorithms, but allows flexible aggregation over heterogeneous client models that can differ e.g. in size, numerical precision or structure. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10/100, ImageNet, AG News, SST2) and settings (heterogeneous models/data) that the server model can be trained much faster, requiring fewer communication rounds than any existing FL technique so far.

Author Information

Tao Lin (EPFL)
Lingjing Kong (EPFL)
Sebastian Stich (EPFL)

Dr. [Sebastian U. Stich](https://sstich.ch/) is a faculty at the CISPA Helmholtz Center for Information Security. Research interests: - *methods for machine learning and statistics*—at the interface of theory and practice - *collaborative learning* (distributed, federated and decentralized methods) - *optimization for machine learning* (adaptive stochastic methods and generalization performance)

Martin Jaggi (EPFL)

More from the Same Authors