Training Transformers Together
Alexander Borzunov · Max Ryabinin · Tim Dettmers · quentin lhoest · Lucile Saulnier · Michael Diskin · Yacine Jernite · Thomas Wolf
Event URL: https://training-transformers-together.github.io/
We invite volunteers to train a large Transformer language model over the Internet. Instead of using supercomputers, we will pool together all available computational resources: desktops, laptops, servers, and even cloud TPUs from around the world. All training artifacts, such as model checkpoints and optimizer states, will be shared online for public use.
For this demonstration, we will provide an open-source starter kit that volunteers can use to join the global distributed training run and host similar experiments independently in the future.
Author Information
Alexander Borzunov (HSE University, Yandex)
Max Ryabinin (Yandex, Higher School of Economics)
Tim Dettmers (University of Washington)
quentin lhoest (Hugging Face)
Lucile Saulnier (Hugging Face)
Michael Diskin (Yandex, Higher School of Economics)
Yacine Jernite (Facebook FAIR NYC)
Thomas Wolf (🤗 Hugging Face)
More from the Same Authors
- 2021 Poster: Distributed Deep Learning In Open Collaborations
  Michael Diskin · Alexey Bukhtiyarov · Max Ryabinin · Lucile Saulnier · quentin lhoest · Anton Sinitsin · Dmitry Popov · Dmitry V. Pyrkin · Maxim Kashirin · Alexander Borzunov · Albert Villanova del Moral · Denis Mazur · Ilia Kobelev · Yacine Jernite · Thomas Wolf · Gennady Pekhimenko
- 2021 Poster: Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
  Max Ryabinin · Eduard Gorbunov · Vsevolod Plokhotnyuk · Gennady Pekhimenko
- 2021 Poster: Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets
  Max Ryabinin · Andrey Malinin · Mark Gales
- 2020 Poster: Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
  Max Ryabinin · Anton Gusev
- 2020 Poster: Movement Pruning: Adaptive Sparsity by Fine-Tuning
  Victor Sanh · Thomas Wolf · Alexander Rush
- 2020: An introduction to transfer learning in NLP and HuggingFace
  Thomas Wolf
- 2013 Poster: Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests
  Yacine Jernite · Yoni Halpern · David Sontag