Generative Adversarial Networks (GANs) enjoy great success at image generation, but have proven difficult to train in the domain of natural language. Challenges with gradient estimation, optimization instability, and mode collapse have led practitioners to resort to maximum likelihood pre-training, followed by small amounts of adversarial fine-tuning. The benefits of GAN fine-tuning for language generation are unclear, as the resulting models produce comparable or worse samples than traditional language models. We show it is in fact possible to train a language GAN from scratch, without maximum likelihood pre-training. We combine existing techniques such as large batch sizes, dense rewards and discriminator regularization to stabilize and improve language GANs. The resulting model, ScratchGAN, performs comparably to maximum likelihood training on the EMNLP2017 News and WikiText-103 corpora according to quality and diversity metrics.
Author Information
Cyprien de Masson d'Autume (Google DeepMind)
Shakir Mohamed (DeepMind)

Shakir Mohamed is a senior staff scientist at DeepMind in London. Shakir's main interests lie at the intersection of approximate Bayesian inference, deep learning and reinforcement learning, and the role that machine learning systems at this intersection have in the development of more intelligent and general-purpose learning systems. Before moving to London, Shakir held a Junior Research Fellowship from the Canadian Institute for Advanced Research (CIFAR), based in Vancouver at the University of British Columbia with Nando de Freitas. Shakir completed his PhD with Zoubin Ghahramani at the University of Cambridge, where he was a Commonwealth Scholar to the United Kingdom. Shakir is from South Africa and completed his previous degrees in Electrical and Information Engineering at the University of the Witwatersrand, Johannesburg.
Mihaela Rosca (Google DeepMind)
Jack Rae (DeepMind, UCL)
More from the Same Authors
-
2021 Spotlight: Mind the Gap: Assessing Temporal Generalization in Neural Language Models »
Angeliki Lazaridou · Adhi Kuncoro · Elena Gribovskaya · Devang Agrawal · Adam Liska · Tayfun Terzi · Mai Gimenez · Cyprien de Masson d'Autume · Tomas Kocisky · Sebastian Ruder · Dani Yogatama · Kris Cao · Susannah Barlow · Phil Blunsom -
2022 : Advancing the participatory approach to AI in Mental Health »
Wilson Lee · Munmun De Choudhury · Morgan Scheuerman · Julia Hamer-Hunt · Dan Joyce · Nenad Tomasev · Kevin McKee · Shakir Mohamed · Danielle Belgrave · Christopher Burr -
2022 Poster: An empirical analysis of compute-optimal large language model training »
Jordan Hoffmann · Sebastian Borgeaud · Arthur Mensch · Elena Buchatskaya · Trevor Cai · Eliza Rutherford · Diego de Las Casas · Lisa Anne Hendricks · Johannes Welbl · Aidan Clark · Thomas Hennigan · Eric Noland · Katherine Millican · George van den Driessche · Bogdan Damoc · Aurelia Guy · Simon Osindero · Karén Simonyan · Erich Elsen · Oriol Vinyals · Jack Rae · Laurent Sifre -
2022 Poster: Why neural networks find simple solutions: The many regularizers of geometric complexity »
Benoit Dherin · Michael Munn · Mihaela Rosca · David Barrett -
2021 : Invited Talk 2 »
Mihaela Rosca -
2021 Poster: Mind the Gap: Assessing Temporal Generalization in Neural Language Models »
Angeliki Lazaridou · Adhi Kuncoro · Elena Gribovskaya · Devang Agrawal · Adam Liska · Tayfun Terzi · Mai Gimenez · Cyprien de Masson d'Autume · Tomas Kocisky · Sebastian Ruder · Dani Yogatama · Kris Cao · Susannah Barlow · Phil Blunsom -
2020 : Panel Discussions »
Grace Lindsay · George Konidaris · Shakir Mohamed · Kimberly Stachenfeld · Peter Dayan · Yael Niv · Doina Precup · Catherine Hartley · Ishita Dasgupta -
2020 : Invited talk 1 QnA: Shakir Mohamed »
Shakir Mohamed · Feryal Behbahani · Raymond Chua -
2020 : Invited Talk #1 Shakir Mohamed : Pain and Machine Learning »
Shakir Mohamed -
2020 : Q&A with Shakir »
Shakir Mohamed -
2020 : Invited: Shakir Mohamed »
Shakir Mohamed -
2020 Poster: Top-KAST: Top-K Always Sparse Training »
Siddhant Jayakumar · Razvan Pascanu · Jack Rae · Simon Osindero · Erich Elsen -
2020 : Women at DeepMind: Applying for technical roles »
Feryal Behbahani · Mihaela Rosca · Kate Parkyn -
2020 : Policy Panel »
Roya Pakzad · Dia Kayyali · Marzyeh Ghassemi · Shakir Mohamed · Mohammad Norouzi · Ted Pedersen · Anver Emon · Abubakar Abid · Darren Byler · Samhaa R. El-Beltagy · Nayel Shafei · Mona Diab -
2020 Affinity Workshop: Muslims in ML »
Marzyeh Ghassemi · Mohammad Norouzi · Shakir Mohamed · Aya Salama · Tasmie Sarker -
2019 : Poster session »
Sebastian Farquhar · Erik Daxberger · Andreas Look · Matt Benatan · Ruiyi Zhang · Marton Havasi · Fredrik Gustafsson · James A Brofos · Nabeel Seedat · Micha Livne · Ivan Ustyuzhaninov · Adam Cobb · Felix D McGregor · Patrick McClure · Tim R. Davidson · Gaurush Hiranandani · Sanjeev Arora · Masha Itkina · Didrik Nielsen · William Harvey · Matias Valdenegro-Toro · Stefano Peluchetti · Riccardo Moriconi · Tianyu Cui · Vaclav Smidl · Taylan Cemgil · Jack Fitzsimons · He Zhao · mariana vargas vieyra · Apratim Bhattacharyya · Rahul Sharma · Geoffroy Dubourg-Felonneau · Jonathan Warrell · Slava Voloshynovskiy · Mihaela Rosca · Jiaming Song · Andrew Ross · Homa Fashandi · Ruiqi Gao · Hooshmand Shokri Razaghi · Joshua Chang · Zhenzhong Xiao · Vanessa Boehm · Giorgio Giannone · Ranganath Krishnan · Joe Davison · Arsenii Ashukha · Jeremiah Liu · Sicong (Sheldon) Huang · Evgenii Nikishin · Sunho Park · Nilesh Ahuja · Mahesh Subedar · Artyom Gadetsky · Jhosimar Arias Figueroa · Tim G. J. Rudner · Waseem Aslam · Adrián Csiszárik · John Moberg · Ali Hebbal · Kathrin Grosse · Pekka Marttinen · Bang An · Hlynur Jónsson · Samuel Kessler · Abhishek Kumar · Mikhail Figurnov · Omesh Tickoo · Steindor Saemundsson · Ari Heljakka · Dániel Varga · Niklas Heim · Simone Rossi · Max Laves · Waseem Gharbieh · Nicholas Roberts · Luis Armando Pérez Rey · Matthew Willetts · Prithvijit Chakrabarty · Sumedh Ghaisas · Carl Shneider · Wray Buntine · Kamil Adamczewski · Xavier Gitiaux · Suwen Lin · Hao Fu · Gunnar Rätsch · Aidan Gomez · Erik Bodin · Dinh Phung · Lennart Svensson · Juliano Tusi Amaral Laganá Pinto · Milad Alizadeh · Jianzhun Du · Kevin Murphy · Beatrix Benkő · Shashaank Vattikuti · Jonathan Gordon · Christopher Kanan · Sontje Ihler · Darin Graham · Michael Teng · Louis Kirsch · Tomas Pevny · Taras Holotyak -
2019 Poster: Episodic Memory in Lifelong Language Learning »
Cyprien de Masson d'Autume · Sebastian Ruder · Lingpeng Kong · Dani Yogatama -
2018 Poster: Implicit Reparameterization Gradients »
Mikhail Figurnov · Shakir Mohamed · Andriy Mnih -
2018 Spotlight: Implicit Reparameterization Gradients »
Mikhail Figurnov · Shakir Mohamed · Andriy Mnih -
2018 Poster: Neural Arithmetic Logic Units »
Andrew Trask · Felix Hill · Scott Reed · Jack Rae · Chris Dyer · Phil Blunsom -
2018 Poster: Relational recurrent neural networks »
Adam Santoro · Ryan Faulkner · David Raposo · Jack Rae · Mike Chrzanowski · Theophane Weber · Daan Wierstra · Oriol Vinyals · Razvan Pascanu · Timothy Lillicrap -
2016 : Panel Discussion »
Shakir Mohamed · David Blei · Ryan Adams · José Miguel Hernández-Lobato · Ian Goodfellow · Yarin Gal -
2016 : Bayesian Agents: Bayesian Reasoning and Deep Learning in Agent-based Systems »
Shakir Mohamed -
2016 Poster: Unsupervised Learning of 3D Structure from Images »
Danilo Jimenez Rezende · S. M. Ali Eslami · Shakir Mohamed · Peter Battaglia · Max Jaderberg · Nicolas Heess -
2016 Tutorial: Variational Inference: Foundations and Modern Methods »
David Blei · Shakir Mohamed · Rajesh Ranganath -
2015 Workshop: Advances in Approximate Bayesian Inference »
Dustin Tran · Tamara Broderick · Stephan Mandt · James McInerney · Shakir Mohamed · Alp Kucukelbir · Matthew D. Hoffman · Neil Lawrence · David Blei -
2015 Poster: Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning »
Shakir Mohamed · Danilo Jimenez Rezende -
2014 Workshop: Advances in Variational Inference »
David Blei · Shakir Mohamed · Michael Jordan · Charles Blundell · Tamara Broderick · Matthew D. Hoffman -
2014 Poster: Semi-supervised Learning with Deep Generative Models »
Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling -
2014 Spotlight: Semi-supervised Learning with Deep Generative Models »
Diederik Kingma · Shakir Mohamed · Danilo Jimenez Rezende · Max Welling -
2012 Workshop: Bayesian Optimization and Decision Making »
Javad Azimi · Roman Garnett · Frank R Hutter · Shakir Mohamed -
2012 Poster: Expectation Propagation in Gaussian Process Dynamical Systems »
Marc Deisenroth · Shakir Mohamed -
2012 Poster: Fast Bayesian Inference for Non-Conjugate Gaussian Process Regression »
Mohammad Emtiyaz Khan · Shakir Mohamed · Kevin Murphy -
2009 Poster: Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process »
Shakir Mohamed · David A Knowles · Zoubin Ghahramani · Finale P Doshi-Velez -
2008 Poster: Bayesian Exponential Family PCA »
Shakir Mohamed · Katherine Heller · Zoubin Ghahramani -
2008 Spotlight: Bayesian Exponential Family PCA »
Shakir Mohamed · Katherine Heller · Zoubin Ghahramani