Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and coordinates in one-hot pixel space. Although convolutional networks would seem appropriate for this task, we show that they fail spectacularly. We demonstrate and carefully analyze the failure first on a toy problem, at which point a simple fix becomes obvious. We call this solution CoordConv, which works by giving convolution access to its own input coordinates through the use of extra coordinate channels. Without sacrificing the computational and parametric efficiency of ordinary convolution, CoordConv allows networks to learn either complete translation invariance or varying degrees of translation dependence, as required by the end task. CoordConv solves the coordinate transform problem with perfect generalization, 150 times faster, and with 10--100 times fewer parameters than convolution. This stark contrast raises the question: to what extent has this inability of convolution persisted insidiously inside other tasks, subtly hampering performance from within? A complete answer to this question will require further investigation, but we show preliminary evidence that swapping convolution for CoordConv can improve models on a diverse set of tasks. Using CoordConv in a GAN produced less mode collapse, as the transform between high-level spatial latents and pixels became easier to learn. A Faster R-CNN detection model trained on MNIST detection showed 24% better IOU when using CoordConv, and in the Reinforcement Learning (RL) domain agents playing Atari games benefited significantly from the use of CoordConv layers.
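The mechanism the abstract describes, appending coordinate channels to a convolution's input, is simple enough to sketch. Below is a minimal, hypothetical PyTorch-style implementation; the class name CoordConv2d and the normalization of coordinates to [-1, 1] are illustrative assumptions, not the authors' reference code.

```python
import torch
import torch.nn as nn

class CoordConv2d(nn.Module):
    """Conv2d that first concatenates row/column coordinate channels (a sketch)."""
    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        # Two extra input channels carry the row and column coordinates.
        self.conv = nn.Conv2d(in_channels + 2, out_channels, **kwargs)

    def forward(self, x):
        b, _, h, w = x.shape
        # Coordinates normalized to [-1, 1] (an assumption), broadcast over the batch.
        ys = torch.linspace(-1, 1, h, device=x.device, dtype=x.dtype)
        xs = torch.linspace(-1, 1, w, device=x.device, dtype=x.dtype)
        ys = ys.view(1, 1, h, 1).expand(b, 1, h, w)
        xs = xs.view(1, 1, 1, w).expand(b, 1, h, w)
        # Append the coordinate channels, then apply an ordinary convolution.
        return self.conv(torch.cat([x, ys, xs], dim=1))

# Usage: a drop-in replacement for nn.Conv2d.
layer = CoordConv2d(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(2, 3, 64, 64))  # -> shape (2, 16, 64, 64)
```

Because the layer only adds two input channels to the underlying convolution, it preserves the computational and parametric efficiency the abstract claims, while letting the filters use or ignore the coordinates as the task requires.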
Author Information
Rosanne Liu (Uber AI Labs)
Joel Lehman (Uber AI Labs)
Piero Molino (Uber AI Labs)
Felipe Petroski Such (Uber AI Labs)
Eric Frank (Uber AI Labs)
Alex Sergeev (Uber Technologies Inc.)
Jason Yosinski (Uber AI Labs; Recursion)
Dr. Jason Yosinski is a machine learning researcher, was a founding member of Uber AI Labs, and is a scientific adviser to Recursion Pharmaceuticals and several other companies. His work focuses on building more capable and more understandable AI. As scientists and engineers build increasingly powerful AI systems, the abilities of these systems increase faster than does our understanding of them, motivating much of his work on AI Neuroscience: an emerging field of study that investigates fundamental properties and behaviors of AI systems. Dr. Yosinski completed his PhD as a NASA Space Technology Research Fellow working at the Cornell Creative Machines Lab, the University of Montreal, Caltech/NASA Jet Propulsion Laboratory, and Google DeepMind. His work on AI has been featured by NPR, Fast Company, The Economist, TEDx, XKCD, and the BBC. Prior to his academic career, Jason cofounded two web technology companies and started a program in the Los Angeles school district that teaches students algebra via hands-on robotics. In his free time, Jason enjoys cooking, sailing, motorcycling, reading, paragliding, and sometimes pretending he's an artist.
More from the Same Authors
- 2020 Poster: Supermasks in Superposition »
  Mitchell Wortsman · Vivek Ramanujan · Rosanne Liu · Aniruddha Kembhavi · Mohammad Rastegari · Jason Yosinski · Ali Farhadi
- 2019: Panel - The Role of Communication at Large: Aparna Lakshmiratan, Jason Yosinski, Been Kim, Surya Ganguli, Finale Doshi-Velez »
  Aparna Lakshmiratan · Finale Doshi-Velez · Surya Ganguli · Zachary Lipton · Michela Paganini · Anima Anandkumar · Jason Yosinski
- 2019 Workshop: Retrospectives: A Venue for Self-Reflection in ML Research »
  Ryan Lowe · Yoshua Bengio · Joelle Pineau · Michela Paganini · Jessica Forde · Shagun Sodhani · Abhishek Gupta · Joel Lehman · Peter Henderson · Kanika Madan · Koustuv Sinha · Xavier Bouthillier
- 2019 Poster: Hamiltonian Neural Networks »
  Sam Greydanus · Misko Dzamba · Jason Yosinski
- 2019 Poster: LCA: Loss Change Allocation for Neural Network Training »
  Janice Lan · Rosanne Liu · Hattie Zhou · Jason Yosinski
- 2019 Poster: Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask »
  Hattie Zhou · Janice Lan · Rosanne Liu · Jason Yosinski
- 2018: Jason Yosinski, "Good and bad assumptions in model design and interpretability" »
  Jason Yosinski
- 2018 Poster: Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents »
  Edoardo Conti · Vashisht Madhavan · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune
- 2018 Poster: Faster Neural Networks Straight from JPEG »
  Lionel Gueguen · Alex Sergeev · Ben Kadlec · Rosanne Liu · Jason Yosinski
- 2017 Symposium: Interpretable Machine Learning »
  Andrew Wilson · Jason Yosinski · Patrice Simard · Rich Caruana · William Herlands
- 2017 Poster: SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability »
  Maithra Raghu · Justin Gilmer · Jason Yosinski · Jascha Sohl-Dickstein
- 2016 Demonstration: Adventures with Deep Generator Networks »
  Jason Yosinski · Anh Nguyen · Jeff Clune · Douglas K Bemis
- 2016 Poster: Synthesizing the preferred inputs for neurons in neural networks via deep generator networks »
  Anh Nguyen · Alexey Dosovitskiy · Jason Yosinski · Thomas Brox · Jeff Clune
- 2014 Poster: How transferable are features in deep neural networks? »
  Jason Yosinski · Jeff Clune · Yoshua Bengio · Hod Lipson
- 2014 Demonstration: Playing with Convnets »
  Jason Yosinski · Hod Lipson
- 2014 Oral: How transferable are features in deep neural networks? »
  Jason Yosinski · Jeff Clune · Yoshua Bengio · Hod Lipson