Improved Schemes for Episodic Memory-based Lifelong Learning
Yunhui Guo · Mingrui Liu · Tianbao Yang · Tajana S Rosing

Wed Dec 09 07:00 PM -- 07:10 PM (PST) @ Orals & Spotlights: Graph/Meta Learning/Software

Current deep neural networks can achieve remarkable performance on a single task. However, when a deep neural network is continually trained on a sequence of tasks, it seems to gradually forget previously learned knowledge. This phenomenon is referred to as catastrophic forgetting and motivates the field called lifelong learning. Recently, episodic memory-based approaches such as GEM and A-GEM have shown remarkable performance. In this paper, we provide the first unified view of episodic memory-based approaches from an optimization perspective. This view leads to two improved schemes for episodic memory-based lifelong learning, called MEGA-I and MEGA-II. MEGA-I and MEGA-II modulate the balance between old tasks and the new task by integrating the current gradient with the gradient computed on the episodic memory. Notably, we show that GEM and A-GEM are degenerate cases of MEGA-I and MEGA-II that consistently put the same emphasis on the current task, regardless of how the loss changes over time. Our proposed schemes address this issue by using novel loss-balancing update rules, which drastically improve performance over GEM and A-GEM. Extensive experimental results show that the proposed schemes significantly advance the state of the art on four commonly used lifelong learning benchmarks, reducing the error by up to 18%.
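The idea of mixing the current-task gradient with the episodic-memory gradient can be illustrated with a minimal sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: the function names are hypothetical, `agem_direction` follows the commonly described A-GEM projection (the current gradient always keeps weight 1), and `loss_balanced_weights` is one plausible loss-based weighting, not necessarily the exact MEGA-I or MEGA-II rule.

```python
def dot(u, v):
    """Inner product of two equal-length gradient vectors (plain lists)."""
    return sum(a * b for a, b in zip(u, v))


def combine_gradients(g_cur, g_mem, alpha_cur, alpha_mem):
    """Weighted mix of the current-task and episodic-memory gradients."""
    return [alpha_cur * a + alpha_mem * b for a, b in zip(g_cur, g_mem)]


def agem_direction(g_cur, g_mem):
    """A-GEM-style direction: project g_cur only when it conflicts with g_mem.

    The weight on g_cur is always 1, whatever the losses are.
    """
    inner = dot(g_cur, g_mem)
    if inner >= 0:
        return list(g_cur)  # no conflict: use the current gradient as is
    scale = inner / dot(g_mem, g_mem)
    # Remove the component of g_cur that points against g_mem.
    return combine_gradients(g_cur, g_mem, 1.0, -scale)


def loss_balanced_weights(loss_cur, loss_mem, eps=1e-3):
    """Hypothetical loss-balancing rule (an assumption, not from the abstract).

    While the current loss is large, keep full weight on the current task and
    weight the memory gradient by the loss ratio; once the current loss is
    small, attend to the memory gradient only.
    """
    if loss_cur > eps:
        return 1.0, loss_mem / loss_cur
    return 0.0, 1.0


# Conflicting gradients: the projected direction is orthogonal to g_mem.
g_cur, g_mem = [1.0, 0.0], [-1.0, 1.0]
projected = agem_direction(g_cur, g_mem)

# Loss-dependent weights change the mix as the losses evolve.
a_cur, a_mem = loss_balanced_weights(loss_cur=2.0, loss_mem=1.0)
mixed = combine_gradients(g_cur, g_mem, a_cur, a_mem)
```

In this sketch the A-GEM-style direction depends only on the two gradients, while the loss-balanced weights shift emphasis toward the memory as the current-task loss shrinks, which is the contrast the abstract draws.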

Author Information

Yunhui Guo (University of California, San Diego)
Mingrui Liu (Boston University)
Tianbao Yang (The University of Iowa)
Tajana S Rosing (UCSD)

Tajana Šimunić Rosing is a Professor, holder of the Fratamico Endowed Chair, an IEEE Fellow, and director of the System Energy Efficiency Lab at UCSD. Her research interests are in energy-efficient computing and in cyber-physical and distributed systems. She is leading a number of projects, including efforts funded by the DARPA/SRC JUMP CRISP program with a focus on the design of accelerators for the analysis of big data, a project focused on developing AI systems in support of healthy living, an SRC-funded project on IoT system reliability and maintainability, and an NSF-funded project on the design and calibration of air-quality sensors, among others. She recently headed the effort on SmartCities that was part of the DARPA- and industry-funded TerraSwarm center. Tajana led the energy-efficient datacenters theme in the MuSyC center, as well as a number of large projects funded by both industry and government focused on power and thermal management. Tajana’s work on proactive thermal management and ambient-driven thermal modeling was instrumental in laying the groundwork in this field, and has since resulted in a number of industrial implementations of these ideas. Her research on event-driven dynamic power management laid the mathematical foundations for the engineering problem, devised a globally optimal solution, and, more importantly, defined the framework for future researchers to approach these kinds of problems in embedded system design. From 1998 until 2005 she was a full-time research scientist at HP Labs while also leading research efforts at Stanford University. She finished her PhD in EE in 2001 at Stanford, concurrently with finishing her Masters in Engineering Management. Her PhD topic was dynamic management of power consumption. Prior to pursuing the PhD, she worked as a senior design engineer at Altera Corporation.
She has served on a number of Technical Paper Committees and as an Associate Editor of IEEE Transactions on Mobile Computing, an Associate Editor of IEEE Transactions on Circuits and Systems, and a Guest Editor for the Special Issue of IEEE Transactions on VLSI.

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors