This work quantifies how much review-classification accuracy degrades when state-of-the-art Transformer models are subjected to distribution shifts, and offers a solution that significantly reduces this degradation. The extent of degradation depends on the independent variable across which the shift is created. Specifically, in our experiments, time and sentiment shifts show accuracy drops of up to 10%, whereas shifts between industry and product sectors show accuracy drops of 20-40%. We provide ablation experiments with different Transformer architectures, such as BERT, T5, and Jurassic-I, and study their relationship with this degradation. The suggested solution reuses the base of the model trained on one distribution and fine-tunes the final dense layer to support the new distribution that is seen once the model is deployed. This uses just 100-300 samples from the unseen distribution, compared to the previous 10,000, while cutting the accuracy drop in half.
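The adaptation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a fixed random projection stands in for the pretrained Transformer base (assumed here to stay frozen), a logistic-regression layer stands in for the final dense layer, and the names `make_frozen_base`, `fine_tune_head`, and `accuracy`, as well as the synthetic data, are hypothetical.

```python
import math
import random


def make_frozen_base(in_dim, hidden_dim, seed=0):
    """Hypothetical stand-in for a pretrained base: fixed random weights + tanh."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]

    def base(x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

    return base


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fine_tune_head(base, samples, labels, head, lr=0.1, epochs=50):
    """Update only the final dense layer (weights, bias); the base is never changed."""
    features = [base(x) for x in samples]  # base is frozen, so features are computed once
    w, b = list(head[0]), head[1]          # copy so the original head is left intact
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y                      # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def accuracy(base, head, samples, labels):
    w, b = head
    correct = 0
    for x, y in zip(samples, labels):
        p = sigmoid(sum(wi * fi for wi, fi in zip(w, base(x))) + b)
        correct += int((p >= 0.5) == bool(y))
    return correct / len(samples)


# Synthetic "new distribution": 200 labeled samples (an assumption, chosen to sit
# in the 100-300 range mentioned in the abstract).
rng = random.Random(1)
new_x = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
new_y = [1 if x[0] + x[1] > 0 else 0 for x in new_x]

base = make_frozen_base(in_dim=4, hidden_dim=16)
old_head = ([0.0] * 16, 0.0)  # placeholder for the head trained on the old distribution
new_head = fine_tune_head(base, new_x, new_y, old_head)

print("before:", accuracy(base, old_head, new_x, new_y))
print("after: ", accuracy(base, new_head, new_x, new_y))
```

Because the base is never updated, its features can be computed once and reused across epochs, which is part of why adapting with only a few hundred samples is cheap in a setup like this one.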
Sehaj Chawla (Harvard University)
Nikhil Singh (Massachusetts Institute of Technology)
Iddo Drori (Columbia University and Cornell University)
Iddo Drori is a visiting Associate Professor in the School of Operations Research and Information Engineering at Cornell University and an adjunct Associate Professor in the Department of Computer Science at Columbia University. From 2017 to 2019 he was a research scientist and adjunct Professor at the NYU Center for Data Science and the Courant Institute while teaching at Columbia University and NYU Tandon. From 2016 to 2017 he was a senior lecturer in Computer Science at Colman and a lecturer at Tel Aviv University. He completed his MSc and BSc in Computer Science and Mathematics at the Hebrew University with honors in the special program for outstanding students, his PhD in Computer Science at Tel Aviv University, and a post-doc in Statistics at Stanford University. In the past year he has published 13 papers on automated machine learning; meta-learning for graph algorithms; synthesizing language, vision, and audio; and protein structure prediction. He also served on five program committees in the past year. In the past two years he taught 13 courses on Deep Learning, Data Science, Machine Learning, and Optimization. Iddo has industry experience: between 2011 and 2016 he founded and served as CEO of a data science start-up acquired in 2017, after working as a research scientist for companies acquired by Anaplan, Daz3D, and Apple. Iddo enjoys teaching and has received awards for mentoring the best capstone project at Colman and for the best teaching evaluations at Tel Aviv University, and he mentored the winning teams in the ICCV 2019 Learning to Drive Challenge in the Deep Learning course at Columbia University. He is also writing a forthcoming book titled The Science of Deep Learning, to be published by Cambridge University Press.
More from the Same Authors
2020 : Paper 11: Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models
Nick Lamm · Iddo Drori
2018 : Contributed Work
Thaer Moustafa Dieb · Aditya Balu · Amir H. Khasahmadi · Viraj Shah · Boris Knyazev · Payel Das · Garrett Goh · Georgy Derevyanko · Gianni De Fabritiis · Reiko Hagawa · John Ingraham · David Belanger · Jialin Song · Kim Nicoli · Miha Skalic · Michelle Wu · Niklas Gebauer · Peter Bjørn Jørgensen · Ryan-Rhys Griffiths · Shengchao Liu · Sheshera Mysore · Hai Leong Chieu · Philippe Schwaller · Bart Olsthoorn · Bianca-Cristina Cristescu · Wei-Cheng Tseng · Seongok Ryu · Iddo Drori · Kevin Yang · Soumya Sanyal · Zois Boukouvalas · Rishi Bedi · Arindam Paul · Sambuddha Ghosal · Daniil Bash · Clyde Fare · Zekun Ren · Ali Oskooei · Minn Xuan Wong · Paul Sinz · Théophile Gaudin · Wengong Jin · Paul Leu