We demonstrate how to improve the zero-shot and few-shot performance of large language models (LLMs) by using the T-Few parameter-efficient fine-tuning method (Liu et al., 2022) with self-training or co-training. Our methods apply to settings where labeled data is very limited, but unlabeled data is plentiful. Specifically, we combine T-Few with (i) the co-training techniques of Lang et al. (2022a), and (ii) SETRED, a self-training algorithm that uses a very simple data selection criterion (Li and Zhou, 2005). By using the efficient T-Few method, we are able to scale co-training to larger models (from T0-3B to T0-11B) and cut down on wall-clock training time, improving the zero-shot co-training results of Lang et al. (2022a). By performing multiple iterations of self- or co-training, we significantly improve over the few-shot performance of T-Few reported by Liu et al. (2022) without using any additional labeled data. Our methods are relatively fast (2.5 hours to self-train T0-11B on a single A100 80GB) and allow T0-11B to match the few-shot performance of models with an order of magnitude more parameters.
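As a rough illustration of the iterative self-training loop described above, the sketch below repeatedly pseudo-labels unlabeled data, keeps the most confident predictions, and refits on the labeled plus selected examples. It is not the method of the paper: a nearest-centroid classifier stands in for T-Few fine-tuning of T0, and a plain confidence ranking stands in for the SETRED selection criterion. All names here (`fit_centroids`, `predict_proba`, `self_train`, `keep_frac`) are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy stand-in for fine-tuning: one mean vector per class.
    Assumes every class has at least one example in (X, y)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    """Softmax over negative squared distances to each class centroid."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def self_train(X_lab, y_lab, X_unl, n_classes=2, n_iters=3, keep_frac=0.5):
    """Generic self-training with confidence-based selection
    (an assumption for illustration, not SETRED's criterion)."""
    model = fit_centroids(X_lab, y_lab, n_classes)
    for _ in range(n_iters):
        proba = predict_proba(model, X_unl)
        pseudo = proba.argmax(axis=1)        # pseudo-labels
        conf = proba.max(axis=1)             # confidence per example
        k = max(1, int(keep_frac * len(X_unl)))
        keep = np.argsort(-conf)[:k]         # k most confident examples
        # Refit on the labeled data plus the selected pseudo-labeled data.
        X_train = np.concatenate([X_lab, X_unl[keep]])
        y_train = np.concatenate([y_lab, pseudo[keep]])
        model = fit_centroids(X_train, y_train, n_classes)
    return model
```

No additional labeled data enters the loop: each round only changes which pseudo-labeled examples are trusted, which is the property the abstract relies on.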
Author Information
Hunter Lang (MIT)
Monica Agrawal (Massachusetts Institute of Technology)
Yoon Kim (Massachusetts Institute of Technology)
David Sontag (MIT)
More from the Same Authors
-
2022 : Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors »
Thomas Hartvigsen · Swami Sankaranarayanan · Hamid Palangi · Yoon Kim · Marzyeh Ghassemi -
2022 Poster: Falsification before Extrapolation in Causal Effect Estimation »
Zeshan M Hussain · Michael Oberst · Ming-Chieh Shih · David Sontag -
2022 Poster: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets »
Nikolaj Thams · Michael Oberst · David Sontag -
2022 Poster: Training Subset Selection for Weak Supervision »
Hunter Lang · Aravindan Vijayaraghavan · David Sontag -
2022 Poster: ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography »
Ahmed M. Alaa · Anthony Philippakis · David Sontag -
2021 Poster: Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance »
Justin Lim · Christina Ji · Michael Oberst · Saul Blecker · Leora Horwitz · David Sontag -
2019 Poster: Using Statistics to Automate Stochastic Optimization »
Hunter Lang · Lin Xiao · Pengchuan Zhang -
2019 Poster: Understanding the Role of Momentum in Stochastic Gradient Methods »
Igor Gitman · Hunter Lang · Pengchuan Zhang · Lin Xiao -
2018 : TBC 13 »
David Sontag -
2018 Poster: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag -
2018 Spotlight: Why Is My Classifier Discriminatory? »
Irene Chen · Fredrik Johansson · David Sontag -
2017 : Invited Talk 4 »
David Sontag -
2017 Poster: Causal Effect Inference with Deep Latent-Variable Models »
Christos Louizos · Uri Shalit · Joris Mooij · David Sontag · Richard Zemel · Max Welling -
2015 Workshop: Machine Learning For Healthcare (MLHC) »
Theofanis Karaletsos · Rajesh Ranganath · Suchi Saria · David Sontag -
2015 Poster: Barrier Frank-Wolfe for Marginal Inference »
Rahul G Krishnan · Simon Lacoste-Julien · David Sontag -
2013 Poster: Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests »
Yacine Jernite · Yoni Halpern · David Sontag -
2011 Poster: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2011 Spotlight: Complexity of Inference in Latent Dirichlet Allocation »
David Sontag · Daniel Roy -
2010 Spotlight: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson -
2010 Poster: More data means less inference: A pseudo-max approach to structured learning »
David Sontag · Ofer Meshi · Tommi Jaakkola · Amir Globerson -
2009 Workshop: Approximate Learning of Large Scale Graphical Models »
Russ Salakhutdinov · Amir Globerson · David Sontag -
2008 Workshop: Approximate inference - how far have we come? »
Amir Globerson · David Sontag · Tommi Jaakkola -
2008 Poster: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola -
2008 Spotlight: Clusters and Coarse Partitions in LP Relaxations »
David Sontag · Amir Globerson · Tommi Jaakkola -
2007 Oral: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola -
2007 Poster: New Outer Bounds on the Marginal Polytope »
David Sontag · Tommi Jaakkola