We present STRSAGA, an algorithm for efficiently maintaining a machine learning model over data points that arrive over time, quickly updating the model as new training data is observed. We present a competitive analysis comparing the sub-optimality of the model maintained by STRSAGA with that of an offline algorithm that is given the entire data set beforehand, and we analyze the risk-competitiveness of STRSAGA under different arrival patterns. Our theoretical and experimental results show that the risk of STRSAGA is comparable to that of offline algorithms on a variety of input arrival patterns, and that its experimental performance is significantly better than that of prior algorithms suited for streaming data, such as SGD and SSVRG.
Ellango Jothimurugesan (CMU)
Ashraf Tahmasbi (Iowa State University)
Phillip Gibbons (CMU)
Srikanta Tirthapura (Iowa State University)