Abstract
We present an algorithm STRSAGA that can efficiently maintain a machine learning model over data points that arrive over time, and quickly update the model as new training data are observed. We present a competitive analysis that compares the suboptimality of the model maintained by STRSAGA with that of an offline algorithm that is given the entire data beforehand. Our theoretical and experimental results show that the risk of STRSAGA is comparable to that of an offline algorithm on a variety of input arrival patterns, and its experimental performance is significantly better than prior algorithms suited for streaming data, such as SGD and SSVRG.