Abstract
By reducing optimization to a sequence of small subproblems, working set methods achieve fast convergence times for many challenging problems. Despite excellent performance, theoretical understanding of working sets is limited, and implementations often resort to heuristics to determine subproblem size, makeup, and stopping criteria. We propose BLITZ, a fast working set algorithm accompanied by useful guarantees. Making no assumptions on data, our theory relates subproblem size to progress toward convergence. This result motivates methods for optimizing algorithmic parameters and discarding irrelevant variables as iterations progress. Applied to ℓ1-regularized learning, BLITZ convincingly outperforms existing solvers in sequential, limited-memory, and distributed settings. BLITZ is not specific to ℓ1-regularized learning, making the algorithm relevant to many applications involving sparsity or constraints.