Abstract
We develop a general accelerated proximal coordinate descent algorithm in distributed settings (DisAPCG) for the optimization problem that minimizes the sum of two convex functions: the first part f is smooth with a gradient oracle, and the other one ? is separable with respect to blocks of coordinate and has a simple known structure (e.g., L1 norm). Our algorithm gets new accelerated convergence rate in the case that f is strongly convex by making use of modern parallel structures, and includes previous non-strongly case as a special case. We further present efficient implementations to avoid full-dimensional operations in each step, significantly reducing the computation cost. Experiments on the regularized empirical risk minimization problem demonstrate the effectiveness of our algorithm and match our theoretical findings.