Abstract
Variational Gaussian (VG) inference methods that optimize a lower bound to the marginal likelihood are a popular approach for Bayesian inference. A difficulty remains in computation of the lower bound when the latent dimensionality L is large. Even though the lower bound is concave for many models, its computation requires optimization over variational parameters. Efficient reparameterization schemes can reduce the number of parameters, but give inaccurate solutions or destroy concavity leading to slow convergence. We propose decoupled variational inference that brings the best of both worlds together. First, it maximizes a Lagrangian of the lower bound reducing the number of parameters to O(N ), where N is the number of data examples. The reparameterization obtained is unique and recovers maxima of the lower-bound even when it is not concave. Second, our method maximizes the lower bound using a sequence of convex problems, each of which is parallellizable over data examples. Each gradient computation reduces to prediction in a pseudo linear regression model, thereby avoiding all direct computations of the covariance and only requiring its linear projections. Theoretically, our method converges at the same rate as existing methods in the case of concave lower bounds, while remaining convergent at a reasonable rate for the non-concave case.