Abstract
In regression problems involving vector-valued outputs (or equivalently, multiple responses), it is well known that the maximum likelihood estimator (MLE), which takes the noise covariance structure into account, can be significantly more accurate than the ordinary least squares (OLS) estimator. However, the existing literature compares OLS and MLE in terms of their asymptotic, not finite-sample, guarantees. More crucially, computing the MLE in general requires solving a non-convex optimization problem and is not known to be efficiently solvable. We provide finite-sample upper and lower bounds on the estimation error of OLS and MLE in two popular models: a) the Pooled model, and b) the Seemingly Unrelated Regression (SUR) model. We provide precise instances where the MLE is significantly more accurate than OLS. Furthermore, for both models, we show that the output of a computationally efficient alternating minimization procedure enjoys the same performance guarantee as the MLE, up to universal constants. Finally, we show that in high-dimensional settings as well, the alternating minimization procedure leads to significantly more accurate solutions than the corresponding OLS solutions, with an error bound that depends only logarithmically on the data dimensionality.