There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI² is a recent example of this approach. It combines a derivation from first principles of stochastic optimal control with tools from statistical estimation theory. In this paper, we consider PI² as a member of the wider family of methods that share the concept of probability-weighted averaging to iteratively update parameters and optimize a cost function. We compare PI² to other members of the same family – Cross-Entropy Methods and CMA-ES – at the conceptual level and in terms of performance. The comparison suggests the derivation of a novel algorithm, which we call PI²-CMA for "Path Integral Policy Improvement with Covariance Matrix Adaptation". PI²-CMA's main advantage is that it determines the magnitude of the exploration noise automatically.
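The shared concept of probability-weighted averaging can be illustrated with a minimal sketch, not the paper's implementation: sample parameter vectors from a Gaussian, weight each sample by a softmax over its (negated, normalized) cost, and average. Re-estimating the covariance from the same weighted samples is the covariance-adaptation idea that lets the exploration magnitude adjust automatically. The toy quadratic cost, the temperature `lam`, the sample count, and all other constants here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def cost(theta):
    # Toy quadratic cost with minimum at [1, -2] (an assumption for illustration).
    return np.sum((theta - np.array([1.0, -2.0])) ** 2)


def pw_avg_update(theta, sigma, n_samples=20, lam=0.1):
    """One probability-weighted averaging step (illustrative sketch).

    Samples are weighted by a softmax over min-max-normalized costs
    (lower cost -> higher weight) and averaged. The covariance is
    re-estimated from the same weighted samples, which adapts the
    exploration magnitude automatically.
    """
    samples = rng.multivariate_normal(theta, sigma, size=n_samples)
    costs = np.array([cost(s) for s in samples])
    # Normalize costs before exponentiating for numerical stability.
    z = (costs - costs.min()) / (costs.max() - costs.min() + 1e-12)
    weights = np.exp(-z / lam)
    weights /= weights.sum()
    new_theta = weights @ samples
    # Weighted covariance of the samples around the previous mean.
    diffs = samples - theta
    new_sigma = (weights[:, None, None]
                 * (diffs[:, :, None] * diffs[:, None, :])).sum(axis=0)
    # Regularize to keep the covariance positive definite.
    new_sigma += 1e-6 * np.eye(len(theta))
    return new_theta, new_sigma


theta, sigma = np.zeros(2), np.eye(2)
for _ in range(30):
    theta, sigma = pw_avg_update(theta, sigma)
print(cost(theta))  # the cost should shrink toward 0
```

Note that, unlike gradient-based policy improvement, no derivative of the cost is required: the update uses only cost evaluations, which is what places PI², Cross-Entropy Methods, and CMA-ES in the same family.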