We establish an excess risk bound of $\tilde{\mathcal{O}}\!\left( H R_n^2 + \sqrt{H L^*}\, R_n \right)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class. For typical hypothesis classes where $R_n = \sqrt{R/n}$, this translates to a learning rate of $\tilde{\mathcal{O}}(RH/n)$ in the separable case and $\tilde{\mathcal{O}}\!\left( \sqrt{RH L^*/n} + RH/n \right)$ more generally. We also provide similar guarantees for online and stochastic convex optimization of a smooth non-negative objective.
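To make the translation between the two rates explicit, here is a short sketch of the substitution, assuming the complexity form $R_n = \sqrt{R/n}$ stated above and ignoring logarithmic factors:
\[
H R_n^2 + \sqrt{H L^*}\, R_n
\;=\; \frac{HR}{n} + \sqrt{\frac{H R L^*}{n}} .
\]
When $L^* = 0$ (the separable case), only the first term survives, giving the fast $RH/n$ rate; when $L^* > 0$, the $\sqrt{RH L^*/n}$ term dominates for large $n$, recovering the general rate.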