Abstract
We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.