A supervised machine learning algorithm fits a model to a learning sample, and that model is then used to predict new observations. To this end, it aggregates the individual characteristics of the observations in the learning sample. This aggregation, however, does not account for potential selection on unobservables or for status quo biases that may be present in the training sample. The latter bias has raised concerns about the so-called fairness of machine learning algorithms, especially towards disadvantaged groups. In this chapter, we review the issue of fairness in machine learning through the lens of structural econometric models in which the unknown index is the solution of a functional equation and issues of endogeneity are explicitly taken into account. We model fairness as a linear operator whose null space contains the set of strictly fair indexes. A fair solution is obtained either by projecting the unconstrained index onto the null space of this operator or by directly finding, within this null space, the closest solution of the functional equation. We also acknowledge that policymakers may incur costs when moving away from the status quo. Approximate fairness is thus introduced as an intermediate set-up between the status quo and a fully fair solution, via a fairness-specific penalty in the objective function of the learning model.
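As a rough finite-sample illustration of the projection and penalty ideas in the abstract, the sketch below uses a deliberately simple, hypothetical fairness operator: the sample covariance between an index `g` and a sensitive attribute `s`. This operator, the function names, and the toy data are assumptions made for illustration only; they are not the chapter's estimator, which works with a functional equation and handles endogeneity. Here the penalty is applied to the distance from an already fitted index `g0`, which is a simplification of a penalty placed in the learning objective.

```python
def center(v):
    """Subtract the sample mean from every entry of v."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def fairness_operator(g, s):
    """Hypothetical linear operator F: sample covariance of index g with attribute s.

    An index is treated as 'strictly fair' in this toy setting when F(g) == 0.
    """
    return dot(g, center(s)) / len(g)


def project_fair(g0, s):
    """Orthogonal projection of g0 onto the null space of F (fully fair index)."""
    c = center(s)
    coef = dot(c, g0) / dot(c, c)
    return [gi - coef * ci for gi, ci in zip(g0, c)]


def approx_fair(g0, s, rho):
    """Closed-form minimiser of ||g - g0||^2 + rho * F(g)^2.

    rho = 0 returns the status quo g0; rho -> infinity approaches project_fair.
    """
    c = [x / len(s) for x in center(s)]  # F(g) equals dot(c, g)
    shrink = rho * dot(c, g0) / (1.0 + rho * dot(c, c))
    return [gi - shrink * ci for gi, ci in zip(g0, c)]


# Toy data (assumed): an unconstrained index and a binary sensitive attribute.
g0 = [1.0, 2.0, 3.0, 4.0]
s = [0.0, 0.0, 1.0, 1.0]

g_fair = project_fair(g0, s)        # fully fair index
g_mid = approx_fair(g0, s, 16.0)    # intermediate, penalised index
```

In this toy setting the penalty weight `rho` traces a path from the status quo (`rho = 0`) to the projected, fully fair index (large `rho`), which mirrors the role the abstract assigns to approximate fairness.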
Samuele Centorrino, Jean-Pierre Florens and Jean-Michel Loubes, "Fairness in Machine Learning and Econometrics", in Econometrics with Machine Learning, edited by Felix Chan and Laszlo Matyas, Springer, chapter 7, 2022, pp. 217–250.