Fairness Sample Complexity and the Case for Human Intervention


With the aim of building machine learning systems that incorporate standards of fairness and accountability, we explore explicit subgroup sample complexity bounds. The work is motivated by the observation that classifier predictions on real-world datasets often exhibit drastically different metrics, such as accuracy, when subdivided by specific sensitive-variable subgroups. The reasons for these discrepancies are varied and include, but are not limited to, the influence of mitigating variables, institutional bias, underlying population distributions, and selection bias. Among the numerous definitions of fairness that exist, we argue that at a minimum, principled ML practice should ensure that classification predictions mirror the underlying sub-population distributions as a prelude to bias mitigation, and do not amplify discrepancies due to sampling or selection bias. However, as the number of sensitive variables grows, populations at the intersection of these variables may simply not exist, or may be too small to sample accurately. In these increasingly likely scenarios, a case should be made for human intervention and for applying situational and individual definitions of fairness. In this paper we explore setting Pareto-efficient subgroup sample complexity lower bounds based on the complexity of the ML classifier, using VC dimension and Rademacher complexity. We demonstrate that for a classifier to approach a definition of fairness with respect to specific sensitive variables, adequate subgroup population samples must exist and the model dimensionality must be aligned with the subgroup population distributions. In cases where this is not feasible, we explore human intervention. We examine two commonly studied UCI datasets under this lens.
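For context, the per-subgroup lower bounds discussed above typically build on standard uniform-convergence results. As a hedged sketch (not the paper's specific Pareto-efficient bounds): in the agnostic PAC setting, guaranteeing error within $\epsilon$ of the best hypothesis with probability $1-\delta$ on a subgroup $g$ requires a sample of size on the order of

$$
m_g \;=\; \Omega\!\left(\frac{d + \log(1/\delta)}{\epsilon^2}\right),
$$

where $d$ is the VC dimension of the hypothesis class $\mathcal{H}$. Equivalently, via Rademacher complexity, with probability at least $1-\delta$ over a subgroup sample of size $m_g$,

$$
R_g(h) \;\le\; \widehat{R}_g(h) \;+\; 2\,\mathfrak{R}_{m_g}(\mathcal{H}) \;+\; \sqrt{\frac{\log(1/\delta)}{2 m_g}} \quad \text{for all } h \in \mathcal{H},
$$

where $R_g$ and $\widehat{R}_g$ denote the true and empirical risk restricted to subgroup $g$. Both forms make the abstract's point concrete: if a subgroup's available sample $m_g$ is small relative to the model's capacity ($d$ or $\mathfrak{R}_{m_g}(\mathcal{H})$), no per-subgroup accuracy guarantee is possible.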