3 Preliminaries
3.1 Fair Supervised Learning
3.2 Fairness Criteria
3.3 Dependence Measures for Fair Supervised Learning
4 Inductive Biases of DP-based Fair Supervised Learning
4.1 Extending the Theoretical Results to Randomized Prediction Rule
5 A Distributionally Robust Optimization Approach to DP-based Fair Learning
6 Numerical Results
6.2 Inductive Biases of Models trained in DP-based Fair Learning
6.3 DP-based Fair Classification in Heterogeneous Federated Learning
Appendix B Additional Results for Image Dataset
Fair supervised learning algorithms that assign labels with little dependence on a sensitive attribute have attracted great attention in the machine learning community. While the demographic parity (DP) notion has been frequently used to measure a model's fairness when training fair classifiers, several studies in the literature suggest potential impacts of enforcing DP in fair learning algorithms. In this work, we analytically study the effect of standard DP-based regularization methods on the conditional distribution of the predicted label given the sensitive attribute. Our analysis shows that an imbalanced training dataset with a non-uniform distribution of the sensitive attribute could lead to a classification rule biased toward the sensitive attribute outcome holding the majority of the training data. To control such inductive biases in DP-based fair learning, we propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method that improves robustness to the marginal distribution of the sensitive attribute. Finally, we present several numerical results on the application of DP-based learning methods to standard centralized and distributed learning problems. The empirical findings support our theoretical results on the inductive biases in DP-based fair learning algorithms and the debiasing effect of the proposed SA-DRO method.
A responsible deployment of modern machine learning frameworks in high-stakes decision-making tasks requires mechanisms for controlling the dependence of their output on sensitive attributes such as gender and ethnicity. A supervised learning framework with no control over how the prediction depends on the input features could lead to discriminatory decisions that correlate significantly with the sensitive attributes. Due to the critical importance of fairness in many machine learning applications, the study and development of fair statistical learning algorithms have received great attention in the literature.
To reduce the biases of DP-based learning algorithms, we propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method in which the fair learner minimizes the worst-case DP-regularized loss over a set of sensitive attribute marginal distributions centered around the data-based marginal distribution. As a result, the SA-DRO approach can account for different frequencies of the sensitive attribute outcomes and thus offers robust behavior against changes in the sensitive attribute's majority outcome.
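A schematic way to write this minimax objective, under illustrative notation (a parametric classifier f_θ, a loss ℓ, a DDP-style penalty with weight λ, and an ε-ball in some divergence d around the empirical sensitive-attribute marginal), is the sketch below; it conveys the idea rather than the paper's exact formulation.

```latex
% Schematic SA-DRO objective (illustrative notation, not the paper's exact formulation).
% The learner minimizes the worst case of the DP-regularized loss over
% sensitive-attribute marginals q within divergence epsilon of the empirical
% marginal \widehat{P}_S, while the conditional law of (X, Y) given S is kept fixed.
% Requires amsmath/amssymb for \mathbb.
\[
  \min_{\theta}\;
  \max_{q\,:\; d\big(q,\,\widehat{P}_S\big)\le \epsilon}\;
  \mathbb{E}_{S\sim q}\!\left[\,\mathbb{E}_{(X,Y)\mid S}\!\left[\ell\big(f_\theta(X),\,Y\big)\right]\right]
  \;+\;\lambda\,\mathrm{DDP}\big(f_\theta;\,q\big)
\]
```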
We present the results of several numerical experiments on the potential biases of DDP-based fair classification methods toward the sensitive attribute group holding the majority of the dataset. Our empirical findings are consistent with the theoretical results, suggesting that DP-based fair classification rules are biased toward the majority group of the sensitive attribute. On the other hand, our results indicate that the SA-DRO-based fair learning method yields fair classification rules with a lower bias toward the label distribution under the majority sensitive attribute.
Furthermore, to show the impact of such inductive biases in practice, we analyze the fair classification task in a federated learning context where multiple clients attempt to train a decentralized model. We focus on a setting with heterogeneous sensitive attribute distributions across clients, where the clients' majority sensitive attribute outcomes may not agree. Figure 1 illustrates such a federated learning scenario on the Adult dataset, where Client 1's majority sensitive attribute (female samples) differs from the network's majority group (male samples); consequently, Client 1's test accuracy under DP-based fair federated learning is significantly lower than the test accuracy of a localized fair model trained only on Client 1's data. Such numerical results call into question the client's incentive to participate in fair federated learning (a minimal simulation sketch of this heterogeneous setting follows the contribution list below). The following is a summary of this work's main contributions:
• Analytically studying the biases of DP-based fair learning toward the majority sensitive attribute,
• Proposing a distributionally robust optimization method to lower the biases of DP-based fair classification,
• Providing numerical results on the biases of DP-based fair learning in centralized and federated learning scenarios.
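To make the heterogeneous federated setting above concrete, the following is a minimal, self-contained simulation sketch: two synthetic clients whose majority sensitive-attribute groups disagree, each running local logistic-regression updates with a soft DDP penalty, aggregated by plain FedAvg. All data, model, and hyperparameter choices here are illustrative assumptions and not the paper's exact experimental setup (which uses the Adult dataset and the configuration of Figure 1).

```python
# Minimal sketch of fair federated learning with heterogeneous
# sensitive-attribute distributions across clients. Everything here
# (synthetic Gaussian data, logistic model, soft DDP penalty, plain FedAvg)
# is an illustrative assumption, not the paper's exact procedure.
import torch

torch.manual_seed(0)

def make_client(n, p_group1):
    """Synthetic client: sensitive attribute s ~ Bernoulli(p_group1);
    the first feature is shifted by s so an unconstrained model picks up the bias."""
    s = (torch.rand(n) < p_group1).float()
    x = torch.randn(n, 2) + torch.stack([2 * s - 1, torch.zeros(n)], dim=1)
    y = ((x[:, 0] + 0.5 * torch.randn(n)) > 0).float()
    return x, y, s

def ddp_penalty(logits, s):
    """Soft DDP: gap between the two groups' mean predicted positive rates."""
    p = torch.sigmoid(logits)
    return (p[s == 1].mean() - p[s == 0].mean()).abs()

def local_update(w, b, data, lam=2.0, lr=0.1, steps=50):
    """One client's local training: cross-entropy loss plus a DDP regularizer."""
    x, y, s = data
    w = w.clone().requires_grad_(True)
    b = b.clone().requires_grad_(True)
    opt = torch.optim.SGD([w, b], lr=lr)
    for _ in range(steps):
        logits = x @ w + b
        loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
        loss = loss + lam * ddp_penalty(logits, s)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach(), b.detach()

# Client 1's local majority group (p_group1 = 0.2) disagrees with the larger
# Client 2's majority (p_group1 = 0.8), so the network-level majority differs
# from Client 1's local majority.
clients = [make_client(200, p_group1=0.2), make_client(800, p_group1=0.8)]

w, b = torch.zeros(2), torch.zeros(1)
for _ in range(20):  # FedAvg rounds: local updates, then sample-size-weighted averaging
    updates = [local_update(w, b, c) for c in clients]
    sizes = torch.tensor([float(c[0].shape[0]) for c in clients])
    weights = sizes / sizes.sum()
    w = sum(wt * uw for wt, (uw, _) in zip(weights, updates))
    b = sum(wt * ub for wt, (_, ub) in zip(weights, updates))

for i, (x, y, s) in enumerate(clients, 1):
    logits = x @ w + b
    acc = ((logits > 0).float() == y).float().mean().item()
    print(f"client {i}: accuracy {acc:.3f}, DDP {ddp_penalty(logits, s).item():.3f}")
```

Comparing the printed per-client accuracy and DDP against models trained only on each client's local data is one way to probe the participation-incentive question raised above.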
Fairness Violation Metrics. In this work, we focus on learning frameworks aiming toward demographic parity (DP). Since enforcing DP to hold strictly could be costly and damaging to the learner's performance, the machine learning literature has proposed several metrics assessing the dependence between random variables, including: mutual information [3–7], Pearson correlation [8, 9], kernel-based maximum mean discrepancy [10], kernel density estimation of the difference of demographic parity (DDP) measure [11], maximal correlation [12–15], and exponential Renyi mutual information [16]. In our analysis, we mostly focus on a DDP-based fair regularization scheme, while we show that weaker versions of the inductive biases still hold for mutual information and maximal correlation-based fair learning algorithms.
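As a concrete reference point, a minimal sketch of the empirical DDP measure for a binary sensitive attribute and hard predictions is given below; the function and variable names are illustrative and do not correspond to any cited paper's implementation.

```python
# Minimal sketch of the difference-of-demographic-parity (DDP) measure for a
# binary sensitive attribute: the gap between positive-prediction rates of the
# two sensitive groups. Names are illustrative.
import numpy as np

def ddp(y_pred, s):
    """|P(Yhat = 1 | S = 1) - P(Yhat = 1 | S = 0)| from hard 0/1 predictions."""
    y_pred, s = np.asarray(y_pred), np.asarray(s)
    return abs(y_pred[s == 1].mean() - y_pred[s == 0].mean())

# Example: a rule that predicts positively more often for group S = 1.
y_hat = np.array([1, 1, 1, 0, 0, 1, 0, 0])
s     = np.array([1, 1, 1, 1, 0, 0, 0, 0])
print(ddp(y_hat, s))   # 0.5 -> a clear demographic-parity violation
```

A soft version of the same quantity (replacing hard predictions with predicted probabilities) can be added to the training loss, which is the kind of DDP-based regularization scheme referred to above.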
Fair Classification Algorithms. Fair machine learning algorithms can be classified into three main categories: pre-processing, post-processing, and in-processing. Pre-processing algorithms [17–19] transform biased data features into a new space where labels and sensitive attributes are statistically independent. Post-processing methods, such as [2, 20], aim to alleviate the discriminatory impact of a classifier by modifying its final decision. The focus of our work is only on in-processing approaches that regularize the training process toward DP-based fair models. Also, [21–23] propose distributionally robust optimization (DRO) for fair classification; however, unlike our method, these works do not apply DRO to the sensitive attribute distribution to reduce the biases.
This paper is available on arxiv under CC BY-NC-SA 4.0 DEED license.
Authors:
(1) Haoyu LEI, Department of Computer Science and Engineering, The Chinese University of Hong Kong (hylei22@cse.cuhk.edu.hk);
(2) Amin Gohari, Department of Information Engineering, The Chinese University of Hong Kong (agohari@ie.cuhk.edu.hk);
(3) Farzan Farnia, Department of Computer Science and Engineering, The Chinese University of Hong Kong (farnia@cse.cuhk.edu.hk).