Algorithmic bias refers to discrimination caused by algorithms with respect to protected attributes such as gender and race. Even if a protected attribute that induces unfairness is excluded from the input data, bias can still arise through proxy discrimination, i.e., the statistical dependency between other attributes and the protected attribute. Several methods have been devised to reduce such bias, but identifying its cause has not yet been fully explored. In this paper, learning a non-discriminatory representation is formulated as a dual-objective optimization problem: encoding the data while obfuscating information about the protected attributes in the representation, by exploiting an unbiased information bottleneck. An encoder learns the data representation, and a discriminator judges whether the representation still contains information about the protected attributes. The two are trained simultaneously in an adversarial fashion to achieve a fair representation. Moreover, algorithmic bias is analyzed in terms of the bias-variance dilemma to reveal its cause, showing both in theory and in experiments that the proposed method effectively reduces algorithmic bias. Experiments on well-known benchmark datasets such as Adult, Census, and COMPAS demonstrate the efficacy of the proposed method compared with conventional techniques. Our method not only reduces the bias but also yields a latent representation that can be reused by other classifiers (i.e., once a fair representation is learned, it can serve various classifiers). We illustrate this by applying the representation to conventional machine learning models and visualizing it with the t-SNE algorithm.
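The adversarial formulation described above can be sketched as a min-max objective. This is a hedged illustration, not the paper's exact notation: the encoder enc_θ, task predictor f, discriminator d_φ, losses ℓ_task and ℓ_adv, protected attribute s, and trade-off weight λ are assumed names for exposition.

```latex
\min_{\theta}\;\max_{\phi}\;
\mathbb{E}_{(x,\,y,\,s)}\Big[
  \ell_{\mathrm{task}}\big(f(\mathrm{enc}_{\theta}(x)),\, y\big)
  \;-\; \lambda\,\ell_{\mathrm{adv}}\big(d_{\phi}(\mathrm{enc}_{\theta}(x)),\, s\big)
\Big]
```

Read this way, the discriminator d_φ tries to recover the protected attribute s from the representation (driving ℓ_adv down), while the encoder enc_θ is trained so that the representation supports the prediction of y yet fools the discriminator, which obfuscates information about s.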
CEUR Workshop Proceedings
Published - 2020
2020 Workshop on Artificial Intelligence Safety, SafeAI 2020 - New York, United States
Duration: 2020 Feb 7 → …
Bibliographical note: Publisher Copyright:
© 2020 for this paper by its authors.
All Science Journal Classification (ASJC) codes
- General Computer Science