
How Well Does BiasPainter Uncover Hidden Biases in Image Generation?


Too Long; Didn't Read

BiasPainter effectively detects social bias in image generation models by calculating image bias scores and analyzing word bias. Results show varied bias levels across models, with insights into gender, age, and race biases. For instance, Stable Diffusion 2.1 exhibits notable age bias, while InstructPix2Pix shows less bias overall.

Authors:

(1) Wenxuan Wang, The Chinese University of Hong Kong, Hong Kong, China;

(2) Haonan Bai, The Chinese University of Hong Kong, Hong Kong, China;

(3) Jen-tse Huang, The Chinese University of Hong Kong, Hong Kong, China;

(4) Yuxuan Wan, The Chinese University of Hong Kong, Hong Kong, China;

(5) Youliang Yuan, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China;

(6) Haoyi Qiu, University of California, Los Angeles, Los Angeles, USA;

(7) Nanyun Peng, University of California, Los Angeles, Los Angeles, USA;

(8) Michael Lyu, The Chinese University of Hong Kong, Hong Kong, China.

Abstract

1 Introduction

2 Background

3 Approach and Implementation

3.1 Seed Image Collection and 3.2 Neutral Prompt List Collection

3.3 Image Generation and 3.4 Properties Assessment

3.5 Bias Evaluation

4 Evaluation

4.1 Experimental Setup

4.2 RQ1: Effectiveness of BiasPainter

4.3 RQ2: Validity of Identified Biases

4.4 RQ3: Bias Mitigation

5 Threats to Validity

6 Related Work

7 Conclusion, Data Availability, and References

4.2 RQ1: Effectiveness of BiasPainter

In this RQ, we investigate whether BiasPainter can effectively trigger and measure social bias in image generation models.


Image Bias. We feed the (seed image, prompt) pairs to the image generation software products and models, which edit each seed image under the different prompts. We then use the resulting (seed image, generated image) pairs to evaluate bias in the generated images. In particular, we adopt BiasPainter to calculate image bias scores and find a large number of highly biased generated images; Figure 1 shows some examples.
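To make the scoring step concrete, below is a minimal sketch of how a per-image bias score could be derived from the property-assessment outputs. The attribute keys, scales, tolerances, and sign conventions are illustrative assumptions rather than BiasPainter's exact implementation:

```python
from typing import Dict

def image_bias_scores(seed_attrs: Dict, gen_attrs: Dict,
                      age_tol: float = 10.0,
                      tone_tol: float = 0.5) -> Dict[str, int]:
    """Signed per-image bias scores along gender, age, and race.

    `seed_attrs` / `gen_attrs` hold the outputs of external
    property-assessment models for the seed and generated image,
    e.g. {"gender": "male", "age": 34.0, "skin_tone": 3.2}.
    A score of 0 means the edit left the attribute unchanged.
    """
    scores = {}

    # Gender: +1 if the edit flips male -> female, -1 for the reverse.
    if seed_attrs["gender"] == gen_attrs["gender"]:
        scores["gender"] = 0
    else:
        scores["gender"] = 1 if gen_attrs["gender"] == "female" else -1

    # Age: only count shifts beyond a tolerance (in years);
    # +1 means the subject was made older, -1 younger.
    d_age = gen_attrs["age"] - seed_attrs["age"]
    scores["age"] = 0 if abs(d_age) <= age_tol else (1 if d_age > 0 else -1)

    # Skin tone: +1 for a noticeably lighter tone, -1 for darker,
    # assuming a numeric scale where larger values are lighter.
    d_tone = gen_attrs["skin_tone"] - seed_attrs["skin_tone"]
    scores["race"] = 0 if abs(d_tone) <= tone_tol else (1 if d_tone > 0 else -1)

    return scores
```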


Word Bias. We adopt BiasPainter to calculate a word bias score for each prompt based on the image bias scores. For each model and each topic, Table 3 lists the top three prompt words that are most biased with respect to gender, age, and race, respectively. BiasPainter can thus provide insight into which biases a model has and to what extent. For example, regarding the gender bias of personality words, words like brave, loyal, patient, friendly, and sympathetic tend to convert male to female, while words like arrogant, selfish, clumsy, grumpy, and rude tend to convert female to male. For professions, words like secretary, nurse, cleaner, and receptionist tend to convert male to female, while entrepreneur, CEO, lawyer, and president tend to convert female to male. For activities, words like cooking, knitting, washing, and sewing tend to convert male to female, while words like fighting, thinking, and drinking tend to convert female to male.
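A word bias score can then be obtained by averaging the signed image bias scores collected for that prompt word over all seed images. Aggregating by a plain mean is an assumption here, and the toy data is invented:

```python
from statistics import mean
from typing import Dict, List

def word_bias_score(image_scores: List[Dict[str, int]],
                    dimension: str) -> float:
    """Average the signed image bias scores for one prompt word along one
    dimension ("gender", "age", or "race"). Values near 0 mean the word
    rarely triggers a change; values near +1 or -1 mean the edit shifts
    the attribute consistently in one direction."""
    return mean(s[dimension] for s in image_scores)

# Toy usage: gender scores for the word "nurse" over four seed images,
# using the +1 = male -> female convention from the sketch above.
nurse_scores = [{"gender": 1}, {"gender": 1}, {"gender": 0}, {"gender": 1}]
print(word_bias_score(nurse_scores, "gender"))  # 0.75: tends toward female
```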


In addition, BiasPainter can visualize the distribution of word bias scores across all the prompt words. For example, we use BiasPainter to visualize the distribution of word bias scores for profession words in Stable Diffusion 1.5. As shown in Figure 5, the model is more biased toward younger rather than older people, and toward lighter rather than darker skin tones.


Figure 5: Visualization of Profession Word Bias Scores in Stable Diffusion 1.5
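A figure in the spirit of Figure 5 can be reproduced in a few lines of matplotlib by placing each profession word according to its (age, skin-tone) bias score pair. The scores below are fabricated purely to illustrate the plotting recipe:

```python
import matplotlib.pyplot as plt

# Hypothetical (age_score, tone_score) pairs per profession word;
# positive age = older, positive tone = lighter (conventions assumed above).
word_scores = {
    "nurse": (-0.40, 0.20),
    "CEO": (0.10, 0.50),
    "cleaner": (-0.20, -0.30),
    "lawyer": (0.30, 0.45),
}

ages = [a for a, _ in word_scores.values()]
tones = [t for _, t in word_scores.values()]

fig, ax = plt.subplots()
ax.scatter(ages, tones)
for word, (a, t) in word_scores.items():
    ax.annotate(word, (a, t))            # label each point with its word
ax.axhline(0, color="grey", lw=0.5)      # neutral lines: no attribute shift
ax.axvline(0, color="grey", lw=0.5)
ax.set_xlabel("age bias score (older →)")
ax.set_ylabel("skin-tone bias score (lighter →)")
ax.set_title("Profession word bias scores (illustrative data)")
plt.show()
```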


Model Bias. BiasPainter can also calculate model bias scores to evaluate the fairness of each image generation model. Table 4 shows the results, from which we can see that different models are biased to different degrees and on different topics. For example, Stable Diffusion 2.1 is the most biased model on age, while InstructPix2Pix shows less bias on age and gender.
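One plausible way to roll the word-level scores up into a per-model number is to average their absolute values, so that biases in opposite directions do not cancel out; treating this mean of absolutes as BiasPainter's exact aggregation is an assumption:

```python
from typing import Dict

def model_bias_score(word_scores: Dict[str, float]) -> float:
    """Aggregate word bias scores into one fairness number per model and
    per dimension. Absolute values are used so that, e.g., a model that
    makes "nurse" female and "CEO" male still registers as biased."""
    return sum(abs(v) for v in word_scores.values()) / len(word_scores)

# Toy usage with invented gender scores for three profession words:
print(model_bias_score({"nurse": 0.75, "CEO": -0.60, "cleaner": 0.30}))  # 0.55
```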




This paper is available on arXiv under the CC0 1.0 DEED license.