Authors:
(1) Wenxuan Wang, The Chinese University of Hong Kong, Hong Kong, China;
(2) Haonan Bai, The Chinese University of Hong Kong, Hong Kong, China;
(3) Jen-tse Huang, The Chinese University of Hong Kong, Hong Kong, China;
(4) Yuxuan Wan, The Chinese University of Hong Kong, Hong Kong, China;
(5) Youliang Yuan, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China;
(6) Haoyi Qiu, University of California, Los Angeles, Los Angeles, USA;
(7) Nanyun Peng, University of California, Los Angeles, Los Angeles, USA;
(8) Michael Lyu, The Chinese University of Hong Kong, Hong Kong, China.

Table of Links

Abstract
1 Introduction
2 Background
3 Approach and Implementation
3.1 Seed Image Collection and 3.2 Neutral Prompt List Collection
3.3 Image Generation and 3.4 Properties Assessment
3.5 Bias Evaluation
4 Evaluation
4.1 Experimental Setup
4.2 RQ1: Effectiveness of BiasPainter
4.3 RQ2 - Validity of Identified Biases
4.4 RQ3 - Bias Mitigation
5 Threats to Validity
6 Related Work
7 Conclusion, Data Availability, and References

4.2 RQ1: Effectiveness of BiasPainter

In this RQ, we investigate whether BiasPainter can effectively trigger and measure social bias in image generation models.

Image Bias. We input the (seed image, prompt) pairs and let image generation software products and models edit the seed image under different prompts. Then, we use the (seed image, generated image) pairs to evaluate the bias in the generated images. In particular, we use BiasPainter to calculate the image bias scores, and we find a large number of generated images that are highly biased. We show some examples in Figure 1.

Word Bias. We use BiasPainter to calculate the word bias score for each prompt based on the image bias scores. For each model and each topic, Table 3 lists the top three prompt words that are most biased with respect to gender, age and race, respectively. BiasPainter can provide insights into what biases a model has, and to what extent. For example, for the bias of personality words on gender, words like brave, loyal, patient, friendly and sympathetic tend to convert male to female, while words like arrogant, selfish, clumsy, grumpy and rude tend to convert female to male. For profession, words like secretary, nurse, cleaner and receptionist tend to convert male to female, while entrepreneur, CEO, lawyer and president tend to convert female to male. For activity, words like cooking, knitting, washing and sewing tend to convert male to female, while words like fighting, thinking and drinking tend to convert female to male.

In addition, BiasPainter can visualize the distribution of the word bias scores for all the prompt words. For example, we use BiasPainter to visualize the distribution of word bias scores for profession words in Stable Diffusion 1.5. As shown in Figure 5, the model is biased toward younger rather than older ages, and toward lighter rather than darker skin tones.

Model Bias. BiasPainter can also calculate model bias scores to evaluate the fairness of each image generation model. Table 4 shows the results, from which we can see that different models are biased to different degrees and on different topics. For example, Stable Diffusion 2.1 is the most biased model on age, and Pix2pix shows less bias on age and gender.
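The aggregation described in this section can be illustrated with a minimal sketch. It rests on assumptions that this section does not state: that a word bias score is the mean of the signed image bias scores for that prompt word, and that a model bias score is the mean absolute word bias score. The function names, the sign convention, and the sample values below are hypothetical and are not results from the paper.

```python
from collections import defaultdict
from statistics import mean


def word_bias_scores(image_scores):
    """Average signed image bias scores per prompt word (assumed aggregation)."""
    grouped = defaultdict(list)
    for word, score in image_scores:
        grouped[word].append(score)
    return {word: mean(scores) for word, scores in grouped.items()}


def top_biased_words(word_scores, k=3):
    """Return the k words with the largest absolute word bias score."""
    ranked = sorted(word_scores, key=lambda w: abs(word_scores[w]), reverse=True)
    return ranked[:k]


def model_bias_score(word_scores):
    """Mean absolute word bias score over all prompt words (assumed aggregation)."""
    return mean(abs(score) for score in word_scores.values())


# Hypothetical gender image bias scores for one model, one per (seed image, prompt) pair.
# Assumed sign convention: positive = shift toward female, negative = toward male.
sample = [
    ("nurse", 1.0), ("nurse", 0.5),
    ("CEO", -1.0), ("CEO", -1.0),
    ("teacher", 0.0), ("teacher", 0.5),
    ("cleaner", 0.5), ("cleaner", 0.5),
]

scores = word_bias_scores(sample)
print(top_biased_words(scores))
print(model_bias_score(scores))
```

Ranking by absolute value keeps strongly biased words in either direction at the top, which matches the way the section reports both male-to-female and female-to-male words.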
This paper is available on arxiv under CC0 1.0 DEED license.

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

How Well Does BiasPainter Uncover Hidden Biases in Image Generation?
