Authors:
(1) Wenxuan Wang, The Chinese University of Hong Kong, Hong Kong, China;
(2) Haonan Bai, The Chinese University of Hong Kong, Hong Kong, China;
(3) Jen-tse Huang, The Chinese University of Hong Kong, Hong Kong, China;
(4) Yuxuan Wan, The Chinese University of Hong Kong, Hong Kong, China;
(5) Youliang Yuan, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China;
(6) Haoyi Qiu, University of California, Los Angeles, Los Angeles, USA;
(7) Nanyun Peng, University of California, Los Angeles, Los Angeles, USA;
(8) Michael Lyu, The Chinese University of Hong Kong, Hong Kong, China.
4.3 RQ2: Validity of Identified Biases
In this RQ, we investigate, via manual inspection, whether the biased behaviors exposed by BiasPainter are valid.
The part of BiasPainter most prone to error is bias identification, where several AI methods and APIs are used to evaluate changes in race, gender, and age. To ensure that the social biases detected by BiasPainter are genuine, we perform a manual inspection of the bias identification process. Specifically, we recruit two annotators, both of whom hold a bachelor's degree and are proficient in English, to annotate the (seed image, generated image) pairs.
For age, we randomly select 10, 10, and 20 (seed image, generated image) pairs identified by BiasPainter as becoming older (image age bias score > 1), becoming younger (image age bias score < -1), and showing no significant change in age (-0.2 < image age bias score < 0.2), respectively. For each pair, annotators answer a multiple-choice question: A. person 2 is older than person 1; B. person 2 is younger than person 1; C. there is no significant difference in age between person 2 and person 1.
For gender, we randomly select 10, 10, and 20 (seed image, generated image) pairs identified by BiasPainter as female-to-male (image gender bias score = -1), male-to-female (image gender bias score = 1), and no change in gender (image gender bias score = 0), respectively. For each pair, annotators answer a multiple-choice question: A. person 1 is male and person 2 is male; B. person 1 is male and person 2 is female; C. person 1 is female and person 2 is male; D. person 1 is female and person 2 is female.
For race, we randomly select 10, 10, and 20 (seed image, generated image) pairs identified by BiasPainter as becoming lighter (image race bias score > 1), becoming darker (image race bias score < -1), and showing no significant change in skin tone (-0.2 < image race bias score < 0.2), respectively. For each pair, annotators answer a multiple-choice question: A. the skin tone of person 2 is lighter than that of person 1; B. the skin tone of person 2 is darker than that of person 1; C. there is no significant difference in skin tone between person 2 and person 1.
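To make the sampling protocol concrete, the following Python sketch selects validation pairs for each dimension by thresholding the bias scores as described above. The data layout, field names, and the `sample_for_validation` helper are our own illustrative assumptions; the paper does not specify BiasPainter's internal data structures.

```python
import random

# Hypothetical record layout for a (seed image, generated image) pair and the
# three bias scores BiasPainter assigns to it. The field names and synthetic
# values are illustrative only.
pairs = [
    {"seed": f"seed_{i}.png", "generated": f"gen_{i}.png",
     "age_score": random.uniform(-2, 2),
     "gender_score": random.choice([-1, 0, 1]),
     "race_score": random.uniform(-2, 2)}
    for i in range(500)
]

def sample_for_validation(pairs, key, is_positive, is_negative, is_neutral,
                          n_biased=10, n_neutral=20):
    """Draw 10 + 10 + 20 pairs for one dimension, mirroring the protocol above."""
    pos = [p for p in pairs if is_positive(p[key])]
    neg = [p for p in pairs if is_negative(p[key])]
    mid = [p for p in pairs if is_neutral(p[key])]
    return (random.sample(pos, n_biased),
            random.sample(neg, n_biased),
            random.sample(mid, n_neutral))

# Age: older (score > 1), younger (score < -1), no change (-0.2 < score < 0.2).
older, younger, same_age = sample_for_validation(
    pairs, "age_score",
    is_positive=lambda s: s > 1,
    is_negative=lambda s: s < -1,
    is_neutral=lambda s: -0.2 < s < 0.2,
)

# Gender: male-to-female (score = 1), female-to-male (score = -1), no change (score = 0).
m2f, f2m, same_gender = sample_for_validation(
    pairs, "gender_score",
    is_positive=lambda s: s == 1,
    is_negative=lambda s: s == -1,
    is_neutral=lambda s: s == 0,
)

# Race: lighter (score > 1), darker (score < -1), no change (-0.2 < score < 0.2).
lighter, darker, same_tone = sample_for_validation(
    pairs, "race_score",
    is_positive=lambda s: s > 1,
    is_negative=lambda s: s < -1,
    is_neutral=lambda s: -0.2 < s < 0.2,
)
```

Under this scheme, each dimension contributes 40 pairs (10 + 10 + 20), giving 120 annotated pairs in total across age, gender, and race.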
The annotators first label the pairs independently; they then discuss the results and resolve any disagreements to produce a consensus annotation. By comparing BiasPainter's identification results against the consensus annotations, we calculate its accuracy. BiasPainter achieves an accuracy of 90.8%, indicating that its bias identification results are reliable.
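The accuracy computation itself reduces to comparing two label sequences. Below is a minimal sketch, assuming BiasPainter's outputs and the consensus annotations are mapped to the same categorical labels; the category codes and values shown are toy data, not the paper's results.

```python
# Consensus human labels vs. BiasPainter's automatic labels for the sampled
# pairs; the values below are purely illustrative.
human_labels = ["older", "younger", "no_change", "older", "no_change"]
tool_labels  = ["older", "younger", "no_change", "no_change", "no_change"]

matches = sum(h == t for h, t in zip(human_labels, tool_labels))
accuracy = matches / len(human_labels)
print(f"BiasPainter accuracy: {accuracy:.1%}")  # 80.0% on this toy data
```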
This paper is available on arXiv under the CC0 1.0 DEED license.