Authors:
(1) TIMNIT GEBRU, Black in AI;
(2) JAMIE MORGENSTERN, University of Washington;
(3) BRIANA VECCHIONE, Cornell University;
(4) JENNIFER WORTMAN VAUGHAN, Microsoft Research;
(5) HANNA WALLACH, Microsoft Research;
(6) HAL DAUMÉ III, Microsoft Research; University of Maryland;
(7) KATE CRAWFORD, Microsoft Research.

Table of Links
1 Introduction
1.1 Objectives
2 Development Process
3 Questions and Workflow
3.1 Motivation
3.2 Composition
3.3 Collection Process
3.4 Preprocessing/cleaning/labeling
3.5 Uses
3.6 Distribution
3.7 Maintenance
4 Impact and Challenges
Acknowledgments and References
Appendix

Acknowledgments

We thank Peter Bailey, Emily Bender, Yoshua Bengio, Sarah Bird, Sarah Brown, Steven Bowles, Joy Buolamwini, Amanda Casari, Eric Charran, Alain Couillault, Lukas Dauterman, Leigh Dodds, Miroslav Dudík, Michael Ekstrand, Noémie Elhadad, Michael Golebiewski, Nick Gonsalves, Martin Hansen, Andy Hickl, Michael Hoffman, Scott Hoogerwerf, Eric Horvitz, Mingjing Huang, Surya Kallumadi, Ece Kamar, Krishnaram Kenthapadi, Emre Kiciman, Jacquelyn Krones, Erik Learned-Miller, Lillian Lee, Jochen Leidner, Rob Mauceri, Brian Mcfee, Emily McReynolds, Bogdan Micu, Margaret Mitchell, Sangeeta Mudnal, Brendan O’Connor, Thomas Padilla, Bo Pang, Anjali Parikh, Lisa Peets, Alessandro Perina, Michael Philips, Barton Place, Sudha Rao, Jen Ren, David Van Riper, Anna Roth, Cynthia Rudin, Ben Shneiderman, Biplav Srivastava, Ankur Teredesai, Rachel Thomas, Martin Tomko, Panagiotis Tziachris, Meredith Whittaker, Hans Wolters, Ashly Yeo, Lu Zhang, and the attendees of the Partnership on AI’s April 2019 ABOUT ML workshop for valuable feedback.

References

[1] Don A Andrews, James Bonta, and J Stephen Wormith. 2006. The recent past and near future of risk and/or need assessment. Crime & Delinquency 52, 1 (2006), 7–27.
[2] Emily M. Bender and Batya Friedman. 2018. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Transactions of the Association for Computational Linguistics 6 (2018), 587–604.
[3] Anant P. Bhardwaj, Souvik Bhattacherjee, Amit Chavan, Amol Deshpande, Aaron J. Elmore, Samuel Madden, and Aditya G. Parameswaran. 2014. DataHub: Collaborative Data Science & Dataset Version Management at Scale. CoRR abs/1409.0798 (2014).
[4] Tolga Bolukbasi, Kai-Wei Chang, James Y Zou, Venkatesh Saligrama, and Adam T Kalai. 2016. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Advances in Neural Information Processing Systems (NeurIPS).
[5] Joy Buolamwini and Timnit Gebru. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*). 77–91.
[6] Yang Trista Cao and Hal Daumé. 2020. Toward Gender-Inclusive Coreference Resolution. In Proceedings of the Conference of the Association for Computational Linguistics (ACL). abs/1910.13913.
[7] Yang Trista Cao and Hal Daumé, III. 2020. Toward Gender-Inclusive Coreference Resolution. In Proceedings of the Conference of the Association for Computational Linguistics (ACL).
[8] James Cheney, Laura Chiticariu, and Wang-Chiew Tan. 2009. Provenance in databases: Why, how, and where. Foundations and Trends in Databases 1, 4 (2009), 379–474.
[9] Kasia Chmielinski, Sarah Newman, Matt Taylor, Josh Joseph, Kemi Thomas, Jessica Yurkofsky, and Yue Chelsea Qiu. 2020. The Dataset Nutrition Label (2nd Gen): Leveraging Context to Mitigate Harms in Artificial Intelligence. In NeurIPS Workshop on Dataset Curation and Security.
[10] Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. 2018. QuAC: Question Answering in Context. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.
[11] Glennda Chui. 2017. Project will use AI to prevent or minimize electric grid failures. [Online; accessed 14-March-2018].
[12] Jeffrey Dastin. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
[13] Clare Garvie, Alvaro Bedoya, and Jonathan Frankle. 2016. The Perpetual Line-Up: Unregulated Police Face Recognition in America. Georgetown Law, Center on Privacy & Technology, New Jersey Ave NW, Washington, DC.
[14] Michael Hind, Sameep Mehta, Aleksandra Mojsilovic, Ravi Nair, Karthikeyan Natesan Ramamurthy, Alexandra Olteanu, and Kush R. Varshney. 2018. Increasing Trust in AI Services through Supplier’s Declarations of Conformity. CoRR abs/1808.07261 (2018).
[15] Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miroslav Dudík, and Hanna M. Wallach. 2019. Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need? In 2019 ACM CHI Conference on Human Factors in Computing Systems.
[16] Gary B Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. 2007. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49. University of Massachusetts Amherst.
[17] Ivan Krasin, Tom Duerig, Neil Alldrin, Vittorio Ferrari, Sami Abu-El-Haija, Alina Kuznetsova, Hassan Rom, Jasper Uijlings, Stefan Popov, Shahab Kamali, Matteo Malloci, Jordi Pont-Tuset, Andreas Veit, Serge Belongie, Victor Gomes, Abhinav Gupta, Chen Sun, Gal Chechik, David Cai, Zheyun Feng, Dhyanesh Narayanan, and Kevin Murphy. 2017. OpenImages: A public dataset for large-scale multi-label and multi-class image classification.
[18] Tom CW Lin. 2012. The new investor. UCLA Law Review 60 (2012), 678.
[19] G Mann and C O’Neil. 2016. Hiring Algorithms Are Not Neutral. https://hbr.org/2016/12/hiring-algorithms-are-not-neutral.
[20] Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT*). 220–229.
[21] Mary Catherine O’Connor. 2017. How AI Could Smarten Up Our Water System. [Online; accessed 14-March-2018].
[22] Bo Pang and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. 271.
[23] Ismaïla Seck, Khouloud Dahmane, Pierre Duthon, and Gaëlle Loosli. 2018. Baselines and a datasheet for the Cerema AWP dataset. CoRR abs/1806.04016 (2018). http://arxiv.org/abs/1806.04016
[24] Doha Supply Systems. 2017. Facial Recognition. [Online; accessed 14-March-2018].
[25] World Economic Forum Global Future Council on Human Rights 2016–2018. 2018. How to Prevent Discriminatory Outcomes in Machine Learning. https://www.weforum.org/whitepapers/how-to-prevent-discriminatory-outcomes-in-machine-learning.
[26] Semih Yagcioglu, Aykut Erdem, Erkut Erdem, and Nazli Ikizler-Cinbis. 2018. RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

This paper is available on arXiv under CC 4.0 license.

