Using Language Models to Simulate Human Samples: Appendix
by @textmodels



Too Long; Didn't Read

Datasheets for datasets have gained traction across academic and industry settings, fostering transparency and accountability. While implementation challenges exist, the benefits of improved communication and accountability outweigh the costs, driving adoption and evolution in dataset creation practices.

Authors:

(1) TIMNIT GEBRU, Black in AI;

(2) JAMIE MORGENSTERN, University of Washington;

(3) BRIANA VECCHIONE, Cornell University;

(4) JENNIFER WORTMAN VAUGHAN, Microsoft Research;

(5) HANNA WALLACH, Microsoft Research;

(6) HAL DAUMÉ III, Microsoft Research; University of Maryland;

(7) KATE CRAWFORD, Microsoft Research.

1 Introduction

1.1 Objectives

2 Development Process

3 Questions and Workflow

3.1 Motivation

3.2 Composition

3.3 Collection Process

3.4 Preprocessing/cleaning/labeling

3.5 Uses

3.6 Distribution

3.7 Maintenance

4 Impact and Challenges

Acknowledgments and References

Appendix

A Appendix

In this appendix, we provide an example datasheet for Pang and Lee’s polarity dataset [22] (Figures 1 to 4).


Fig. 1. Example datasheet for Pang and Lee’s polarity dataset [22], page 1.


Fig. 2. Example datasheet for Pang and Lee’s polarity dataset [22], page 2.


Fig. 3. Example datasheet for Pang and Lee’s polarity dataset [22], page 3.


Fig. 4. Example datasheet for Pang and Lee’s polarity dataset [22], page 4.


This paper is available on arXiv under a CC 4.0 license.