Creating a Dataset Sucks. Here's What I've Learned to Make it a Little Bit Easier

by Karmanya Aggarwal (@calmdownkarm), August 16th, 2020 (4 min read)

Too Long; Didn't Read

Creating datasets is hard, even for relatively simple tasks like document classification or sentiment analysis. Even a relatively high agreement score between annotators can still leave 10% of your data wrong. Hiring a set of annotators is hard too, but once you've figured out the hiring, here are some things I've learned from managing annotators. Getting a minimal set of data annotated and trying to build models on it will help you target the specific kinds of annotated data you need, and even bad classifiers can help you filter through a lot of data.
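
To make the agreement-score point concrete, here is a minimal sketch of how two annotators' labels might be compared. The article does not say which agreement metric it has in mind, so raw percent agreement and Cohen's kappa are assumptions chosen for illustration, and the labels below are made up.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label
    # if each labelled at random according to their own label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical sentiment labels from two annotators on the same ten documents.
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
annotator_2 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "pos"]

agreement = percent_agreement(annotator_1, annotator_2)
kappa = cohens_kappa(annotator_1, annotator_2)
disputed = [i for i, (a, b) in enumerate(zip(annotator_1, annotator_2)) if a != b]

print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa: {kappa:.2f}")
print(f"items needing review: {disputed}")
```

In this toy example the annotators agree on nine of ten documents, which looks high, yet one label in ten is still disputed. Listing the disputed indices gives a starting point for a second review pass.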
