Too Long; Didn't Read
We will aggregate categorical responses using two classical algorithms: Majority Vote and Dawid-Skene. Crowd-Kit is designed to work with Python data science libraries like NumPy, SciPy, and Pandas. We’ll be using the Toloka Aggregation Relevance datasets, which have two categories: relevant and not relevant. The data frame, or df, has three columns: performer, task, and label. The label is 0 if the given performer rated the document as not relevant in the given task, and 1 otherwise.
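
To make the data layout and the idea behind Majority Vote concrete, here is a minimal sketch in plain pandas. It is not Crowd-Kit's implementation: the toy rows, the `majority_vote` helper, and the tie-breaking rule are assumptions made for illustration only. For each task it counts how many performers chose each label and keeps the most frequent one.

```python
import pandas as pd

# Hypothetical toy responses in the described layout (not the Toloka dataset):
# one row per (performer, task) pair, label 1 = relevant, 0 = not relevant.
df = pd.DataFrame({
    "performer": ["p1", "p2", "p3", "p1", "p2", "p3"],
    "task": ["t1", "t1", "t1", "t2", "t2", "t2"],
    "label": [1, 1, 0, 0, 0, 1],
})


def majority_vote(responses: pd.DataFrame) -> pd.Series:
    """Return the most frequent label for each task (illustrative helper)."""
    # Build a task-by-label table of vote counts, with 0 where nobody voted.
    counts = responses.groupby(["task", "label"]).size().unstack(fill_value=0)
    # Pick the label with the most votes; on a tie, idxmax keeps the first
    # column, i.e. the smallest label (an arbitrary choice for this sketch).
    return counts.idxmax(axis=1)


aggregated = majority_vote(df)
print(aggregated)
```

Majority Vote treats every performer as equally reliable. Dawid-Skene instead estimates each performer's error rates and weights their answers accordingly, which is why the two methods can disagree on the same responses.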