NewsUnfold Tests a Crowdsourced Approach to Detecting Media Bias

Written by mediabias | Published 2025/12/02
Tech Story Tags: media-bias | media-bias-detection | human-in-the-loop | linguistic-bias | ai-news-analysis | automated-media-analysis | crowdsourced-annotations | newsunfold

TL;DR: NewsUnfold introduces a reader-driven platform that highlights potentially biased sentences, gathers user feedback, evaluates data quality, and tests how crowdsourced labels improve AI media-bias classifiers, while also assessing user experience across its article-reading workflow.

  1. Abstract and Introduction
  2. Related Work
  3. Feedback Mechanisms
  4. The NewsUnfold Platform
  5. Results
  6. Discussion
  7. Conclusion
  8. Acknowledgments and References

A. Feedback Mechanism Study Texts

B. Detailed UX Survey Results for NewsUnfold

C. Material Bias and Demographics of Feedback Mechanism Study

D. Additional Screenshots

4 The NewsUnfold Platform

Tailored toward news readers, NewsUnfold highlights potentially biased sentences in articles (1 in Figure 4) and incorporates the Highlights feedback module (2 in Figure 4) assessed in Section 3 to create a comprehensive, cost-effective, crowdsourced dataset through reader feedback. The feedback mechanism additionally includes a free-text field (Figure 4) where readers can justify their feedback.
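For concreteness, each reader interaction can be pictured as a small record pairing a per-sentence bias judgment with an optional justification. The sketch below is a hypothetical schema in Python; the paper does not specify how NewsUnfold stores feedback internally, so every field name here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SentenceFeedback:
    """One reader's feedback on one highlighted sentence (hypothetical schema)."""
    article_id: str                      # article containing the sentence
    sentence_id: int                     # position of the sentence in the article
    annotator_id: str                    # anonymous reader identifier
    marked_biased: bool                  # True if the reader flags the sentence as biased
    justification: Optional[str] = None  # optional free-text explanation
```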

Application Design

NewsUnfold’s responsive design draws inspiration from news aggregation platforms,[10] aiming to create an environment that users, given regularly updated content, return to. By clarifying the purpose of our research and the societal importance of media bias, and by giving access to automated bias classification, we encourage voluntary feedback contributions.

The landing page states NewsUnfold’s mission: encouraging bias-aware reading and collecting feedback to refine bias detection and mitigate its negative effects. To further motivate contributions, it emphasizes the value of individual users’ feedback in enhancing bias-detection capabilities. Clicking a call-to-action button guides users to the Article Overview Page (Figure 14). As a preliminary stage, this page displays 12 static articles spanning nine subjects, balanced by amount of bias and political orientation. The variety of articles enables readers to compare the amount of bias from one article to another. Selecting an article directs users to NewsUnfold’s Article Reading Page, which integrates the bias highlights and feedback mechanism; Table 2 outlines its essential components. Sparkles highlight controversial sentences or sentences that received the least feedback, enabling balanced feedback collection (Figure 4). From the Article Overview Page (Figure 14), users can additionally initiate a tutorial guiding them through the bias highlights (1 in Figure 4) and the feedback mechanism (2 in Figure 4), concluding with a pointer to the UX survey (5 in Figure 16). After each article, we show three recommended articles (Figure 16).
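One plausible reading of this sparkle-selection rule, sketched in Python: prefer sentences whose vote split is still contested, then top up with the least-annotated ones. The controversy band, the cap `k`, and all names below are our assumptions; the paper does not give the exact rule.

```python
def pick_sparkle_sentences(vote_counts, biased_ratios, k=3, band=(0.4, 0.6)):
    """Select up to k sentences to mark with a sparkle.

    vote_counts:   sentence_id -> number of votes received so far
    biased_ratios: sentence_id -> share of votes marking the sentence biased
    A sentence qualifies if its vote split is contested (ratio inside `band`);
    remaining slots go to the sentences with the fewest votes so far.
    """
    lo, hi = band
    contested = [s for s, r in biased_ratios.items() if lo <= r <= hi]
    least_voted = sorted(vote_counts, key=vote_counts.get)
    picked = []
    for s in contested + least_voted:
        if s not in picked:
            picked.append(s)
        if len(picked) == k:
            break
    return picked
```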

Study Design

The study pursues four goals:

1. Engagement: Measure the amount of voluntary feedback from readers without monetary incentives.

2. Data Quality: Assess the quality of the collected feedback.

3. Classifier: Investigate classifier performance when integrating feedback-generated labels.

4. User Experience: Evaluate the user experience and perception of NewsUnfold, focusing on the bias highlights (1 in Figure 4) and the feedback mechanism (2 in Figure 4) for a user-centered design approach.

During the study, readers can freely explore the platform, select articles, decide whether to provide anonymous feedback, and choose to participate in the UX survey. Unlike the preliminary study, participants are not sourced from crowdworking platforms but reached via LinkedIn, Instagram, and university boards. The outreach briefly introduces NewsUnfold with a link to its landing page. Readers are informed of feedback data collection beforehand.

To understand readers’ experiences, a voluntary UX survey (5 in Figure 16) is available after reading an article.[11] In this study, we prioritize identifying UX issues among readers to boost participation and feedback efficiency, focusing on UX-oriented data collection over comprehensive quantitative analysis. To obtain user analytics, we use Umami,[12] a privacy-centric tool that logs the number of clicks, unique visitors, country, language settings, device types, most-visited pages, and the number of tutorial initiations while preserving visitor anonymity.

To establish sentence labels, we aggregate reader feedback through a voting system with a minimum of five votes per sentence. The labels are stored in the same structure as BABE (Spinde et al. 2021b) to enable merging of the two datasets. We apply a spam detection method by Raykar and Yu (2011) to filter out unreliable annotations: we calculate a score between 0 and 1 for each annotator and eliminate annotators in the 0.05th percentile. We assess the quality of the resulting dataset, similar to Section 3, using the inter-annotator agreement (IAA) metric Krippendorff’s α and manual analysis.
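A minimal sketch of this labeling pipeline, under stated assumptions: the tie-break in the vote aggregation and the shape of the reliability scores are our simplifications (the paper uses the actual Raykar and Yu (2011) spam score), while the α computation relies on the open-source `krippendorff` package.

```python
import numpy as np
import krippendorff  # pip install krippendorff

def aggregate_labels(votes, min_votes=5):
    """votes: sentence_id -> list of 0/1 bias judgments.
    Keep a sentence once it has at least min_votes votes, then take the
    more frequent label (ties resolved toward 'biased' -- an assumption)."""
    return {sid: int(2 * sum(vs) >= len(vs))
            for sid, vs in votes.items() if len(vs) >= min_votes}

def keep_reliable_annotators(scores, percentile=0.05):
    """scores: annotator_id -> reliability score in [0, 1], e.g. from the
    Raykar and Yu (2011) spam detection method. Annotators at or below
    the given percentile of the score distribution are dropped."""
    cutoff = np.percentile(list(scores.values()), percentile)
    return {a for a, s in scores.items() if s > cutoff}

def krippendorff_alpha(reliability_data):
    """reliability_data: annotators x sentences matrix of 0/1 labels,
    with np.nan wherever an annotator did not vote on a sentence."""
    return krippendorff.alpha(reliability_data=reliability_data,
                              level_of_measurement="nominal")
```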

As human-in-the-loop (HITL) systems center on iteratively improving machine performance through user input, we evaluate the integration of feedback data into classifier training. The training process adopts the hyperparameter configuration from Spinde et al. (2021b) with a pre-trained model from Hugging Face.[13] We train and evaluate the model on the NUDA data added to the 3,700 BABE sentences and compare it against the baseline classifier (Spinde et al. 2021b) using the F1 score (Powers 2008).
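As an illustrative sketch of this training step with the Hugging Face transformers library: the checkpoint below is the one the paper cites in [13], but the hyperparameters shown are placeholders (the paper adopts the configuration of Spinde et al. 2021b), and the inline two-sentence dataset merely stands in for BABE plus the NUDA labels.

```python
from datasets import Dataset
from sklearn.metrics import f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "mediabiasgroup/DA-RoBERTa-BABE"  # pre-trained checkpoint cited in [13]
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# Stand-in data: in practice, load the 3,700 BABE sentences plus the NUDA
# feedback labels, both stored in the same {"text", "label"} structure.
train_rows = [{"text": "The senator bravely defended the failing bill.", "label": 1},
              {"text": "The bill passed with 52 votes.", "label": 0}]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128,
                     padding="max_length")

train_ds = Dataset.from_list(train_rows).map(tokenize, batched=True)
eval_ds = train_ds  # placeholder; use a held-out split in practice

def compute_metrics(pred):
    return {"f1": f1_score(pred.label_ids, pred.predictions.argmax(-1))}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # reports the F1 score used for the comparison
```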

Authors:

(1) Smi Hinterreiter;

(2) Martin Wessel;

(3) Fabian Schliski;

(4) Isao Echizen;

(5) Marc Erich Latoschik;

(6) Timo Spinde.


This paper is available on arXiv under a CC0 1.0 license.

[10] E.g., Google News (https://news.google.com).

[11] The survey consists of 9 questions: two scales and eight optional open-ended queries. Appendix B contains a detailed breakdown of the survey and its results.

[12] https://umami.is

[13] https://huggingface.co/mediabiasgroup/DA-RoBERTa-BABE

