A Comparative Algorithm Audit of Conspiracies on the Net: Research Questions

This paper is available on arxiv under CC 4.0 license.

Authors:

(1) Aleksandra Urman (corresponding author), Department of Informatics, University of Zurich, Switzerland;

(2) Mykola Makhortykh, Institute of Communication and Media Studies, University of Bern, Switzerland;

(3) Roberto Ulloa, GESIS - Leibniz-Institut für Sozialwissenschaften, Germany;

(4) Juhi Kulshrestha, Department of Politics and Public Administration, University of Konstanz, Germany.

Research questions

Based on the research findings outlined in the previous section, we suggest that search engines (SEs) can be conducive to the distribution of conspiratorial information and the formation of conspiracy beliefs if such information appears in top search results. Unlike social media or online blogs, web search results have not been examined in the context of the distribution of conspiratorial information, despite numerous analyses showing that biased or not fully accurate results are not uncommon there. With this paper, we strive to partially fill the identified gaps in both research on the presence of inaccurate information in web search and research on the spread of conspiracy theories online. Hence, our first research question concerns the general prevalence of conspiratorial information in top search results.

RQ1: How prevalent is conspiratorial information in web search results?

Because there are major discrepancies in the information provided by different SEs (e.g., Kravets and Toepfl, 2021; Makhortykh et al., 2020; Paramita et al., 2021), and in the results retrieved from different locations (Kliman-Silver et al., 2015), we find it worthwhile to address RQ1 from a comparative perspective to increase the robustness of the findings. Specifically, we compare the observations across different locations and on different search engines using the data collected at two different points in time. Further, we select two groups of conspiracy-related search queries, and expect to observe differences between the results retrieved for them, adding another comparative dimension to the analysis. Thus, we pose three sub-RQs in relation to RQ1:

RQ1a: Does the prevalence of conspiratorial information in search results vary across different locations and time periods?

RQ1b: Does the prevalence of conspiratorial information in search results vary across SEs, and how?

RQ1c: Does the prevalence of conspiratorial information in search results vary across search queries, and how?
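To make the comparative framing of RQ1a to RQ1c concrete, the following is a minimal sketch of how prevalence could be computed along each dimension. It assumes each search result has already been annotated as conspiratorial or not; the record layout, the field names (`engine`, `location`, `wave`, `query`, `conspiratorial`), and the toy values are illustrative assumptions, not the data or code used in the paper.

```python
from collections import defaultdict

# Hypothetical annotated search results (illustrative values only).
results = [
    {"engine": "A", "location": "X", "wave": 1, "query": "q1", "conspiratorial": True},
    {"engine": "A", "location": "Y", "wave": 1, "query": "q2", "conspiratorial": False},
    {"engine": "B", "location": "X", "wave": 2, "query": "q1", "conspiratorial": False},
    {"engine": "B", "location": "Y", "wave": 2, "query": "q2", "conspiratorial": False},
]


def prevalence_by(records, dimension):
    """Return the share of conspiratorial results for each value of `dimension`."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        key = record[dimension]
        totals[key] += 1
        flagged[key] += int(record["conspiratorial"])
    return {key: flagged[key] / totals[key] for key in totals}


# One comparison per sub-question: locations and waves (RQ1a),
# engines (RQ1b), and queries (RQ1c).
for dimension in ("location", "wave", "engine", "query"):
    print(dimension, prevalence_by(results, dimension))
```

Grouping by one dimension at a time keeps each sub-question separate; an actual analysis would likely also need to test whether the observed differences are statistically meaningful.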

On the one hand, research demonstrates that SEs tend to over-represent certain source types while under-representing others (e.g., Puschmann, 2019; Haim et al., 2018); on the other hand, it suggests that conspiracy theories are mainly propagated on social media or small conspiracy-related websites (Douglas et al., 2019; Stano, 2020). Hence, we juxtapose the presence of conspiracies in web search with the prioritization of different sources and pose the following RQ:

RQ2: What types of sources are prioritized by web SEs in relation to conspiracy theories?

In the case of RQ2, we also adopt a comparative approach similar to that of RQ1, comparing the results across different engines, locations, data collection waves, and queries:

RQ2a: Does the prevalence of sources of different types in search results vary across different locations and time periods?

RQ2b: Does the prevalence of sources of different types in search results vary across SEs, and how?

RQ2c: Does the prevalence of sources of different types in search results vary across search queries, and how?

Additionally, we examine the relationship between the source type and the stance towards conspiracy theories. To do so, we formulate the final research question:

RQ3: Are there differences in the share of conspiracy-promoting or conspiracy-debunking content coming from sources of different types?
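The comparison behind RQ3 amounts to a cross-tabulation of source type against stance. The sketch below shows one way such shares could be computed. The source-type labels, the stance labels (`promoting`, `debunking`, `no_mention`), and the toy records are assumptions made for illustration; they are not the paper's coding scheme or data.

```python
from collections import Counter, defaultdict

# Hypothetical annotated results (illustrative labels and values only).
annotated = [
    {"source_type": "social_media", "stance": "promoting"},
    {"source_type": "social_media", "stance": "promoting"},
    {"source_type": "social_media", "stance": "debunking"},
    {"source_type": "social_media", "stance": "no_mention"},
    {"source_type": "news", "stance": "debunking"},
    {"source_type": "news", "stance": "debunking"},
]


def stance_shares(records):
    """Return, for each source type, the share of results with each stance."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["source_type"]][record["stance"]] += 1
    shares = {}
    for source_type, stance_counts in counts.items():
        total = sum(stance_counts.values())
        shares[source_type] = {
            stance: n / total for stance, n in stance_counts.items()
        }
    return shares


shares = stance_shares(annotated)
for source_type, by_stance in shares.items():
    print(source_type, by_stance)
```

Normalizing within each source type, rather than over all results, makes source types with very different numbers of results directly comparable.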