This paper is available on arXiv under the CC0 1.0 DEED license.
Authors:
(1) Salim Chouaki, LIX, CNRS, Inria, Ecole Polytechnique, Institut Polytechnique de Paris;
(2) Oana Goga, LIX, CNRS, Inria, Ecole Polytechnique, Institut Polytechnique de Paris;
(3) Hamed Haddadi, Imperial College London, Brave Software;
(4) Peter Snyder, Brave Software.
Our measurement methodology has some limitations. First, we only look for user identifiers transferred in query parameters and do not detect identifiers transferred by other means. For instance, previous work [33, 39] found that trackers sometimes decorate their own URL in document.referrer with user identifiers and read them on the destination page. Second, we run all our crawling iterations from the same IP address. Consequently, if some query parameters are derived from the IP address, they will have the same value across all iterations, and we would not consider them user identifiers. Finally, our results may vary with the ads we selected and the search queries we used. Different search queries could trigger different ads and lead to different advertisers, which may behave differently. Nonetheless, our primary objective is to demonstrate the potential for third-party tracking when interacting with ads on private search engines.
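To make the first two limitations concrete, the following is a minimal sketch of a cross-iteration comparison, not the exact pipeline used in this paper. It assumes that a query parameter is treated as a candidate user identifier only if its value changes between crawling iterations. The function name `candidate_identifier_params`, the example URLs, and the parameter names `click_id`, `geo`, and `utm_source` are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def candidate_identifier_params(urls_per_iteration):
    """Return the names of query parameters whose value differs between iterations.

    Assumption for illustration: each entry is the landing URL observed in one
    crawling iteration. Only the query string is inspected, so identifiers
    passed by other means (for example via document.referrer) are not seen.
    """
    values_by_param = {}
    for url in urls_per_iteration:
        query = parse_qs(urlsplit(url).query, keep_blank_values=True)
        for name, values in query.items():
            values_by_param.setdefault(name, set()).add(tuple(values))
    # A parameter with a single distinct value across iterations is not flagged.
    return sorted(name for name, seen in values_by_param.items() if len(seen) > 1)


# Hypothetical landing URLs from three iterations run from the same IP address.
urls = [
    "https://advertiser.example/landing?click_id=abc123&geo=1a2b&utm_source=search",
    "https://advertiser.example/landing?click_id=def456&geo=1a2b&utm_source=search",
    "https://advertiser.example/landing?click_id=ghi789&geo=1a2b&utm_source=search",
]
print(candidate_identifier_params(urls))
```

In this sketch, the hypothetical `geo` parameter stands in for an IP-derived value: because every iteration shares the same IP address, its value never changes and it is not flagged, which is the blind spot described above.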