GDPR and Non-PII Data: Addressing Privacy Vulnerabilities in Nanotargeting

by Netizenship Meaning in Online Communities

May 30th, 2024

Too Long; Didn't Read

The dataset primarily consists of users from the US, potentially introducing biases. Despite this, proof of concept ad campaigns targeting users in different countries align with model predictions, suggesting minimal bias. Various scenarios validate the model, and LinkedIn ad reports support accurate data tracking.

Netizenship is internet citizenship. We publish academic research on digital rights of online community members.

Academic Research Paper

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Authors:

(1) Ángel Merino, Department of Telematic Engineering Universidad Carlos III de Madrid {angel.merino@uc3m.es};

(2) José González-Cabañas, UC3M-Santander Big Data Institute {jose.gonzalez.cabanas@uc3m.es};

(3) Ángel Cuevas, Department of Telematic Engineering Universidad Carlos III de Madrid & UC3M-Santander Big Data Institute {acrumin@it.uc3m.es};

(4) Rubén Cuevas, Department of Telematic Engineering Universidad Carlos III de Madrid & UC3M-Santander Big Data Institute {rcuevas@it.uc3m.es}.

Abstract and Introduction

LinkedIn Advertising Platform Background

Dataset

Methodology

User’s Uniqueness on LinkedIn

Nanotargeting proof of concept

Discussion

Related work

Ethics and legal considerations

Conclusions, Acknowledgments, and References

Appendix


A Country distribution of users in our data sample

Our dataset contains samples from 58 countries, but a few of them, most notably the United States, account for most of it. Table 4 shows the number of users per country in our dataset. About 75% of the users are from the United States.


We acknowledge that this imbalance may bias the results of our model and, therefore, the estimation of N. However, the proof of concept experiment targeted users in a country other than the US, and its results align with the model outcome, which makes us confident that any potential bias is slight.


As discussed in the paper, our intuition is that our model serves as an upper bound for the number of skills needed to make a user unique. This may be because the United States is one of the countries with the most LinkedIn users. It therefore seems reasonable to expect that, in many cases, users reporting a location other than the US will be easier to re-identify.


B Model fitting in the other considered scenarios

We apply our methodology to the four scenarios summarized in Section 4.3. This appendix shows the model fitting for the three scenarios not shown in the body of the paper: (ii) Sk_LP (Figure 7), where we use only the least popular professional skills; (iii) Lo_R (Figure 8), where we use the location and professional skills selected at random; and (iv) Lo_LP (Figure 9), where we use the location and the least popular skills. Table 1 shows the R² values for all the line fittings.
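As a rough illustration of the kind of line fitting and R² computation mentioned above, the following Python sketch fits a straight line to log10(audience size) against the number of skills and then solves for the number of skills at which the fitted line reaches an audience size of 1. The log-linear form, the handling of the 300-user floor, the function names, and the data points are all assumptions made for illustration; none of them are taken from the paper or its dataset.

```python
import math

# Assumption: audience sizes at or below this floor are treated as
# uninformative and excluded from the fit.
AUDIENCE_FLOOR = 300


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b. Returns (a, b, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return a, b, r2


def skills_for_unit_audience(a, b):
    """Number of skills where the fitted log10(audience) is 0 (audience = 1)."""
    return -b / a


# Hypothetical data: number of skills and the corresponding audience size.
n_skills = [1, 2, 3, 4, 5, 6]
audience = [10**6, 10**5, 10**4, 10**3, 300, 300]

# Keep only the points above the floor, then fit in log10 space.
points = [(n, math.log10(a)) for n, a in zip(n_skills, audience)
          if a > AUDIENCE_FLOOR]
xs = [p[0] for p in points]
ys = [p[1] for p in points]

slope, intercept, r2 = fit_line(xs, ys)
n_unique = skills_for_unit_audience(slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f} N={n_unique:.2f}")
```

Fitting in log space is chosen here only because it turns a roughly exponential decay into a straight line; a different functional form would need a different transform.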



Table 4: Distribution of users in our dataset per country.

Figure 7: Application of the methodology to the Sk_LP scenario for V_AS(Q) with Q = 50, 75 and 90. The figure shows the model fitting (lines) to the data obtained from our dataset (markers). It also shows the audience size asymptote at 300 and a bold line where the audience size equals 1.

Figure 8: Application of the methodology to the Lo_R scenario for V_AS(Q) with Q = 50, 75 and 90. The figure shows the model fitting (lines) to the data obtained from our dataset (markers). It also shows the audience size asymptote at 300 and a bold line where the audience size equals 1.

Figure 9: Application of the methodology to the Lo_LP scenario for V_AS(Q) with Q = 50, 75 and 90. The figure shows the model fitting (lines) to the data obtained from our dataset (markers). It also shows the audience size asymptote at 300 and a bold line where the audience size equals 1.


C LinkedIn dashboard report for the proof of concept experiment ad campaigns

Some of the results reported in Table 3 were extracted from the LinkedIn Campaign Manager report for the different advertising campaigns delivered. Figure 10 shows a snapshot of the information reported by LinkedIn for the 15 ad campaigns executed in our proof of concept experiment. It shows the ID of the campaign, the budget spent, the number of delivered ad impressions, and the number of clicks received.


While we acknowledge that the report may not be 100% accurate, the impressions counted by the targeted individuals in our experiment always match the number of impressions reported by LinkedIn. This leads us to think not only that the targeted users were the only ones who received their corresponding ad, but also that the LinkedIn count is accurate, at least for audiences close to 1 user.
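The consistency check described in this appendix amounts to comparing, campaign by campaign, the impressions counted by the targeted individuals with those reported by the dashboard. A minimal Python sketch of such a comparison follows; the campaign IDs, the counts, and the function name are hypothetical and are not taken from the paper or from Figure 10.

```python
def mismatched_campaigns(self_counted, reported):
    """Return the sorted campaign IDs whose two impression counts differ.

    A campaign that appears in only one of the two mappings is also
    flagged, since its counts cannot be confirmed to match.
    """
    all_ids = set(self_counted) | set(reported)
    return sorted(
        cid for cid in all_ids
        if self_counted.get(cid) != reported.get(cid)
    )


# Hypothetical counts: impressions noted by the targeted users versus
# impressions shown in the advertising dashboard.
self_counted = {"campaign-01": 3, "campaign-02": 1, "campaign-03": 0}
reported = {"campaign-01": 3, "campaign-02": 1, "campaign-03": 0}

mismatches = mismatched_campaigns(self_counted, reported)
print("all counts match" if not mismatches else f"mismatches: {mismatches}")
```

An empty result only shows that the two counts agree; it does not by itself rule out impressions that neither side recorded.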



Figure 10: Screenshot of the LinkedIn Campaign Manager dashboard, including the results of the 15 ad campaigns executed in our proof of concept experiment.



This paper is available on arXiv under a CC BY-NC-ND 4.0 DEED license.

