
Linguistic Analysis Reveals Authorship Changes and Collaboration in QDrop Posts

by Ethnology, December 7th, 2024

Too Long; Didn't Read

If the author of the Qdrops is among our candidates, the results here seem to demonstrate the major role of Ron W. in the writing of the Qdrops, at least since the switch to 8chan. The peak for Paul F. in one of the two analyses could well reflect real participation, perhaps even a leading role for the whole period before 8chan, possibly followed by a brief period of collaboration (or competition).

Authors:

(1) Florian Cafiero (ORCID 0000-0002-1951-6942), Sciences Po, Medialab;

(2) Jean-Baptiste Camps (ORCID 0000-0003-0385-7037), École nationale des chartes, Université Paris Sciences & Lettres.

Abstract and Introduction

Why work on QAnon? Specificities and social impact

Who is Q? The theories put to test

Authorship attribution

Results

Discussion

Corpus constitution

Quotes of authors outside of the corpus have been

Definition of two subcorpus: dealing with generic difference and an imbalanced dataset

The genre of “Q drops”: a methodological challenge

Detecting style changes: rolling stylometry

Ethical statement, Acknowledgements, and References

Discussion

If the author of the Qdrops is among our candidates, the results here seem to demonstrate the major role of Ron W. in writing them, at least since the switch to 8chan. The peak for Paul F. in one of the two analyses, for the period before 8chan, could well reflect real participation, perhaps even a leading role throughout that period, possibly followed by a brief period of collaboration (or competition) between the migration to 8chan and what Paul F. himself describes as the last authentic Qdrop.



Figure 2: The 10 largest coefficients (negative and positive) of the Linear SVC classifiers for Christina U., Donald T., Michael F., Paul F. and Ron W., trained on the larger (left) and smaller (right) corpora


Localised peaks for Christina U. or Michael F., on the other hand, might very hypothetically reveal more occasional collaborations, but should probably not be over-interpreted. Given the nature of the coefficients the model relies on for them (fig. 2), these peaks seem more likely to be caused by ‘topic similarities’ arising from the news and topics dealt with in the Qdrops, which could result in the choice of a similar lexicon, and even in quotations or paraphrases. Confusions between Michael F. or Christina U. on the one hand, and other candidates (in training) or Q on the other, seem due to attraction in terms of language register and generic peculiarities: samples that use a more elevated and grandiloquent type of patriotic address are drawn somewhat closer to the Michael F. samples, while those including heavier news-related content may be drawn towards Christina U.
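To make the kind of inspection behind fig. 2 concrete, here is a minimal sketch of ranking a linear classifier's weights to see which features pull samples towards or away from a candidate. It assumes the weights have already been extracted into a feature-to-weight mapping (scikit-learn's `LinearSVC`, for example, exposes them through its `coef_` attribute). The function name, the feature strings and the numbers are invented for illustration and do not come from the paper.

```python
def top_coefficients(weights, k=10):
    """Return the k most negative and the k most positive (feature, weight) pairs.

    `weights` maps a feature name to its coefficient in a linear classifier
    trained for one candidate (hypothetical input format).
    """
    ranked = sorted(weights.items(), key=lambda item: item[1])
    # Most negative first; keep only weights that are actually negative.
    negative = [(f, w) for f, w in ranked[:k] if w < 0]
    # Most positive first; the slice start is clamped so that k=0 yields nothing.
    tail = ranked[max(len(ranked) - k, 0):]
    positive = [(f, w) for f, w in reversed(tail) if w > 0]
    return negative, positive


# Invented weights for one hypothetical candidate.
toy_weights = {
    "patriot": 0.42,
    "breaking": -0.31,
    "report": -0.18,
    "god": 0.27,
    "the": 0.01,
}
negative, positive = top_coefficients(toy_weights, k=2)
```

If the highest-weighted features turn out to be topical words rather than function words or character patterns, that is a hint that a peak reflects shared subject matter more than shared style, which is the reading suggested above.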


A more exploratory visualization of the feature space, based on correspondence analysis, shows a strong opposition on the first dimension between Paul F.'s private posts and his public writings, while the second dimension opposes the Paul F. and Ron W. samples (fig. 3). Projected into this feature space, the Qdrops from the 8chan period, after the ‘board compromised’ post, mostly fall inside the Ron W. data cloud, while earlier posts, especially from the 4chan period, are situated substantially closer to Paul F.'s writings.
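The idea of projecting unseen samples into a correspondence analysis space can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it computes only the first axis, uses power iteration instead of a full singular value decomposition, and runs on invented counts. The function names and the toy data are assumptions made for the example.

```python
import math


def first_ca_axis(table, n_iter=500):
    """Standard column coordinates on the first correspondence analysis axis.

    `table` is a list of rows (samples), each a list of feature counts.
    """
    total = sum(sum(row) for row in table)
    row_mass = [sum(row) / total for row in table]
    col_mass = [sum(col) / total for col in zip(*table)]

    # Standardized residuals: (P - r c^T) / sqrt(r c^T).
    resid = [
        [(n / total - r * c) / math.sqrt(r * c) for n, c in zip(row, col_mass)]
        for row, r in zip(table, row_mass)
    ]

    # Power iteration on resid^T resid converges to the leading right
    # singular vector (assuming the start vector is not orthogonal to it).
    v = [float(j + 1) for j in range(len(col_mass))]
    for _ in range(n_iter):
        u = [sum(s * x for s, x in zip(row, v)) for row in resid]
        w = [sum(row[j] * u_i for row, u_i in zip(resid, u)) for j in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    return [x / math.sqrt(c) for x, c in zip(v, col_mass)]


def project(counts, col_coords):
    """Coordinate of one sample on the axis: its profile times the column coordinates.

    Works for samples used to fit the axis and for supplementary ones alike.
    """
    total = sum(counts)
    return sum(n / total * g for n, g in zip(counts, col_coords))


# Invented counts of three features for two hypothetical candidates.
author_a = [[30, 5, 5], [28, 6, 6]]
author_b = [[5, 30, 5], [6, 28, 6]]
axis = first_ca_axis(author_a + author_b)

unknown = [20, 4, 4]  # supplementary sample, not used to fit the axis
coords_a = [project(row, axis) for row in author_a]
coords_b = [project(row, axis) for row in author_b]
coord_unknown = project(unknown, axis)
```

In this toy setting the two candidates land on opposite sides of the axis, and the supplementary sample falls on the side of the candidate whose feature profile it resembles. The sign of the axis itself is arbitrary; only relative positions are meaningful.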


This paper of course has limitations. The very nature of the Qdrops, a genre in itself, and the brevity of these texts make it difficult to render a finer-grained picture than the one we present here. It is thus plausible, even if we cannot demonstrate it, that other occasional interventions occurred. The training data we collected is substantial, and sufficient to obtain excellent performance. Yet more training data could of course help increase the precision and reliability of the analyses. A media outlet, for instance, collected the Facebook posts written by one of our candidate authors, for whom more support (training samples) would have been helpful (Zadrozny and Collins, 2018). More generally, other individuals not listed here could of course have participated in the writing.


This paper is available on arXiv under a CC BY 4.0 DEED license.