
Analyzing Q Drops Shows Linguistic Features Distinct from Traditional Texts

by Ethnology Technology, December 7th, 2024

Too Long; Didn't Read

The study of the Q drops raises a number of specific challenges. The drops follow specific rules that most people would not use in another context. The structural brevity of the sentences prevents us from taking sentence length as a clue to who wrote what. The overwhelming proportion of interrogative forms makes it difficult to reason on morpho-syntactic sequences. The elliptic style of the Q drops, written almost as if they were telegrams, distorts the use of function words.

Authors:

(1) Florian Cafiero (ORCID 0000-0002-1951-6942), Sciences Po, Medialab;

(2) Jean-Baptiste Camps (ORCID 0000-0003-0385-7037), École nationale des chartes, Université Paris Sciences & Lettres.

Abstract and Introduction

Why work on QAnon? Specificities and social impact

Who is Q? The theories put to test

Authorship attribution

Results

Discussion

Corpus constitution

Quotes of authors outside of the corpus have been

Definition of two subcorpus: dealing with generic difference and an imbalanced dataset

The genre of “Q drops”: a methodological challenge

Detecting style changes: rolling stylometry

Ethical statement, Acknowledgements, and References

The genre of “Q drops”: a methodological challenge

The study of the Q drops raises a number of specific challenges. First of all, the Q drops constitute a kind of genre in themselves. They follow specific rules that most people would not use in another context: they do not look like a regular blog, media, or social media post, nor do they belong to any specific literary genre. This forces us to treat our attribution problem as a cross-topic attribution problem.


This specific genre affects many linguistic properties of the Q drops. The structural brevity of the sentences, for instance, prevents us from taking sentence length as a clue to who wrote what. The overwhelming proportion of interrogative forms, especially in the first Q drops, makes it difficult to reason on morpho-syntactic sequences, as they are often extremely and artificially stereotyped. Part-of-speech n-grams such as “interrogative pronoun - conjugated verb - common noun” would, for instance, emerge as a signature of the Q drops and could derail our analysis by pointing to whichever suspect uses interrogative forms most often in other contexts. Finally, the elliptic style of the Q drops, written almost as if they were telegrams, distorts the use of function words, which are less frequent than expected in the Q drops corpus. Approaches relying only on function words could be made less robust by this distortion.
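The two distortions described above can be made concrete with a small Python sketch. It is only an illustration, not the authors' pipeline: the function-word list is a deliberately tiny, hypothetical one, the sentence splitter is a naive regular expression, and both sample texts are invented for this example rather than drawn from any corpus.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption);
# real stylometric studies rely on much longer lists.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "to", "in", "and",
    "is", "was", "that", "it", "for", "on", "with",
}


def function_word_rate(text: str) -> float:
    """Return the share of word tokens that are function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in FUNCTION_WORDS)
    return hits / len(tokens)


def question_share(text: str) -> float:
    """Return the share of sentences that end with a question mark."""
    # Naive splitter: a sentence is any run of text closed by . ! or ?
    sentences = re.findall(r"[^.!?]+[.!?]", text)
    if not sentences:
        return 0.0
    questions = sum(1 for s in sentences if s.strip().endswith("?"))
    return questions / len(sentences)


# Invented sample texts, for illustration only.
telegraphic = "Who knew? Why now? Follow money. Watch news."
ordinary = "It is the kind of story that was on the news for a week."

for label, sample in (("telegraphic", telegraphic), ("ordinary", ordinary)):
    print(label, round(function_word_rate(sample), 2), round(question_share(sample), 2))
```

On these toy inputs the telegraphic text contains no function words at all and half of its sentences are questions, while the ordinary sentence is dominated by function words. A classifier built on either signal would therefore mostly measure the genre rather than the author.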


We thus chose to work on character trigrams, the most flexible and reliable feature we could use in the very specific context of this study. They are a widely acknowledged feature in stylometry, in particular for their supposed capacity to capture grammatical morphemes (Kestemont, 2014; Sapkota et al., 2015). We nonetheless bear in mind that they may be more sensitive to thematic attraction than function words.
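As a minimal sketch of what a character-trigram feature looks like, the Python snippet below builds a relative-frequency profile from overlapping three-character windows. The normalization steps (lowercasing, collapsing whitespace) and the function name `char_trigram_profile` are assumptions made for this illustration; the source does not specify the authors' preprocessing.

```python
from collections import Counter
from typing import Dict


def char_trigram_profile(text: str) -> Dict[str, float]:
    """Return relative frequencies of overlapping character trigrams."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    normalized = " ".join(text.lower().split())
    trigrams = [normalized[i:i + 3] for i in range(len(normalized) - 2)]
    if not trigrams:
        return {}
    counts = Counter(trigrams)
    total = len(trigrams)
    return {gram: count / total for gram, count in counts.items()}


# Invented sample text, for illustration only.
profile = char_trigram_profile("Walking, talking, asking.")

# Show the most frequent trigrams in the sample.
for gram, freq in sorted(profile.items(), key=lambda item: -item[1])[:3]:
    print(repr(gram), round(freq, 3))
```

In this toy input the trigram "ing" recurs in every word, which illustrates how such features can pick up grammatical morphemes. The same mechanism also explains the caveat above: content-word fragments enter the profile too, so topic can leak into the features.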


This paper is available on arXiv under a CC BY 4.0 DEED license.