Targeting and fake followers: how can we solve the most acute problems of influencer marketing?

Influencer marketing prospects are impressive. The market has doubled over the last year and a half and currently amounts to over $2 billion. No wonder: bloggers give brands access to a huge audience on social networks. Instagram bloggers lead by a wide margin: 69% of marketers promote products with their help. However, several obstacles hold this market back, and because of them small businesses rarely risk using the services of Instagram stars. Prices are extremely high, guarantees are not always provided, and selecting relevant influencers is too complicated. Data scientist Arthur Suilin presents the problems of this new marketing industry and their solutions.

Editor and co-author: Egor Perezhogin

Stubborn Instagram stars with crazy high prices

Advertising on the pages of famous influencers costs a staggering sum. For example, Kylie Jenner reportedly makes $1 million per paid Instagram post. TV advertising with a Hollywood actress is far cheaper. But prices are not the only thing that scares advertisers away. Top influencers can behave worse than capricious stars, and they get away with it.

Screenshot © Instagram / Luka Sabbat

Luka Sabbat was sued for failing to live up to an agreement to promote Snap Spectacles on his Instagram account. PR Consulting Inc. (PRC) seeks reimbursement of $45,000 paid upfront plus another $45,000 in additional damages. According to the agreement, the 20-year-old Instagram blogger was obliged to make four posts with Snap Spectacles on his page with 1.4 million followers. But he failed to make some of the posts and did not provide PRC with analytics for his first Instagram Story.

Even if Sabbat loses this case, he wins strategically. The scandal only raises his fame: he now has almost 2 million followers, and his prices have increased accordingly.

Simple fake follower tricks, or a potato with 10 thousand followers

Lena Katz, a branded content strategist, conducted a demonstrative experiment.
She photographed a simple baked potato, created an Instagram account for it, and within a couple of weeks supplied it with 10 thousand fake followers. The vegetable became a star, having received a bunch of likes and comments. Any Internet marketer can look at this page and think, "For some mysterious reason, people are following the potato, so it has influence." Lena laughs: "Actually, I just bought all those subscribers and their engagement."

Screenshot © Instagram / PotatoMcTato

The illusion of engagement is created not only through the direct purchase of subscribers. Bloggers unite in mutual support groups, where everyone follows each other and zealously churns out comments and likes. Mommy bloggers are especially vigorous in this respect.

1000 Alexis Bakers vs. 1 Kardashian: numerical advantage

The more subscribers an account has, the lower their engagement: experts have shown that this simple rule holds reliably.

Source: https://influencermarketinghub.com/influencer-marketing-2019-benchmark-report/

Thus we gradually arrive at the conclusion that a nano-influencer (with <10 thousand subscribers/followers) has clear advantages as an ambassador of your product on Instagram:

- Subscribers trust him more, because among them there are many friends, relatives, acquaintances, and people who know him personally.
- Engagement of such subscribers is almost 7 times higher compared to top influencers.
- Prices of nano-influencers are low. Many of them will agree to work for free, simply for samples of your products.
Mae Karwowski, CEO of the marketing agency Obvious.ly, aptly remarked, "You're able to place a lot of really small bets rather than, 'We're going to work with Kim Kardashian.'"

Screenshots © Instagram / kimkardashian / alexisbakerrr
Left: Kim Kardashian, 146 million subscribers, post price from $500,000
Right: Alexis Baker, 3,262 subscribers, post price equal to a slice of pizza

Clearly, the nano-influencer marketing strategy is promising for large brands, interesting for medium-sized companies, and potentially irreplaceable for small businesses. But there is a problem: how do you find nano- and micro-influencers whose audience is relevant to you?

An example of how to waste your advertising budget

The collapse of the advertising campaign launched by the cosmetics brand Clarins is indicative. Its marketing experts paid for posts by 4 top influencers, girls from the UAE (with 100+ thousand subscribers each). Only two of them provided an acceptable ROI. Deep.Social, the company where the author of this article previously worked, conducted an in-depth analysis of the pages of the "failed" bloggers. The explanation turned out to be remarkably simple: 80% of their followers were men from third world countries. Such subscribers were, in principle, not going to spend their money on cosmetics. In other words, the influencers were chosen incorrectly.

Noorstars, one of the most popular influencers from the UAE | Screenshot © Instagram / noorstars

In fact, most of the problems of promoting products through Instagram influencers can be reduced to just two purely technical issues: high-quality targeting and identifying fake followers. Marketing experts advertising through Google AdWords can focus their campaigns by keywords, targeting an arbitrarily narrow thematic segment of an audience. Instagram does not have this feature but offers billions of hashtags and millions of influencers, many of whom have fake followers.
How do you choose the most relevant bloggers, whose audience is real and suitable for your particular product? Existing techniques are imperfect. Let us consider them in detail, starting with targeting issues.

Targeting: problems and solutions

Impassable jungles of thematic analysis

The obvious way to select the right bloggers is to use corresponding thematic trees-networks. They are used by most advertising systems, including Facebook Ads. Creating such trees-networks is time-consuming and is carried out in two stages:

1: Marketing experts manually compose a thematic/topical tree. For example: topic Meal. Obviously, a Thai restaurant in London's Walthamstow district needs not just Meal, but the entire thematic chain Meal: Restaurants: London: Walthamstow, and so on. The number of topics interesting to advertisers is practically unlimited. The narrower the niche in which they work, the more detailed a division by topics they require. Thus, the tree grows enormously.

2: This subjectively "grown" tree is transformed into a network that catches bloggers tagged with the required topics. The more topics are selected, the more difficult it is to find a blogger corresponding to all of them. To correlate each blogger with each topic of the tree, either manual work or machine learning systems are needed. In both cases, due to the initial subjectivity of the tree, the risk of human error is very high.

The more topics there are, the larger the tree grows and the more effort it takes to keep the tree-network up to date (new trends arise each day, new topics emerge and old topics fade) and to update the corresponding training samples for machine learning. Small thematic trees with truncated branches are inflexible to changes and provide unacceptably coarse filters for finding bloggers.
Hashtag focusing will not work

What if you focus your advertising campaign on Instagram using a set of keywords, as in AdWords, but with hashtags instead of keywords? This solution does not work, for several reasons:

1. Bloggers writing about cars post tens and even hundreds of different hashtags: #car, #auto, #fastcars, #wheels, #drive, #bmw, #audi, etc.
2. Bloggers can use the hashtag #car incidentally, just once, for example when they take a photo of an interesting car.
3. Bloggers can use popular hashtags such as #cat just to draw attention to their posts, without any specific meaning.

Selecting bloggers by their tags alone will not work correctly. Smarter methods are required.

Thematic modeling: impressive theory and mediocre practice

Modern natural language processing has a subject area named topic modeling. Let us consider a very primitive social network on which people have only two basic interests: Food and Japan. After analyzing the power of these interests on a scale from 0 to 1, any hashtag used by bloggers on this virtual social network can be placed in the following 2D diagram.

Sample topical 2D diagram

Every hashtag is thus described by a pair of numbers (from 0 to 1) corresponding to the X and Y coordinates in the 2D thematic/topical space. Using the diagram, it is possible to calculate the centroid (a point with "averaged" coordinates in the topical space) of a particular post with several tags. The centroid coordinates along the X and Y axes correspond to the relevance of the post to the topics Food and Japan, respectively. The closer a coordinate is to 1, the higher the relevance. Thus, having calculated the centroid of all posts of a particular blogger, it is possible to understand which topics are generally relevant to his/her content.

In real topic modeling, not two but hundreds of topics are used, and the corresponding tags exist in a high-dimensional space.
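As a minimal sketch of the centroid calculation described above, the following Python snippet averages tag coordinates in a two-topic (Food, Japan) space. The tag coordinates and all function names are hypothetical and chosen only for illustration; they do not come from a real model.

```python
# Hypothetical (Food, Japan) coordinates for a few tags; illustrative values only.
TAG_COORDS = {
    "#sushi": (0.9, 0.8),
    "#ramen": (0.8, 0.7),
    "#pizza": (0.9, 0.1),
    "#tokyo": (0.2, 0.9),
}


def centroid(points):
    """Return the coordinate-wise average of a non-empty collection of points."""
    points = list(points)
    if not points:
        raise ValueError("cannot average an empty set of points")
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def post_centroid(tags, coords=TAG_COORDS):
    """Centroid of one post: the average position of its known tags."""
    return centroid(coords[t] for t in tags if t in coords)


def blogger_centroid(posts, coords=TAG_COORDS):
    """Centroid of a blogger: the average of the centroids of their posts.

    Posts that contain no known tag are skipped.
    """
    usable = [p for p in posts if any(t in coords for t in p)]
    return centroid(post_centroid(p, coords) for p in usable)


food, japan = post_centroid(["#sushi", "#tokyo"])
print(f"Food relevance: {food:.2f}, Japan relevance: {japan:.2f}")
```

Averaging per post first and then across posts keeps one heavily tagged post from dominating the blogger's profile; this is one reasonable design choice, not necessarily the one used in any production system.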
Let us look at the following table with modeling results for 15 topics obtained using the BigARTM library.

[Table: Topics | Top tags]

As you can see, a reasonable structure is clearly visible, but the topics are far from perfect. The reason is that topic modeling is intended to work with documents containing hundreds or thousands of words. In our case, most posts have only 2–3 tags. As a result, for the Prometeus Network project, our development team chose a different modeling method that is simpler and more powerful at the same time.

TopicTensor model: theory

The key advantage of thematic/topical modeling is the interpretability of its results. Any word/tag from a post can be weighed on a special diagram showing how close the post is to each considered topic. But this plus turns into a minus, since it limits the number of considered topics, while on Instagram their number is almost infinite. Therefore, once the requirement of a fixed number of topics is removed, machine learning for selecting bloggers becomes much more efficient.

We get a model that is close in essence to the well-known Word2Vec. Each tag is represented as a vector in N-dimensional space:

$$w \in \mathbb{R}^N$$

The degree of similarity (i.e. how close the topics are) between tags $w$ and $w'$ can be calculated

as a dot product:

$$\mathrm{sim}(w, w') = w \cdot w' = \sum_{i=1}^{N} w_i w'_i$$

as Euclidean distance (here a smaller value means closer tags):

$$d(w, w') = \lVert w - w' \rVert = \sqrt{\sum_{i=1}^{N} (w_i - w'_i)^2}$$

as cosine similarity:

$$\mathrm{sim}(w, w') = \frac{w \cdot w'}{\lVert w \rVert \, \lVert w' \rVert}$$

The task of the model during learning is to find tag representations that are useful for one of the following predictions:

- Based on one tag, predict what other tags will be included in the post (Skip-gram architecture)
- Based on all post tags except one, predict the missing tag (CBOW architecture, "bag-of-words")
- Take two random tags from the post and, based on the first one, predict the second

All these predictions boil down to the fact that there is a target tag which needs to be predicted and a context represented by one or more tags included in the post.
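The similarity measures and the first two prediction setups can be illustrated with a short, dependency-free Python sketch. The function names are hypothetical, and the example generators only show how (context, target) pairs could be formed from the tags of one post; they are an assumption for illustration, not the training code of any particular library.

```python
import math
from itertools import permutations


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Euclidean distance; smaller values mean closer tags."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(u, v) / norm


def skipgram_examples(tags):
    """Skip-gram style: one tag as context, another tag of the post as target."""
    return [([context], target) for context, target in permutations(tags, 2)]


def cbow_examples(tags):
    """CBOW style: all tags but one as context, the held-out tag as target."""
    return [(tags[:i] + tags[i + 1:], target) for i, target in enumerate(tags)]


post = ["#bmw", "#audi", "#car"]
for context, target in cbow_examples(post):
    print(context, "->", target)
```

In both generators every example has the same shape, a list of context tags plus one target tag, which is exactly the common form that the three prediction types reduce to.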
The model should maximize the probability of the tag $w$ depending on the context $c$; this can be represented as a softmax criterion:

$$P(w \mid c) = \frac{\exp(\mathrm{sim}(w, c))}{\sum_{w' \in V} \exp(\mathrm{sim}(w', c))}$$

where $\mathrm{sim}(\cdot,\cdot)$ is the similarity between a tag vector and the context vector, and $V$ is the set of all tags.

But calculating softmax across the entire set of tags is expensive (a million tags or more can participate in learning), so alternative methods are used instead. They boil down to having a positive example $w^+$ that must be predicted and randomly selected negative examples $w^-_k$ showing what should not be predicted. Negative examples should be sampled from the same tag frequency distribution as in the training data. The loss function for a set of examples can take the form of a binary classification (negative sampling in classic Word2Vec) or work as a ranking loss, comparing pairwise the "compatibility" of the positive and negative examples with the context:

$$L = \sum_{k} l\big(\mathrm{sim}(w^+, c),\ \mathrm{sim}(w^-_k, c)\big)$$

where $l(\cdot,\cdot)$ is a ranking function, for which max margin loss with margin $\mu$ is often used:

$$l(s^+, s^-) = \max(0,\ \mu - s^+ + s^-)$$

The TopicTensor model is also equivalent to matrix factorization, but instead of the "document-word" matrix (as in topic modeling), here the "context-tag" matrix is factorized, which for some types of predictions turns into a "tag-tag" co-occurrence matrix.

Practical implementation of TopicTensor V1.0

Several possible ways to implement the model were considered: TensorFlow code, PyTorch code, the Gensim library, and the StarSpace library. The last option was chosen because it required minimal modification (all the necessary functionality is already there), gave high quality, and parallelized almost linearly across any number of cores (32- and 64-core machines were used to speed up the learning). By default, StarSpace uses the max margin ranking loss and cosine distance as the measure of vector proximity. Subsequent experiments with hyperparameters showed that these default settings are optimal.

Results

The resulting embeddings showed excellent separation of topics, good generalization ability, and resistance to spam tags. A demo sample of the top 10K tags (English only) is available for viewing in Embedding Projector.
After following the link to Embedding Projector, switch to t-SNE mode (tab in the lower left) and wait for about 500 iterations until the 3D projection is built. It is best viewed in Color by = logcnt mode. If you do not want to wait, select Default in the Bookmarks section in the lower right corner, and the precalculated projection will be loaded immediately.

Topic formation examples

Let's start with the simplest case: set the topic by one tag and find the top 50 relevant tags.

Topic set by the tag #bmw

Tags are colored according to relevance. Tag size is proportional to its popularity.

As you can see, TopicTensor did a fine job of shaping the BMW topic and found many relevant tags that most people don't even know exist.

Let's complicate the task and form a topic from several German auto brands (find the tags that are closest to the sum of the vectors of the input tags):

Topic set by the tags #bmw, #audi, #mercedes, #vw

This example shows TopicTensor's ability to generalize: TopicTensor understood that we mean cars in general (the #car and #cars tags). It also understood that preference should be given to German cars (tags circled in red) and added the "missing" tags: #porsche (also a German auto brand) and spelling variants of tags that were not in the input: #mercedesbenz, #benz, and #volkswagen.

Let's complicate the task even more and create a topic based on the ambiguous tag #apple, which can designate both a brand and just a fruit.

Topic set by the tag #apple

It can be seen that the brand theme dominates; however, the fruit theme is also present in the form of the tags #fruit, #apples, and #pear.

Let's try to isolate a clean "fruit" theme. For this, we add a few tags related to the Apple brand with a negative weight.
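The idea of defining a topic by a sum of tag vectors, with negative weights for unwanted meanings, can be sketched in a few lines of Python. The 2D vectors below are made up for illustration (one axis loosely standing for a tech brand theme, the other for fruit), and the function names are hypothetical; real embeddings would have far more dimensions.

```python
import math

# Hypothetical 2D tag vectors: axis 0 ~ "tech brand", axis 1 ~ "fruit".
TAG_VECTORS = {
    "#apple": [0.8, 0.6],
    "#iphone": [1.0, 0.0],
    "#ipad": [0.9, 0.1],
    "#fruit": [0.0, 1.0],
    "#pear": [0.1, 0.9],
}


def topic_vector(weights, tag_vectors=TAG_VECTORS):
    """Weighted sum of the vectors of the input tags."""
    dims = len(next(iter(tag_vectors.values())))
    topic = [0.0] * dims
    for tag, weight in weights.items():
        for d, value in enumerate(tag_vectors[tag]):
            topic[d] += weight * value
    return topic


def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_tags(weights, tag_vectors=TAG_VECTORS, top_k=50):
    """Tags closest to the weighted topic, excluding the input tags themselves."""
    topic = topic_vector(weights, tag_vectors)
    candidates = [t for t in tag_vectors if t not in weights]
    candidates.sort(
        key=lambda t: cosine_similarity(topic, tag_vectors[t]), reverse=True
    )
    return candidates[:top_k]


print(nearest_tags({"#apple": 1.0}, top_k=2))                   # brand theme leads
print(nearest_tags({"#apple": 1.0, "#iphone": -1.0}, top_k=2))  # fruit theme leads
```

Excluding the input tags from the result is a choice made here for readability of the output; a real search could just as well keep them.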
Accordingly, we will look for tags that are closest to the weighted sum of the input tag vectors (by default, the weight is equal to one). It can be seen that negative weights removed the brand theme, and only the fruit theme remained.

Topic set by the tag #mirror

TopicTensor is aware that the same concept can be expressed by different words in different languages, as can be seen in the example with #mirror. Alongside the English mirror and reflection, it came up with зеркало and отражение in Russian, espejo and reflejo in Spanish, espelho and reflexo in Portuguese, specchio and riflesso in Italian, and spiegel and spiegelung in German.

Topic set by the tag #boobs

The last example shows that casual themes work as well as branded ones :)

Selection of bloggers

For each blogger, his posts are analyzed and the vectors of all tags included in them are summed:

[formula]

where |posts| is the number of posts and |tags_i| is the number of tags in the i-th post.

The resulting vector β is the topic of the blogger. Then the bloggers whose topic vectors are closest to the topic vector defined by the user are found. The list is sorted by relevance and returned to the user.

The ranking additionally takes into account the popularity of the blogger and the number of tags in his posts, because otherwise bloggers who have a single post with a single tag matching the user's input would rise to the top. The final score by which bloggers are sorted is calculated as follows:

[formula]

where λ, ϕ, τ are empirically selected coefficients lying in the interval 0…1.

Calculating the cosine distance across the entire array of bloggers (several million accounts participate in the selection) takes considerable time. To speed up the selection, the NMSLIB (Non-Metric Space Library) library was used, which reduced the search time by an order of magnitude.
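A brute-force version of this selection can be sketched in plain Python. Two things here are assumptions rather than the article's method: the blogger vector is computed as the average over posts of the average tag vector in each post (a guess based on the |posts| and |tags_i| normalizers), and the popularity and tag-count terms of the final score are omitted because their formula is not given. All names and vectors are hypothetical.

```python
import math


def mean_vector(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]


def blogger_topic(posts, tag_vectors):
    """Assumed form: mean over posts of the mean tag vector of each post.

    Returns None when no post contains a known tag.
    """
    post_means = []
    for tags in posts:
        known = [tag_vectors[t] for t in tags if t in tag_vectors]
        if known:
            post_means.append(mean_vector(known))
    return mean_vector(post_means) if post_means else None


def cosine_similarity(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_bloggers(query, blogger_vectors, top_k=10):
    """Exhaustive search: score every blogger against the query topic vector."""
    scored = [
        (cosine_similarity(query, vec), name)
        for name, vec in blogger_vectors.items()
        if vec is not None
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical 2D tag vectors and two toy bloggers.
tag_vectors = {"#bmw": [1.0, 0.0], "#audi": [0.9, 0.1], "#cat": [0.0, 1.0]}
bloggers = {
    "car_fan": blogger_topic([["#bmw", "#audi"], ["#bmw"]], tag_vectors),
    "cat_fan": blogger_topic([["#cat"]], tag_vectors),
}
print(rank_bloggers(tag_vectors["#bmw"], bloggers))
```

This exhaustive loop scores every account for every query, which is the cost that becomes prohibitive at the scale of several million bloggers.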
NMSLIB pre-builds indices on the coordinates of the vectors in space, which makes it possible to find the closest vectors much faster, calculating the cosine distance only for those candidates for which it makes sense.

Topic Lookalikes

Vectors calculated for the selection of bloggers can also be used to compare bloggers with each other. In fact, lookalikes is the same selection of bloggers, but instead of a vector of input tags, the topic vector β of a user-specified blogger is supplied. The output is a list of bloggers whose topics are close to the topic of the given blogger, in order of relevance.

Fixed topics

In TopicTensor, as already mentioned, there are no explicitly defined topics. Nevertheless, correlating posts and bloggers with a fixed set of topics is necessary to simplify the search or to rank bloggers within individual topics. The problem arises of extracting fixed topics from the vector space of tags.

To solve this problem, unsupervised learning was chosen, both to avoid subjectivity in determining possible topics and to save resources, because viewing hundreds of thousands of tags (or even 10% of them) and assigning topics to them is a lot of manual work.

The most obvious way to extract topics is to cluster the vector representations of tags: one cluster = one topic. Clustering was carried out in two stages, because algorithms that can effectively search for clusters in 200D space do not yet exist.

At the first stage, dimension reduction was carried out using UMAP. This technique is in some sense an improved t-SNE (although based on completely different principles): it works faster and better preserves the original data topology. The dimension was reduced to 5D, cosine distance was used as the distance metric, and the remaining hyperparameters were selected according to the results of clustering (the second stage).

An example of clustering tags in 3D space.
Different clusters are marked with different colors (colors are not unique and can be repeated for different clusters).

At the second stage, the clustering was performed using the HDBSCAN algorithm. The results of clustering (in English only) can be seen on GitHub. Clustering has allocated about 500 topics (the parameters of UMAP and clustering can regulate the number of topics within wide limits), while the clusters captured 70-80% of tags. A visual check showed good coherence of topics and the absence of noticeable correlation between the clusters.

However, for practical use the clusters need to be refined: assembled into a tree, with useless ones removed (for example, a cluster of personal names, a cluster of negative emotions, a cluster of commonly used words), and some merged into a single topic.

TopicTensor V2.0: possible improvements

The main disadvantage of TopicTensor V1.0 is that its coverage is far from 100%. Not all bloggers use hashtags, and not all of those who do write something meaningful. There are three main ways to expand coverage:

1) Analyzing photo content. The theme of a blog is clearly defined by its photos (in fact, they set it), so a computer vision model trained to produce a thematic vector from a photo could partially replace the tags.
2) If we assume that bloggers with similar audiences should have similar topics, we can derive the topics of bloggers who do not use tags through audience lookalikes, provided there are bloggers with a similar audience who do use tags.
3) Analyzing the text content of the post.

What is the difference between fake likes and fair likes?

Fair likes are sent by people who really like a particular post. It resonates with them because their sphere of interests is close to a particular topic or to the blogger's personality. Fake likes are created by people who do not actually care about a particular post. How can we identify whether a person really likes a particular post topic?
If you analyze the kinds of likes he/she places, his/her subscriptions, etc., then using machine learning (AI) you can calculate, with good enough accuracy, the probability of his/her like under a particular post or of his/her subscription to a particular blogger.

If a like is placed by a real/fair account, its probability is high. If a like is created artificially from an account managed by Internet cheaters (fake followers), its probability is low. Thus, it is possible to detect, for example, cases when a young mother (who would usually like posts about household matters and raising children) suddenly likes various fishing equipment.

Accordingly, an account with a higher average probability of likes/subscriptions is more trustworthy than an account with a low probability.

This method of analyzing account quality is objective and representative, since the AI model is trained on data about existing subscriptions and likes, on samples from across Instagram, without any manually created heuristic rules.

Conclusions & suggestions

The development of influencer marketing is currently limited only by technical barriers. Developers have not yet provided marketers with effective tools for selecting relevant bloggers and identifying fake followers, so the risk of wasting an advertising budget is high. The TopicTensor model (used in the Prometeus network) claims to be the right solution for this problem. Test it for your benefit at the following addresses: http://tt-demo.suilin.ru/, https://demo.prometeus.io/