Building Secure Open Referral Systems: Harnessing HashMaps for Effective Fraud Prevention

by Philip Ireoluwa Okiokio (@kaizenthecreator), August 21st, 2023

Too Long; Didn't Read

Word-of-mouth promotion is a powerful strategy in customer acquisition and retention. Referral systems are key in this context, divided into "Open" and "Closed" types. Closed systems involve a controlled process where leads are verified before incentives are given. Open systems allow easier access but risk misuse. Engineering-wise, preventing gaming requires encoding, serialization, and data structures like hash maps. Incentive tokens are generated for each click, reducing chances of manipulation. Frontend involvement includes token storage and inclusion in requests. Additional backend measures like using device identifiers can enhance fraud prevention. This fraud prevention system requires a strong backend and frontend integration.

Word of mouth is a potent way to encourage people to engage in any activity. It is one of the most impactful forces for gaining, and losing, customers or users quickly.


Businesses aim to expand their customer base and reduce customer turnover. This goal has led to the exploration of different strategies. One prominent approach is advertising through various channels, both online and offline, which strongly influences people to consider the offered products or services. Another method I've come across is using incentives. This involves offering value to individuals, motivating them to engage in word-of-mouth promotion, and ultimately attracting new potential customers and users.


The term that sums up the marketing talk above is referral systems. Having built a few code-led referral marketing campaigns, I consider there to be two types of referral systems: “Open” and “Closed”.


A closed referral system

The term itself implies a sense of control, where information is shared and KYC (Know Your Customer) is performed for the newly obtained lead. Once the lead completes this process, the referrer receives the incentive. The closed system provides a way to prevent misuse and maintain control.

Open referral system

You can think of these as open season. Access is not throttled, and the conversion metric is mainly getting people to land on a page, in the hope that the page converts them or leads them to perform a task without being prompted.


Enough of my shallow marketing knowledge. Engineering-wise, the closed referral system seems to offer more control and suggests that gaming can be prevented or contained, although I believe the engineering piece below will harden a closed system as well.


Some engineering finally.


Engineering device tracking


Here, I will provide a user story; how you implement it is entirely up to you.


Open Referral Campaign - User Story

In a presale campaign, a product affiliate link is used to showcase the product to potential customers. The goal is to attract traffic, and each marketer who generates traffic becomes eligible for a publicly agreed incentive, as outlined in the Terms and Conditions.


In this case, we will be marketing Spotify and trying to increase acquired users in Nigeria.


I assume that we've implemented this code in our preferred programming languages. We have a prototype in place, and when server requests are made, we guide them toward the final objective while also maintaining our internal records.


However, we are aware of the possibility of misuse, where if someone uses the same device to make requests multiple times, the count increases accordingly.
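To make the weakness concrete, here is a minimal sketch of a counter that trusts every request. The names `referral_counts` and `naive_click`, and the link code used, are hypothetical stand-ins for whatever the prototype actually keeps in its records.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the campaign's internal records.
referral_counts = defaultdict(int)


def naive_click(link_code: str) -> int:
    """Count every request, with no check on which device made it."""
    referral_counts[link_code] += 1
    return referral_counts[link_code]


# The same device clicking three times inflates the count three times.
for _ in range(3):
    naive_click("my-spotify-link")
```

Nothing in this version distinguishes a new visitor from a repeat one, which is the gap the rest of the article tries to close.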


An illustration

So, let's assume I am trying to get people to use Spotify, such that if I reach a referral count of 100, my incentive is free Spotify Premium for a Family circle.


my current referred count with my Spotify link


Yes, it makes sense to use a closed system, where the new lead joins Spotify and only then does the referrer's count increase. But if we assume that it is an open referral campaign, I could click on my own link 92 times to reach the score I am interested in.


Case in point: my count increased to 9 because I clicked on the link.


my currently referred count increased by 1.


This poses a challenge: how can we prevent gaming?


I present to you encoding, serialization, and data structures. I used a hash map as the source of truth that keeps track of the data. It is adaptable to all cases, so let me explain my reasons for choosing a hash map.


FYI: a hash map is a dict in Python, an Object (or Map) in JavaScript, a map in Go, etc.



Time and Space Complexity for a HashMap


The time and space complexity of this data structure is the reason for my decision, along with the keys being unique and the values being extensible (malleable, since a value can itself be another data structure). The only thing that grows is the space the data occupies. Lookups take constant time on average, and so do inserts, which means I can look a key up and, if the lookup fails, insert it.
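The lookup-then-insert pattern can be sketched in a few lines of Python. The names `seen_links` and `first_visit` are hypothetical and only illustrate the pattern, not the original implementation.

```python
# Hypothetical record for one device: link_code -> destination URL.
seen_links = {}


def first_visit(link_code: str, url: str) -> bool:
    """Return True and record the link if it has not been seen before."""
    if link_code in seen_links:      # average-case O(1) lookup
        return False
    seen_links[link_code] = url      # average-case O(1) insert
    return True
```

Because keys are unique, a second call with the same `link_code` finds the existing entry and leaves the dict unchanged.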

Else Predication

This is from line 33 in the gist above. If gaming_data is empty, the device that clicked on the link is a unique visitor, so the click is considered legitimate and not gamed. We can increase the count score and also provide the end-goal URL if there is one; if not, the URL is a string of length 0, as specified on line 4.

If Predication

There are two cases possible here.


1. The data of interest is not in the dict, so we assign it in the dict.

2. The data exists in the dict, but the value of interest (link_code) does not exist in its value (which is also a dict). When this is the case, we add the data to the value. Conversely, when it does exist, we get the URL and return it.
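The branches described above could be sketched roughly as follows. This is not the original gist. It assumes the outer key of gaming_data identifies the link owner (a hypothetical `owner` argument) and that each value maps a link_code to its URL. It also returns an explicit `counted` flag rather than relying on the URL length, which is a substitution made here for readability.

```python
from dataclasses import dataclass


@dataclass
class ClickResult:
    url: str            # where to send the visitor
    counted: bool       # True only for a first click from this device
    gaming_data: dict   # updated record to encode and hand back to the client


def evaluate_click(gaming_data: dict, owner: str, link_code: str,
                   target_url: str) -> ClickResult:
    """Decide whether a click counts, based on the device's decoded gaming_data."""
    if not gaming_data:
        # Else branch: no record at all, so this is a unique visitor.
        return ClickResult(target_url, True, {owner: {link_code: target_url}})

    links = gaming_data.get(owner)
    if links is None:
        # Case 1: this owner is not in the dict yet, so assign it.
        gaming_data[owner] = {link_code: target_url}
        return ClickResult(target_url, True, gaming_data)

    if link_code not in links:
        # Case 2: the owner is known but this link_code is new, so add it.
        links[link_code] = target_url
        return ClickResult(target_url, True, gaming_data)

    # The link_code already exists: return the stored URL without counting.
    return ClickResult(links[link_code], False, gaming_data)
```

A first click from a device yields `counted=True`; feeding the returned `gaming_data` back in with the same owner and link_code yields `counted=False`.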


This function only manipulates data. Outside of it, there are a few things that can or should be done, such as updating the record in the DB when the URL length is > 0. Another way to express this is: if the URL length is 0, throw an HTTPException; if that condition does not trigger, update the record score and return the URL.
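A minimal sketch of that second formulation, assuming an in-memory `scores` dict in place of the DB and a hypothetical `ClickRejected` exception standing in for the framework's HTTPException:

```python
class ClickRejected(Exception):
    """Hypothetical stand-in for a web framework's HTTPException."""


def finalize_click(url: str, scores: dict, owner: str) -> str:
    """Update the owner's score only when a non-empty URL came back."""
    if len(url) == 0:
        # Nothing to redirect to, so the score is left untouched.
        raise ClickRejected("click not counted")
    scores[owner] = scores.get(owner, 0) + 1
    return url
```

Raising early keeps the score update on the single path where the URL is known to be non-empty.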


This implementation needs one more step to close it out. The first thing is to encode the dict to a string format. Here is an example below:


encoding the gaming data to a string and returning this data along with the URL



I used a package in Python called ItsDangerous to convert the gaming_data dict to a string that decodes back to the data structure of its input.
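To show the idea without a third-party dependency, the sketch below swaps ItsDangerous for the standard library: JSON for serialization, URL-safe Base64 for encoding, and an HMAC signature so that a modified token is rejected. The secret key and function names are hypothetical, and this is not the code from the original gist.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"change-me"  # hypothetical; load from configuration in practice


def encode_gaming_data(gaming_data: dict) -> str:
    """Serialize the dict and append an HMAC signature so tampering is detectable."""
    raw = json.dumps(gaming_data, sort_keys=True).encode()
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def decode_gaming_data(token: str) -> dict:
    """Return the original dict; raise ValueError if the token was altered."""
    if not token:
        return {}  # no token yet: treat the device as a first-time visitor
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("gaming token failed signature check")
    return json.loads(base64.urlsafe_b64decode(payload))
```

The round trip `decode_gaming_data(encode_gaming_data(d))` returns a dict equal to `d`, which is the property the article relies on. A signature only detects edits; a client that discards the token still looks like a new device.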


That concludes the backend's part of this solution. Our system expects this gaming token to come with each request from the client side to the backend. I will leave how you attach the data to your discretion; there are several options, such as query params or headers.


The frontend (client side) is the gatekeeper of this implementation. To explain what occurs, let me show you the response with this implementation in place.

request sent to my server via PostMan returns the URL and gaming-encoded data


In this response, we return the URL and a token to the client side, and the client side can store this token on the device of whoever clicks our link. When the frontend makes a request to the backend, it should check for this token and, if it exists, include it using whichever method we chose. With the request above, the count should have increased to 10.
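The client-side check can be sketched as follows. A real frontend would be browser code; Python is used here only to illustrate the logic. The header name `X-Gaming-Data` and the function name are assumptions, and how the token is stored on the device is left out.

```python
from typing import Optional
from urllib.request import Request

TOKEN_HEADER = "X-Gaming-Data"  # hypothetical header name


def build_click_request(link: str, stored_token: Optional[str]) -> Request:
    """Build the request for a referral link, attaching the stored token if any."""
    headers = {TOKEN_HEADER: stored_token} if stored_token else {}
    # Constructing a Request does not send anything over the network.
    return Request(link, headers=headers)
```

A device with no stored token sends a plain request and is treated as a new visitor; every later request carries the most recent token the backend returned.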


The count for the user is gotten and it increases to 10


Here is how I specified for the token to be sent back to my system.


Included the gaming_data as a header


If we check the score for the owner of the link it should freeze at 10 and not increase (fingers crossed).


The score is frozen because the same link is clicked with the same device


The count is frozen. One thing worth pointing out is that we always return a token, irrespective of the data in it, so we can continually rotate the token.


The above is how I architected a fraud prevention system that involves heavy lifting on the backend and syncing with the frontend.


There is a potential way to persist this data in the backend. Recently, I tried exploring user agents, but I learned that a user agent can be spoofed, so I am still looking for a fail-proof way to identify devices.


If that is possible, we can persist this token in a table and store the identifier in a device table, relating the two with foreign keys or whatever method you prefer.
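As a rough illustration of that layout, here is a sketch using SQLite from the Python standard library. The table names, column names, and the `save_token` helper are hypothetical, and the sketch assumes a trustworthy device identifier is already available, which remains an open problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE device (
    id INTEGER PRIMARY KEY,
    identifier TEXT NOT NULL UNIQUE   -- whatever device identifier is trusted
);
CREATE TABLE gaming_token (
    id INTEGER PRIMARY KEY,
    device_id INTEGER NOT NULL UNIQUE REFERENCES device(id),
    token TEXT NOT NULL
);
""")


def save_token(identifier: str, token: str) -> None:
    """Insert the device if needed, then store or rotate its token."""
    conn.execute("INSERT OR IGNORE INTO device (identifier) VALUES (?)", (identifier,))
    (device_id,) = conn.execute(
        "SELECT id FROM device WHERE identifier = ?", (identifier,)
    ).fetchone()
    # UNIQUE(device_id) means a new token replaces the previous one.
    conn.execute(
        "INSERT OR REPLACE INTO gaming_token (device_id, token) VALUES (?, ?)",
        (device_id, token),
    )
```

Keeping one token row per device matches the rotation described earlier: each new token overwrites the last one for that device.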


Layered on a closed referral system, this fraud system reduces the chances of gaming even further, and my suggestion above becomes easier to implement, since we already know the primary key of each user: the referrer and the new lead.


Till next time, peace!

