The Evolution of UX Personas: From Qualitative to Data-Driven Profiles

by Akis Loumpourdis, July 29th, 2021


A persona in the context of User Experience (UX) is an abstraction of a person that is used to describe a real group of people and put a human face on it. It is a very popular and powerful tool for UX designers and marketing specialists. The concept of a persona has been around for many years and is considered a well-established strategy. Its use is not limited to a brand's customers (current or potential). The application domains include e-health, cybersecurity, video games, software development, and more (Salminen et al., 2021). The more representative your defined personas are, the better they will help you capture your customers' needs, behaviors, and goals.

Traditional approach

The most common approach when defining personas is to build them from qualitative data collected through interviews, focus groups, etc. A typical persona would consist of information about the age, occupation, marital status, area of residence, life goals, motivations, and frustrations of the person it represents. Based on this raw information, the next step would be to make some assumptions and flesh out possible traits of the person (Fig. 1).

Fig. 1. A flat persona profile (Jansen et al., 2020)

This is a good start, but it is not enough. While it can definitely provide some insights, this type of persona doesn't say much about the actual person. It fails to capture important underlying patterns of behavior and, more often than not, ends up being just a file stored on a hard drive. Remember, the business's ultimate goal is to offer the best possible unique experience to its customers.

Data-driven personas follow a different logic from the approach described above. Instead of a basic profile built on hypotheses, real data is used in a more holistic way to create a more robust solution. The result is essentially a more interactive decision-making tool (Jansen et al., 2020) that is much easier for the various teams involved to interpret.

The problems with the traditional approach

The paradigm shift in personas, from the traditional flat profile to the data-driven one, occurred naturally (albeit gradually), mainly due to the clear advantages of the latter.

It can be a long and expensive process

Manually collecting data and then creating personas can be very time-consuming, as a lot of manual labor is involved. This goes hand in hand with inflated costs. Creating data-driven personas, on the other hand, can be much quicker. If the available infrastructure allows it and the correct system is in place, the process can be partially or fully automated.

It is static

The behavioral patterns of a customer can change over time and the old ones can become irrelevant or even misleading. Data-driven personas can be updated with real-time figures and capture these changes as they happen.

They may not be representative of your audience

Usually, the traditional approach is based on a small sample. In contrast, data-driven personas are the result of analyzing huge amounts of data, hence they tend to be more representative.


Data-driven personas are based on real data which is then used to generate segments with the help of algorithms. This increases the accuracy and facilitates the identification of some patterns that would otherwise be impossible to decode. With the help of unsupervised machine learning techniques, the clustering or grouping can sometimes lead to more meaningful conclusions compared to the grouping based on user-defined thresholds.
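To make the idea of algorithmic segments concrete, here is a minimal k-means sketch in plain Python. It is an illustration only: the two features (sessions per week and average minutes per session), the sample values, and the function name `kmeans` are assumptions rather than anything taken from the cited work. A real project would more likely rely on an established library and scale the features first.

```python
import random
from math import dist

# Hypothetical behavioral features per user:
# (sessions per week, average minutes per session).
users = [(1, 3), (2, 4), (1, 5), (9, 30), (10, 28), (8, 32)]

def kmeans(points, k, max_iterations=100, seed=0):
    """Plain k-means; returns (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Start from k observed users picked at random (a simple, common choice).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each user joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

centroids, labels = kmeans(users, k=2)
for user, label in zip(users, labels):
    print(user, "-> segment", label)
```

No threshold such as "more than five sessions per week" is defined anywhere; the split between light and heavy users comes from the data itself. Choosing `k`, on the other hand, remains a judgment call.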

Data sources

So we are talking about large amounts of data here, but where should we collect it from? To depict a persona accurately, we need to consider a wide spectrum of sources, such as:

Online analytics

Online analytics can provide invaluable information on how visitors interact with your website. Some examples are how much time they spend on the website, the journey they take, the products or articles they view, their device(s), etc.

Internal CRM

Having an up-to-date CRM with your audience data can also be a great asset. The information held in a CRM can be anything from simple demographics to how long people have been your customers or what their total spending is.

Social media

I consider social media to be a gold mine of data. With APIs and tools available to extract public data and perform sentiment analysis, a company can gain a solid understanding of how its audience feels about its services and products, as well as about its competitors.
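Before any algorithm can run, the sources above have to be brought together into one record per user. The following is a minimal sketch of that step. Every field name, the sample rows, and the `build_profiles` helper are hypothetical, and it assumes that all three sources share a `user_id`, which is rarely true for social media data without extra matching work.

```python
from collections import defaultdict

# Hypothetical exports from each source (all field names are assumptions).
analytics_rows = [
    {"user_id": "u1", "minutes_on_site": 12.5, "device": "mobile"},
    {"user_id": "u1", "minutes_on_site": 7.5, "device": "desktop"},
    {"user_id": "u2", "minutes_on_site": 3.0, "device": "mobile"},
]
crm_rows = [
    {"user_id": "u1", "age": 34, "total_spend": 420.0},
    {"user_id": "u2", "age": 27, "total_spend": 80.0},
]
social_rows = [
    {"user_id": "u1", "sentiment": 0.6},  # e.g. a score from a sentiment tool
    {"user_id": "u1", "sentiment": 0.2},
]

def build_profiles(analytics, crm, social):
    """Merge the three sources into one flat record per user."""
    profiles = defaultdict(lambda: {
        "sessions": 0, "minutes_on_site": 0.0, "devices": set(),
        "age": None, "total_spend": None, "avg_sentiment": None,
    })
    # Online analytics: aggregate one row per session.
    for row in analytics:
        profile = profiles[row["user_id"]]
        profile["sessions"] += 1
        profile["minutes_on_site"] += row["minutes_on_site"]
        profile["devices"].add(row["device"])
    # CRM: one row per customer, copied as-is.
    for row in crm:
        profile = profiles[row["user_id"]]
        profile["age"] = row["age"]
        profile["total_spend"] = row["total_spend"]
    # Social media: average the sentiment scores available for each user.
    scores = defaultdict(list)
    for row in social:
        scores[row["user_id"]].append(row["sentiment"])
    for user_id, values in scores.items():
        profiles[user_id]["avg_sentiment"] = sum(values) / len(values)
    return dict(profiles)

profiles = build_profiles(analytics_rows, crm_rows, social_rows)
for user_id, profile in sorted(profiles.items()):
    print(user_id, profile)
```

Users with no data in a given source keep `None` for those fields, so the gap stays visible instead of being silently filled with a default.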


In their article A Survey of 15 Years of Data-Driven Persona Development, Salminen et al. (2021) summarised the most popular algorithms used for data-driven persona development (DDPD) in previously published work:

  • K-means clustering
  • Non-negative matrix factorization
  • Hierarchical clustering
  • Latent semantic analysis
  • Principal component analysis
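Whichever of these algorithms produces the segments, each segment still has to be turned into something that reads like a persona. The sketch below shows one simple way to do that, by describing every segment with its central values. It is not the method of any of the surveyed papers. The `segment` labels are assumed to come from an earlier clustering step, and the field names, sample users, and `summarise_segments` helper are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, median

# Hypothetical users, already labelled with a segment by a clustering step.
users = [
    {"segment": 0, "age": 24, "device": "mobile", "monthly_spend": 20.0},
    {"segment": 0, "age": 29, "device": "mobile", "monthly_spend": 35.0},
    {"segment": 0, "age": 31, "device": "desktop", "monthly_spend": 25.0},
    {"segment": 1, "age": 52, "device": "desktop", "monthly_spend": 140.0},
    {"segment": 1, "age": 47, "device": "desktop", "monthly_spend": 160.0},
]

def summarise_segments(rows):
    """Describe each segment by its central demographic and behavioral values."""
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row["segment"]].append(row)

    personas = {}
    for segment, members in by_segment.items():
        devices = Counter(m["device"] for m in members)
        personas[segment] = {
            # How much of the audience this persona stands for.
            "share_of_users": len(members) / len(rows),
            "median_age": median(m["age"] for m in members),
            "typical_device": devices.most_common(1)[0][0],
            "mean_monthly_spend": mean(m["monthly_spend"] for m in members),
        }
    return personas

personas = summarise_segments(users)
for segment, persona in sorted(personas.items()):
    print(segment, persona)
```

A name, a photo, and a short narrative would then be layered on top of such a summary to put a human face on the numbers.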

The present and the future

I am a big advocate of the idea that tech practitioners should periodically check the latest research on their niche sector. There are teams of very smart people all around the world probably working on the next state-of-the-art tool or framework that can potentially enhance your work in the future.

It is a good habit to read what has been published, even if it is just the titles, and get a grasp of where research is heading. As an example, I stumbled across the work of a team of persona researchers who developed a promising system called Automatic Persona Generation (APG). Here is a list of their research work if you are interested.

According to them:

APG is a tool for automatically turning your user/customer data into personas. We call this "giving faces to data".

APG currently works with YouTube Analytics, Google Analytics, Facebook Ads, Facebook Insights, and in-house CRM data.

The system retrieves data from these sources and automatically generates user personas that represent central behavioral and demographic patterns.

APG uses several data science algorithms for this purpose.

A sample of an automatically generated persona can be seen in Fig. 2 (Jansen et al., 2020):

Fig. 2. Example of the listing of APG personas (screen left) and a displayed data-driven persona profile

That’s it for today, thanks for reading!


Jansen, B. J., Salminen, J. O., & Jung, S. G. (2020). Data-Driven Personas for Enhanced User Understanding: Combining Empathy with Rationality for Better Insights to Analytics. Data and Information Management, 4(1), 1–17.

Salminen, J., Jung, S. G., & Jansen, B. J. (2019). The future of data-driven personas: A marriage of online analytics numbers and human attributes. In J. Filipe, A. Brodsky, S. Hammoudi, & M. Smialek (Eds.), ICEIS 2019 - Proceedings of the 21st International Conference on Enterprise Information Systems (Vol. 1, pp. 596-603). SciTePress.

Salminen, J., Guan, K., Jung, S. G., & Jansen, B. J. (2021). A Survey of 15 Years of Data-Driven Persona Development. International Journal of Human–Computer Interaction, 1–24.