Junior Data Scientist
So I decided to construct a Search Engine Optimization (SEO) handbook for beginners, with a focus on organic search results.
Why did I write it? Because I wanted to understand what all the hype around SEO is about. I kept notes for myself along the way, and finally decided to share them!
As Seth Stephens-Davidowitz has stated:
"Google searches are the most important dataset ever collected on the human psyche."
By that he meant that, in the context of Big Data, collecting and analyzing data derived from Google searches lets us analyze, and even predict, people's behavior, needs, trends, and motivations.
Definition: In simple terms, SEO consists of the steps and processes undertaken to increase your website's visibility and ranking in search engines. In other words, you can use SEO to appear at a high position in the Search Engine Result Pages (SERPs), so that your potential customers/stakeholders can find you.
Types: There are two main SEO types: Organic (natural) SEO, and Non-organic (paid / artificial) SEO.
Comparison: The main advantages of organic SEO are that you don't have to pay for ads and, at the same time, you attract relevant users, i.e. users who are genuinely searching for content, products, or services similar to yours. On the other hand, with paid search such as pay-per-click (PPC) advertising you can see instant results by attracting "ready-to-buy" users; however, it might not be a good long-term strategy.
A successful SEO strategy optimizes both for the search engines and for the surfers/consumers. “Search engines” refers to the technical part, which is closely related to “Information Retrieval” (IR). “Consumers” refers to the human perspective, which can be studied through Information Behaviour and Information Seeking theories. Specifically for the latter, we want to answer questions like “how do people start a search”, “how do users seek information, and how do they utilize it”, and finally, “what types of search engines require different solutions”.
There are lots of things to study around IR, and you can start by exploring the PageRank algorithm and Zipf's law, and by understanding the concepts of stopwords and stemming. Here, however, I would like to point out the terms “description” vs. “discrimination” of a document, and the two main IR evaluation metrics, precision and recall. Why is that important? Because they relate to your website’s relevance and authenticity, which in turn affect your search engine rankings and your position in the SERPs.
Relevance is determined by various factors such as:
A tricky concept about relevance in IR is that we want our document to have a good description of its content (description), but at the same time we also want that document to be distinguishable from other documents (discrimination). The problem here is that if we describe our document with 'common sense' (i.e. in a way that everybody would describe it), then we would probably not achieve satisfactory document discrimination, because this is how everybody has described similar documents as well!
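One way to make the description vs. discrimination trade-off concrete is inverse document frequency (IDF), a standard IR weighting that is brought in here purely as an illustration and is not part of the original notes. The sketch below assumes a tiny set of made-up page descriptions; the function name and the sample texts are hypothetical.

```python
import math


def inverse_document_frequency(term, documents):
    """Return log(N / df): higher when fewer documents contain the term."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    if containing == 0:
        return 0.0  # term appears nowhere, so there is nothing to weight
    return math.log(len(documents) / containing)


# Hypothetical page descriptions, made up for illustration only.
pages = [
    "data science tutorial for beginners",
    "data science course online",
    "data science salary survey",
]

# "data" describes every page, so it cannot tell them apart.
common = inverse_document_frequency("data", pages)
# "salary" appears on one page only, so it discriminates that page.
distinctive = inverse_document_frequency("salary", pages)
```

A term that everybody uses scores zero here, while a term used by a single page scores highest, which mirrors the point about 'common sense' descriptions.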
Authenticity in the context of SEO is known as Domain Authority. Essentially, it is a measure of how authoritative your domain is. Contributing factors include:
Precision in the context of IR:
It tells us how useful the results are (effectiveness in terms of the given results).
Recall in the context of IR:
It tells us how complete the results are (completeness in terms of the given results).
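Both metrics can be computed from a set of returned results and a set of documents that are actually relevant. The following is a minimal sketch under that assumption; the function name and the sample page identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query from two collections of ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four pages returned, three pages truly relevant.
retrieved_pages = ["a", "b", "c", "d"]
relevant_pages = ["a", "b", "e"]
precision, recall = precision_recall(retrieved_pages, relevant_pages)
```

In this example two of the four returned pages are relevant, and two of the three relevant pages were found, so the first number reflects usefulness and the second completeness.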
Ok, let’s now dive into SEO in more detail. We initially distinguished the main SEO types, and in the following part I will focus on organic SEO. It is important to first understand the ranking factors for search results. Some of them are depicted below, ordered hierarchically by importance:
During the implementation of an organic SEO strategy, you will most likely find yourself focusing on two things: keywords and content.
The main themes here are: keyword attributes, keyword research, and keyword distribution.
1. Keyword Attributes
2. Keyword Research
Here you try to extract insights about your website (e.g., with Google Search Console), and you want to discover search volume metrics by answering questions such as “What is the current state of demand for my particular keywords?”. It might also help your research to categorize your keywords by clustering them into their main topics.
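Keyword categorization can start very simply. The sketch below uses a rule-based grouping (not a statistical clustering algorithm) that assigns each keyword to every topic whose seed terms it contains; the topic names, seed terms, and keywords are all hypothetical.

```python
from collections import defaultdict


def group_keywords_by_topic(keywords, topic_terms):
    """Assign each keyword to all topics that share at least one word with it."""
    groups = defaultdict(list)
    for keyword in keywords:
        words = set(keyword.lower().split())
        matched = [topic for topic, terms in topic_terms.items() if words & terms]
        # Keywords matching no topic are kept under "uncategorized" for review.
        for topic in matched or ["uncategorized"]:
            groups[topic].append(keyword)
    return dict(groups)


# Hypothetical topics and seed terms, chosen for illustration only.
topic_terms = {
    "careers": {"salary", "jobs", "career"},
    "learning": {"course", "tutorial", "bootcamp"},
}
keywords = [
    "data science salary",
    "data science course",
    "data science jobs",
    "what is data science",
]
clusters = group_keywords_by_topic(keywords, topic_terms)
```

Keeping an "uncategorized" bucket makes it easy to spot keywords that may deserve a topic of their own.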
3. Keyword Distribution
This is the process of assigning and distributing your specific keywords across your website's pages.
Let’s have a look at an example with the query “Data Science”. The following image is derived from https://answerthepublic.com/, which can produce four kinds of insights: “questions”, “prepositions”, “comparisons”, and “related”. The image below shows the “questions” category. The greener the dot, the higher the search volume for those queries.
Another thing you can do is compare multiple queries with each other. Below you can see the comparison between the terms “Data Science”, “Machine Learning”, and “Artificial Intelligence” generated by Google Trends. You can easily see that, unlike Data Science, AI was more popular at the beginning of the timeframe (the year 2004), whereas in recent years the popularity of AI has plummeted compared to “Data Science” (2nd) and “Machine Learning” (1st).
Nevertheless, you should be careful not to jump to conclusions! I couldn’t believe that the search interest in AI had dropped, especially in recent years. There are lots of parameters you can tweak in the Google Trends platform, such as location, timeframe, and search type (web, images, news, Google Shopping). In the end, I found out that if you replace “Artificial Intelligence” with “AI”, “AI” has always been in first place!
Content is everywhere, and you can optimize it with both On-Page and Off-Page SEO.
1. Improve your "src" and "alt" HTML attributes
1. Construct HTML and XML sitemap
1. Descriptive but short, and as concise as possible
2. Fix your redirect issues (use 301 and 302 redirects appropriately)
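For the XML sitemap mentioned in the list above, here is a minimal sketch of how one could be generated with Python's standard library. The namespace is the one defined by the sitemaps.org protocol; the function name and the example URLs are hypothetical, and a real sitemap would usually carry more fields (for example a last-modified date).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/seo-handbook",
])
```

The resulting string can be saved as a file such as sitemap.xml and submitted to the search engines; prepend an XML declaration when writing it out if your tooling expects one.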
I hope your website gets search engine optimized!