How HackerNoon's New Dynamic Sitemap Improves Story Distribution, by @fabian337

A developer in the green world of HN.

SEO stands for Search Engine Optimization. It helps your website become more discoverable to the public by making it rank higher in search engines such as Google. You can add SEO to your website in a dynamic or a static way, and either one will definitely level up your site. This is the story of how I added a new type of dynamic sitemap to HackerNoon’s publishing platform to better index our hundreds of thousands of site pages: https://hackernoon.com/sitemap.xml

So How Do Dynamic Sitemaps Drive SEO Performance?

To start getting a site indexed by search engines, all you need is a sitemap.xml file in the /public directory of your website’s source code. The sitemap lists the different pages of your website so that search engines can discover and validate the URLs. There are plenty of websites that will generate a sitemap for free; all you have to do is copy that file into the /public directory and deploy the site. This is called a static sitemap, since it only changes when you replace the file by hand.
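
To make the static case concrete, here is a minimal sketch of what such a file contains, built from a hard-coded list of pages. The helper names (`escapeXml`, `buildSitemapXml`) and the example.com URLs are placeholders for illustration, not HackerNoon’s code.

```javascript
// Escape the characters that cannot appear raw inside XML text.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap document (sitemaps.org format) from absolute URLs.
function buildSitemapXml(urls) {
  const entries = urls
    .map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</urlset>",
  ].join("\n");
}

// Placeholder pages; a real site would list its own URLs here.
const xml = buildSitemapXml([
  "https://example.com/",
  "https://example.com/about?ref=a&b=c",
]);
console.log(xml);
```

Saving that output as /public/sitemap.xml and deploying gives you a static sitemap; the file stays the same until someone regenerates it.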

At HackerNoon, we used to have what is called a dynamic sitemap per deployment. With this method, you add a script to the application source. The script generates an .xml file with all the corresponding pages, and the file is refreshed each time the application finishes deploying. This works fine when you deploy code changes often and new pages are not created at a high rate, because the sitemap keeps up with the content. It can, however, increase deployment time, since an additional script has to fetch all the data needed to produce the new .xml file.

Unfortunately (or fortunately, depending on how you look at it), HackerNoon’s demands for article distribution were a little too high for a dynamic sitemap per deployment to handle, since new pages are created all the time and in high volume. A new iteration was needed to fix the matter.

I did a lot of research on how to index a page the moment it’s created, but every resource suggested the “script method.” After some time, it occurred to me to combine dynamic routing with a dynamic sitemap. This turned out to be the solution, but there is a bit more to it…

Let Me Explain:

This method is really similar to the one above but has a more complex implementation, because it integrates a database and dynamic routing. It lets you create different sitemap pages dynamically and distribute every page the moment it’s created.

Firstly, I used a database to store all of HackerNoon’s pages using hooks. When a new article, profile, about page, tagged page, or company page gets created, a hook runs on the server and updates several things, including the sitemap database. It finds an available sitemap, then generates the URL and adds it to the database. The data is split across different routes/slugs, which is what allows it to be dynamic.
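
As a rough sketch of that hook logic, the example below uses a plain in-memory array in place of the real database. The function name `onPageCreated`, the slug format, the document shape, and the tiny limit of 3 URLs are all assumptions chosen to keep the example readable; the actual hook and schema are not shown in this story.

```javascript
// In-memory stand-in for the sitemap database (hypothetical shape):
// each entry is one sitemap with a slug and the URLs it lists.
const SITE_URL = "https://example.com"; // placeholder domain
const MAX_URLS_PER_SITEMAP = 3; // tiny limit so the rollover is easy to see

// Called whenever a new page (article, profile, tag page...) is created.
function onPageCreated(sitemaps, path) {
  // Find a sitemap that still has room, or start a new one.
  let target = sitemaps.find((s) => s.urls.length < MAX_URLS_PER_SITEMAP);
  if (!target) {
    target = { slug: `sitemap-${sitemaps.length + 1}`, urls: [] };
    sitemaps.push(target);
  }
  // Generate the absolute URL and record it under that sitemap.
  const loc = new URL(path, SITE_URL).href;
  target.urls.push({ loc, lastmod: new Date().toISOString() });
  return target.slug;
}

const sitemaps = [];
const slugs = ["/a", "/b", "/c", "/d"].map((p) => onPageCreated(sitemaps, p));
console.log(slugs); // the fourth page rolls over into a second sitemap
```

Because the slug is just data, a new sitemap route comes into existence as soon as the first URL is stored under a new slug, with no deployment involved.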

Secondly, thanks to Next.js and its getServerSideProps and getStaticProps data-fetching functions, I was able to create a new route, /sitemaps, and add dynamic routing with slugs: /sitemaps/[slug]. The slug is created in the database, and getServerSideProps then fetches the data that corresponds to the slug in the route. There is a package for Next.js (next-sitemap) that lets you render XML content on any route, so I used it to create a new page with the sitemap content. Once the sitemap content is rendered, search engines can pick it up.

Here is a code block of the implementation in the Next.js page for that route:

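import { getServerSideSitemap } from "next-sitemap";
// In next-sitemap v4+, this pages-directory helper is named getServerSideSitemapLegacy.
// `db` is the Firestore instance, initialized elsewhere in the app.

export async function getServerSideProps(ctx) {
  const { slug } = ctx.params; // slug of the requested /sitemaps/[slug] route
  // fetch every document stored under this slug (we are using Firebase)
  const sitemapPages = await db.collection("collection").where("slug", "==", slug).get();

  if (sitemapPages.empty) {
    return { notFound: true }; // unknown slug: respond with a 404 instead of returning nothing
  }

  let fields = [];
  sitemapPages.docs.forEach((doc) => { // loop through each document gathered
    const { pages } = doc.data(); // pages is stored as a JSON string
    if (pages && pages.length > 0) {
      fields = [...fields, ...JSON.parse(pages)]; // add the pages to fields
    }
  });
  return getServerSideSitemap(ctx, fields); // fields is rendered in XML format
}

// Next.js requires a default export for every page; the XML response is written above.
export default function Sitemap() {}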

The Challenge:

The main challenge with this implementation was the database setup. I didn’t even know if my approach was going to work, but it was worth a try since there were many tickets and tech support requests about it. I had to sketch a few database designs by hand and try each one to see which would work. I ended up splitting the URLs into sets of 5,000 per document, with documents sharing a unique slug until they add up to about 35,000 to 40,000 URLs. A larger payload could crash the request, which would prevent the URLs from being indexed. Additionally, Google limits each sitemap to 50,000 URLs, so I figured 40,000 URLs per sitemap would be good for HackerNoon.
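
The arithmetic behind that split can be sketched as follows. This is one reading of the scheme (8 documents of 5,000 URLs share a slug, then a new slug starts), and `partitionUrls`, the slug format, and the example.com URLs are hypothetical. The `pages` field is stored as a JSON string, matching the `JSON.parse(pages)` call in the route code above.

```javascript
const URLS_PER_DOCUMENT = 5000; // size of one database document
const URLS_PER_SITEMAP = 40000; // cap per slug, below the 50,000 limit

// Split a flat list of URL entries into documents of up to 5,000 entries.
// Consecutive documents share a slug until that slug holds 40,000 entries.
function partitionUrls(urls) {
  const docsPerSitemap = URLS_PER_SITEMAP / URLS_PER_DOCUMENT; // 8
  const documents = [];
  for (let i = 0; i < urls.length; i += URLS_PER_DOCUMENT) {
    const docIndex = documents.length;
    const sitemapNumber = Math.floor(docIndex / docsPerSitemap) + 1;
    documents.push({
      slug: `sitemap-${sitemapNumber}`, // hypothetical slug format
      pages: JSON.stringify(urls.slice(i, i + URLS_PER_DOCUMENT)),
    });
  }
  return documents;
}

// 85,000 placeholder URLs become 17 documents spread over 3 slugs.
const urls = Array.from({ length: 85000 }, (_, n) => ({
  loc: `https://example.com/p/${n}`,
}));
const documents = partitionUrls(urls);
const slugsUsed = new Set(documents.map((d) => d.slug));
console.log(documents.length, slugsUsed.size); // 17 3
```

Keeping each document at 5,000 URLs bounds the size of any single read, while the per-slug cap keeps every rendered sitemap under the search engine limit.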

Thanks to this implementation, we have 7 different sitemaps with about 40,000 pages each, and new ones are added automatically every day: https://hackernoon.com/sitemap.xml
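
One common way to tie several sitemaps together is a sitemap index file that points at each one. The sketch below assumes the root file works that way and links to the /sitemaps/[slug] routes described earlier; `buildSitemapIndexXml`, the slugs, and the example.com domain are placeholders, and the real root file may be generated differently.

```javascript
// Build a sitemap index (sitemaps.org format) that links to each
// per-slug sitemap served under /sitemaps/[slug].
function buildSitemapIndexXml(baseUrl, slugs) {
  const entries = slugs
    .map(
      (slug) =>
        `  <sitemap><loc>${baseUrl}/sitemaps/${encodeURIComponent(slug)}</loc></sitemap>`
    )
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    "</sitemapindex>",
  ].join("\n");
}

// Placeholder slugs; in practice these would come from the database.
const indexXml = buildSitemapIndexXml("https://example.com", [
  "sitemap-1",
  "sitemap-2",
]);
console.log(indexXml);
```

With an index like this, a search engine only needs the one root URL and can discover each new sitemap as its slug appears.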

Dynamic Sitemap Conclusion:

When I started working at HackerNoon, SEO was a topic that needed some improvement, and because we have so many pages, I implemented a dynamic sitemap with Next.js and Firebase. Every page that gets created on HackerNoon is now picked up by search engines without any extra work. My goal was to minimize manual labor (sitemaps per deployment) and implement a method that works by itself. Check out our dynamic sitemap: https://hackernoon.com/sitemap.xml
