GraphQL and Relay: what are they and why do they matter?

Written by pierrerognion | Published 2019/05/05
Tech Story Tags: facebook | graphql | react | relay


At the F8 event, Facebook announced a complete overhaul of the Facebook app. The new app is not just a redesign: it’s powered by the latest technology and built for scalability in order to deliver a better user experience.

Facebook has over 2 billion monthly active users. Just think about it: 2 BILLION 😳. But Facebook has issues beyond privacy and misinformation. The current Facebook is built on legacy infrastructure, and we have to admit that it’s a bit cumbersome, both in terms of UI and in how the website is built. For instance, the soon-to-be-old Facebook homepage has an uncompressed CSS file size of 2.4 Mb. If you want something to compare it with, it’s a bit like a bad WordPress theme 🙈

A fresh start with React, GraphQL, and Relay

Facebook needed a fresh start in order to get rid of the legacy infrastructure. And that’s how they started to work on technologies such as React, GraphQL, and Relay.

The brand new Facebook design presented at F8 2019 is a single-page app powered by React, GraphQL, and Relay. At the time of writing, React has 128,393 stars on GitHub, so it has a huge community, and many websites are built with it. I won’t comment on React much here, because the big announcements at F8 mostly concerned GraphQL and Relay, which are used with React to simplify data fetching. Data fetching is what happens, for example, when you (the user) open the Facebook homepage and your timeline loads. But what are GraphQL and Relay?

What is GraphQL?

**GraphQL is a declarative query language used for requesting data from a server.** It is agnostic to how you store your data in the backend and provides a unified, type-safe layer over all your server data. It is designed to be easily queried by multiple clients over time.

GraphQL is used to power the new Facebook.com. Have you wondered why my website loads so fast? It’s because I use Gatsby. After a clarification on Twitter by @endiliey, it appears that Gatsby uses GraphQL only at build time, where it acts as a data management layer. What makes the website fast is that Gatsby performs various optimizations behind the scenes, such as prefetching, pending navigation, image optimization, and so on.

By the way, if you want to test a website’s performance you can do this using web.dev. It’s a great (and free!) tool provided by Google. Here is my latest audit report:

OK, now back to GraphQL. Here is an example of the syntax, taken from the query included at the end of my blog post pages:

export const pageQuery = graphql`
  query BlogPostBySlug($slug: String!) {
    site {
      siteMetadata {
        title
        author
      }
    }
    markdownRemark(fields: { slug: { eq: $slug } }) {
      id
      html
      timeToRead
      frontmatter {
        title
        date(formatString: "MMMM DD, YYYY")
        spoiler
      }
      fields {
        slug
        langKey
      }
    }
  }
`;

GraphQL is essentially a query language for your API. GraphQL “provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools (…) While typical REST APIs require loading from multiple URLs, GraphQL APIs get all the data your app needs in a single request. Apps using GraphQL can be quick even on slow mobile network connections.” And what about Relay?
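To make the “single request” idea a bit more concrete, here is a minimal sketch of how a client could package one query into one HTTP request. The helper name `buildGraphQLRequest`, the `/graphql` endpoint, and the example slug are my own assumptions for illustration. Sending a JSON body with `query` and `variables` keys is the common convention for GraphQL over HTTP, but check what your server expects.

```javascript
// Hypothetical helper: packages a GraphQL query and its variables into the
// options object that fetch() expects. The name is illustrative only.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Most GraphQL servers accept a JSON body with `query` and `variables`.
    body: JSON.stringify({ query, variables }),
  };
}

// One query asks for the site title and a post in a single round trip.
// Field names are borrowed from the Gatsby query shown earlier.
const query = `
  query BlogPostBySlug($slug: String!) {
    site { siteMetadata { title } }
    markdownRemark(fields: { slug: { eq: $slug } }) { html }
  }
`;

const request = buildGraphQLRequest(query, { slug: "/hello-world/" });

// Usage (assumes a server listening at the hypothetical /graphql endpoint):
// fetch("/graphql", request).then((res) => res.json()).then(console.log);
```

With a REST API, the same page would typically need one call for the site metadata and another for the post.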

What is Relay?

Relay is a JavaScript framework developed by Facebook for fetching and managing GraphQL data in React applications. It runs in the browser as the GraphQL client, and it is built with scalability in mind in order to power complex applications like Facebook. The ultimate goal of GraphQL and Relay is to deliver UI interactions that respond instantly.

A Facebook page contains several elements and React components, which have different data requirements. GraphQL lets us list all the data we need to generate a page in a single top-level query. Instead of sending multiple requests over the network, we get the data we need in a single round trip.

But what if we want to use a component (for example a Facebook post) on several pages? We need to make sure that the queries are updated on all the pages where the component appears, otherwise we could end up over-fetching or under-fetching data. This is a maintainability issue, and that’s where Relay comes in. With Relay, components and their data requirements are no longer separate: each component declares the data it needs inside the component itself.

Here is an example for a Facebook post component:

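import React from 'react';
import { graphql, useFragment } from 'react-relay';

function Post(props) {

	// The fragment declares the fields this component needs.
	// props.post is assumed to be the fragment reference passed by the parent.
	const data = useFragment(graphql`
		fragment Post_fragment on Post {
			title
			body
		}
	`, props.post);

	// Title and Body are placeholder components defined elsewhere.
	return (
		<>
			<Title>{data.title}</Title>
			<Body body={data.body} />
		</>
	);
}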

Relay helps to make sure that a component has the data that it needs when it renders. So we know what data are needed for any given page, even if the data requirements change over time. In the new Facebook, Relay is used everywhere.
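To illustrate how a page can stay in sync with its components, here is a hand-rolled sketch of the idea: each component owns a fragment, and the page query only spreads those fragments. In a real Relay project the Relay compiler assembles this at build time; the `buildPageQuery` helper, the `TimelinePageQuery` name, the `Author_fragment` fragment, and the `post`/`author` fields are all assumptions I made up for the example.

```javascript
// Each (hypothetical) component module exports the fragment it depends on.
const PostFragment = `
  fragment Post_fragment on Post {
    title
    body
  }
`;

const AuthorFragment = `
  fragment Author_fragment on User {
    name
  }
`;

// The page query only spreads the fragments; it never lists a child's fields.
function buildPageQuery(fragments) {
  const pageQuery = `
  query TimelinePageQuery($id: ID!) {
    post(id: $id) {
      ...Post_fragment
      author {
        ...Author_fragment
      }
    }
  }
`;
  // Append the fragment definitions so the document is self-contained.
  return [pageQuery, ...fragments].join("\n");
}

const documentText = buildPageQuery([PostFragment, AuthorFragment]);
```

If the Post component later needs a new field, only `Post_fragment` changes, and every page that spreads it picks up the change.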

Pre-fetching and parallelization of work with GraphQL and Relay

With traditional fetching, we:

  1. Download the code
  2. Render the code while fetching data
  3. Render the final content

GraphQL and Relay allow the parallelization of work with pre-fetching, and this allows faster loading times. Basically, we:

  1. Download the code and… fetch data at the same time
  2. Render the final content
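The difference between the two flows above can be sketched with plain promises. This is only a toy model: `loadCode` and `fetchData` stand in for real network calls, and `makeFakeLoader` is a hypothetical helper that records when each call starts and ends.

```javascript
// Traditional flow: wait for the code, then start fetching data.
async function loadSequentially(loadCode, fetchData) {
  const code = await loadCode();
  const data = await fetchData();
  return { code, data };
}

// Pre-fetching flow: start both requests immediately, then wait for both.
async function loadInParallel(loadCode, fetchData) {
  const [code, data] = await Promise.all([loadCode(), fetchData()]);
  return { code, data };
}

// Stand-in for a real network call; it records when it starts and ends.
function makeFakeLoader(name, log) {
  return async () => {
    log.push(`start:${name}`);
    await Promise.resolve(); // pretend to wait on the network
    log.push(`end:${name}`);
    return name;
  };
}
```

In the sequential version the data request cannot start before the code has arrived. In the parallel version both are in flight at once, so the total wait is roughly the slower of the two rather than their sum.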

Deferred queries with Relay

Relay also allows deferred queries. This is particularly useful for rendering a page like the Facebook news feed, which contains a long list of posts: we can set priorities by marking some data as critical and the rest as deferrable.

Here is an example of deferred queries with Relay:

fragment HomepageData on User {
	name # Most critical data
	first_post { ... } # Most critical data

	...AdditionalData @defer # Less critical data

}

By doing so, the server can deliver data as soon as possible in separate payloads, without waiting for the whole query to complete. So hopefully it will be possible to show content sooner.
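Here is a toy simulation of what “separate payloads” means for the client. It is not Relay’s wire format: the `{ path, data }` payload shape, the `streamHomepage` and `consume` functions, and the sample values are all assumptions made for the sketch.

```javascript
// Toy server: yields the critical payload first, then the deferred one.
function* streamHomepage() {
  yield { path: [], data: { name: "Ada", first_post: { title: "Hello" } } };
  yield { path: ["additional"], data: { friends_count: 42 } };
}

// Toy client: merges each payload into the result and renders right away.
function consume(payloads, render) {
  const result = {};
  for (const { path, data } of payloads) {
    // Walk to the object that this payload should be merged into.
    let target = result;
    for (const key of path) {
      if (!target[key]) target[key] = {};
      target = target[key];
    }
    Object.assign(target, data);
    render(JSON.parse(JSON.stringify(result))); // snapshot for this render
  }
  return result;
}

const renders = [];
const finalResult = consume(streamHomepage(), (snapshot) => renders.push(snapshot));
```

The first render already has the name and the first post. The less critical data shows up in the second render without having blocked the first.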

Scheduling delivery of data with data-driven code-splitting

There are many types of Facebook posts. They may contain text, pictures, videos, and so on. Components need to know when to render which variation of a post. Generally, we tend to download unnecessary resources that we end up not using. It’s a waste of time for the end user… (and a waste of money for Facebook 🙊 ).

To avoid downloading all the resources upfront, we need a strategy for downloading resources only when they are needed.

With the traditional lazy fetching, you download the initial resources, and then download the additional resources needed in separate requests. But this can dramatically hurt page loading performance, and thus the overall user experience.

For example, if we use lazy fetching and the first post on the Facebook timeline is supposed to be a video, the additional resources needed to render the video will be loaded at the end.

GraphQL helps because it lets us download exactly the resources we need, thanks to queries on matched types. How? By modelling the different variations of the UI and describing the types a post can be.

Here is an example of queries on matched types:

... on Post {

	... on PhotoPost {
		photo_data # Request photo data if the post is a photo post
	}

	... on VideoPost {
		video_data # Request video data if the post is a video post
	}

	... on SongPost {
		song_data # Request song data if the post is a song post
	}

}

But there is one issue here as we still need to wait for the moment we start rendering before we fetch the extra code that will load photos, videos, or songs in our example… Duh! And that’s where Relay comes in.

Relay adds data-driven code-splitting. This means that we can specify which component code we will need in order to render the data that matches a specific type. Here is an example:

... on Post {

	... on PhotoPost {
		@module('PhotoComponent.js') # Download the photo component code if the post is a photo
		photo_data # Request photo data if the post is a photo
	}

	... on VideoPost {
		@module('VideoComponent.js') # Download the video component code if the post is a video
		video_data # Request video data if the post is a video
	}

	... on SongPost {
		@module('SongComponent.js') # Download the song component code if the post is a song
		song_data # Request song data if the post is a song
	}

}

Thus, we can render the data much quicker. Let’s have a look at an additional example of code splitting with A/B experiments:

Here is the old way to do an A/B Experiment:

function MyComponent(props) {
	...
	if (InExperiment('AB')) // We determine that we are doing an A/B experiment in the middle of the code
		import('Feature'); 
	...
}

And here is the new way to do an A/B Experiment:

const Feature = importCond('AB', 'Feature'); // We do a quick check at the beginning of the request to see what the user needs
function MyComponent(props) {
	...
	if (Feature)
		Feature.use();
	...
}
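Coming back to data-driven code-splitting for post types, here is a rough client-side approximation of the idea. It is not how Relay implements `@module`; it only shows the principle of letting the data decide which component code to load. The `componentLoaders` registry and the `preloadComponentsFor` function are hypothetical, and a real app would call dynamic `import()` where I return a resolved promise.

```javascript
// Hypothetical registry: maps a post type to a function that loads its
// component code. A real app would use dynamic import() here.
const componentLoaders = {
  PhotoPost: () => Promise.resolve({ name: "PhotoComponent" }),
  VideoPost: () => Promise.resolve({ name: "VideoComponent" }),
  SongPost: () => Promise.resolve({ name: "SongComponent" }),
};

// Looks at the data that came back and starts loading only the component
// code that those posts need, once per type.
function preloadComponentsFor(posts, loaders) {
  const started = new Map();
  for (const post of posts) {
    const type = post.__typename;
    if (loaders[type] && !started.has(type)) {
      started.set(type, loaders[type]());
    }
  }
  return started; // Map of type name -> in-flight promise
}

// Sample timeline: two videos and a photo, no songs.
const timeline = [
  { __typename: "VideoPost", video_data: "..." },
  { __typename: "PhotoPost", photo_data: "..." },
  { __typename: "VideoPost", video_data: "..." },
];
const inFlight = preloadComponentsFor(timeline, componentLoaders);
```

Since the timeline contains no song post, the song component code is never requested.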

Relay local cache

**Relay also keeps a local memory cache of the data that we have fetched so far.** So when we need to fetch data from the server, if we have data that is already stored locally, we can reuse it to instantly give feedback to the user when they perform an action and render the page. As more data is fetched from the server, we can show more content.
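The principle can be sketched with a very small cache keyed by record ID. This is illustrative only: Relay’s real store is normalized and much more sophisticated, and the `LocalCache` class and its `peek` and `get` methods are names I invented for the example.

```javascript
// Minimal in-memory cache keyed by record ID (illustrative, not Relay's store).
class LocalCache {
  constructor(fetchFromServer) {
    this.records = new Map();
    this.fetchFromServer = fetchFromServer;
  }

  // Returns whatever is cached right away (possibly undefined), so the UI
  // can render something immediately.
  peek(id) {
    return this.records.get(id);
  }

  // Returns cached data when available; otherwise asks the server and
  // stores the answer for next time.
  async get(id) {
    if (this.records.has(id)) return this.records.get(id);
    const record = await this.fetchFromServer(id);
    this.records.set(id, record);
    return record;
  }
}
```

The first `get("post:1")` goes to the server. Any later call for the same ID is answered from memory, which is what makes the UI feel instant.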

Server-Side Rendering

Facebook is experimenting with various optimizations on the server side in order to load pages faster. Code size is analyzed for each page in order to detect opportunities for code-size improvements. Regarding this topic, Facebook showed a tool called JS Graph Explorer. I understand the importance of having such a tool when you scale, but right now I am more interested in GraphQL and Relay 🙃.

CSS and Atomic Stylesheets

The old Facebook site was sending too much CSS. Over the years, it became really heavy. Remember, an uncompressed size of 2.4 Mb for the homepage 🙈.

This is what CSS looks like on most websites and on the old Facebook:

<Component1 className="class1"/>
<Component1 className="class2"/>

.class1 {
	background-color: var(--fds-active-icon);
	cursor: default;
	margin-left: 0px;
}
.class2 {
	background-color: var(--fds-gray-25);
	cursor: default;
	justify-self: flex-start;
	margin-left: 0px;
}

The problem in the snippets above is that rules are duplicated throughout the stylesheets, and that means wasted bytes. Instead, the new Facebook will generate atomic stylesheets, which means that each rule will be defined only once. Here is an example:

<Component1 className="classA classB classD"/>
<Component1 className="classA classC classD classE"/>

.classA { cursor: default; }
.classB { background-color: var(--fds-active-icon); }
.classC { background-color: var(--fds-gray-25); }
.classD { margin-left: 0px; }
.classE { justify-self: flex-start; }

Consequently, no matter how many components and stylesheets we have on a website, the amount of CSS will plateau at a certain level.
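To see why the stylesheet plateaus, here is a small sketch of how atomic classes could be generated. I don’t know how Facebook’s tooling actually works, so treat `buildAtomicStyles` and the generated class names (`a0`, `a1`, …) as assumptions. The styles are the ones from the two rules shown earlier.

```javascript
// Turns per-component style objects into atomic classes: one class per
// unique "property: value" declaration, shared by every component using it.
function buildAtomicStyles(components) {
  const classByDeclaration = new Map(); // "prop: value" -> generated class
  const classNames = {};

  for (const [componentName, styles] of Object.entries(components)) {
    classNames[componentName] = Object.entries(styles).map(([prop, value]) => {
      const declaration = `${prop}: ${value}`;
      if (!classByDeclaration.has(declaration)) {
        // Generated names (a0, a1, ...) are arbitrary.
        classByDeclaration.set(declaration, `a${classByDeclaration.size}`);
      }
      return classByDeclaration.get(declaration);
    });
  }

  const css = [...classByDeclaration]
    .map(([declaration, cls]) => `.${cls} { ${declaration}; }`)
    .join("\n");
  return { classNames, css };
}

const { classNames, css } = buildAtomicStyles({
  component1: {
    "background-color": "var(--fds-active-icon)",
    cursor: "default",
    "margin-left": "0px",
  },
  component2: {
    "background-color": "var(--fds-gray-25)",
    cursor: "default",
    "justify-self": "flex-start",
    "margin-left": "0px",
  },
});
```

The two components declare seven properties in total, but the generated stylesheet has one rule per unique declaration. Adding a third component that reuses `cursor: default` would add no new CSS for that declaration.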

Loading Experience improved with the new React Suspense component

The best practice is to load content in the order that we read: top-down and left to right.

Historically, it was hard to load content and follow this best practice at the same time, but a new React component called React Suspense helps with this.

Let’s say we have a post, and we wrap it in React Suspense boundaries like this…

<React.Suspense fallback={<MyPlaceholder />}> {/* Loading state */}
	<Post> {/* Full content */}
		<Header />
		<Body />
		<Reactions />
		<Comments />
	</Post>
</React.Suspense>

… then React can coordinate whether we show the loading state or the full content. The suspense boundary works like a JavaScript try-catch:

  1. React will try to render the content.
  2. If one of the components from the post hasn’t loaded yet (for example the body with an image)…
  3. … then the user will see the loading state with the placeholder.
  4. Then, when the content is ready, the loading state is replaced with the final content.

Suspense boundaries simplify how code is loaded. And the good thing is… they can be nested 🔥 So we can create the top-down, left-to-right experience that we are looking for.
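The try-catch analogy can be shown with a toy model in plain JavaScript. This is not React’s implementation or API: `renderWithBoundary` and `createResource` are hypothetical helpers, and the only idea borrowed from Suspense is that a component which isn’t ready signals it by throwing a promise.

```javascript
// Toy model of a suspense boundary, mirroring the try-catch steps above.
function renderWithBoundary(renderContent, fallback) {
  try {
    return renderContent();
  } catch (thrown) {
    // A component that is not ready signals it by throwing a promise.
    if (thrown && typeof thrown.then === "function") {
      return fallback;
    }
    throw thrown; // real errors are not swallowed
  }
}

// Hypothetical resource that counts as "loading" until resolve() is called.
function createResource() {
  let value;
  let ready = false;
  const pending = new Promise(() => {});
  return {
    read() {
      if (!ready) throw pending;
      return value;
    },
    resolve(newValue) {
      value = newValue;
      ready = true;
    },
  };
}

const postBody = createResource();
const renderPost = () => `<Post>${postBody.read()}</Post>`;

// Before the data arrives, the boundary shows the placeholder.
const firstRender = renderWithBoundary(renderPost, "<MyPlaceholder />");

// Once the data is there, the same call returns the full content.
postBody.resolve("Hello");
const secondRender = renderWithBoundary(renderPost, "<MyPlaceholder />");
```

Nesting works the same way as nested try-catch blocks: the closest boundary around the component that isn’t ready is the one that shows its placeholder.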

CSS Variables

They can be useful for theming… and especially dark modes 😈

Here is an example of a CSS variable:

--white: #FFFFFF;
color: var(--white);

Conclusion

A great User Experience (UX) starts with a great Developer Experience (DX). Companies like Facebook, Google, and Salesforce understand this very well, and they share a lot of their research work and best practices. Technologies like React, GraphQL, and Relay are accessible to everyone, and that’s why I am writing about them here. Time will tell, but I think these technologies have a huge community and will power a lot of websites in the future. They have strong advantages. I am already using React and GraphQL a bit, but I am really interested in learning more about Relay and would like to gain experience by implementing it on my website as it grows in terms of content. I have included links below if you want to dig deeper.

If you want more stories like this one, please check out my blog.

Sources and links


Published by HackerNoon on 2019/05/05