Manipulating data is a core skill for any developer. In an API-driven environment, so much of the data you receive is formatted in a way that doesn't directly match the way that your application or UI needs it. Each web service and third-party API is different. This is where the ability to sort, normalize, filter, and manipulate the shape of data comes in.
In this article, we'll explore some common ways to work with data in JavaScript. To follow along, you'll need to run the code in either Node.js or the browser.
Before we start, you need some data. For the rest of the examples in the article, we'll be using the data returned from a search of GitHub's v3 REST API. We'll use the search/repositories endpoint to make a query for repositories matching a search term (the q parameter, in this case set to bearer). We're also going to limit the number of results to 10 per page, and only one page. This makes it more manageable for our examples.
Start by using Fetch to connect to the API, and wrap it in a function with some basic error handling. You can reuse the function later in each of our examples.
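const apiCall = () => fetch('https://api.github.com/search/repositories?q=bearer&per_page=10')
  .then(res => {
    if (res.ok) {
      return res.json()
    }
    // Include the status code so the logged error is meaningful
    throw new Error(`Request failed with status ${res.status}`)
  })
  // Log any network or HTTP error
  .catch(console.error)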
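If you're using a version of Node.js without built-in Fetch support (versions before 18), you can use the node-fetch package to add it. Install it to your project with npm install -S node-fetch@2 (version 3 of the package is ESM-only and can't be loaded with require). Then, require it at the top of your project file.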
const fetch = require('node-fetch')
We'll also make use of async/await. If your platform (like some versions of Node.js) doesn't support top-level async/await, you'll need to wrap the code in an async function. For example:
async function example() {
// Code here
let results = await apiCall()
// More code
}
With the setup out of the way, let's get started handling the response data. The results from the API call provide you with an object that contains some general metadata, as well as an array of repositories with the key of items. This lets you use a variety of techniques to iterate over the array and act upon the results. Let's look at some example use cases.
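As a rough illustration of that shape, here is a trimmed-down, hypothetical response object. The sampleResponse data is made up for this sketch, and real responses carry many more fields per repository, but total_count, incomplete_results, and items are the top-level keys the search endpoint returns.

```javascript
// Hypothetical, trimmed-down response (sample data, not real API output)
const sampleResponse = {
  total_count: 2,
  incomplete_results: false,
  items: [
    { name: 'repo-one', open_issues_count: 4 },
    { name: 'repo-two', open_issues_count: 0 }
  ]
}

// Separate the metadata from the array of repositories
const { total_count, items } = sampleResponse

// Iterate over the repositories like any other array
const names = items.map(repo => repo.name)

console.log(total_count) // 2
console.log(names) // [ 'repo-one', 'repo-two' ]
```

Each of the examples that follow works on the items array in the same way.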
Many APIs, including GitHub's, allow you to sort the results by specific criteria. Rarely do you have full control over this. For example, GitHub's repository search only allows ordering by stars, forks, help-wanted-issues, and how recently an item was updated. If you need results in a different order, you'll have to build your own sorting function. Let's say you want to sort results by the number of open issues the repository has. This means the repository with the fewest issues should show first, and the repository with the most should show last.
You can achieve this by using Array.sort along with a custom sort function.
// Sort by open issues
const sortByOpenIssues = repos => repos.sort((a,b) => a.open_issues_count - b.open_issues_count)
// Run the apiCall function and assign the result to results
let results = await apiCall()
// Call sort on the items key in results
console.log(sortByOpenIssues(results.items))
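To understand what's happening, let's look at how sort works. The compare function you pass in receives two items, a and b, and is expected to return a number: a negative value if a should come before b, a positive value if a should come after b, and zero if their relative order should stay the same.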
The easiest way to work with these conditions is to subtract the second value from the first. So in our code above, you subtract b.open_issues_count from a.open_issues_count.
If the number of issues for a is greater, the result will be greater than 0. If they are equal, it will be zero. Finally, if b is greater, the result will be a negative number.
The sort method handles all of the movement of items for you. Note that it sorts the array in place and returns a reference to that same array rather than a new one, so copy the array first if you need to keep the original order. In the example above, two values are compared by subtraction, but you can use any calculation that returns a value meeting the criteria mentioned above to sort an array.
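Because Array.prototype.sort reorders the array it is called on, one option is to sort a shallow copy instead. The sketch below assumes some made-up sample data (sampleRepos) and a hypothetical helper name (sortedCopyByOpenIssues); it is one way to do this, not the only one.

```javascript
// Made-up sample data standing in for results.items
const sampleRepos = [
  { name: 'a', open_issues_count: 12 },
  { name: 'b', open_issues_count: 0 },
  { name: 'c', open_issues_count: 5 }
]

// Spread into a new array first, so the input keeps its original order
const sortedCopyByOpenIssues = repos =>
  [...repos].sort((a, b) => a.open_issues_count - b.open_issues_count)

const sorted = sortedCopyByOpenIssues(sampleRepos)

console.log(sorted.map(repo => repo.name)) // [ 'b', 'c', 'a' ]
console.log(sampleRepos.map(repo => repo.name)) // [ 'a', 'b', 'c' ]
```

To sort from most issues to fewest instead, swap the operands so the comparison becomes b.open_issues_count - a.open_issues_count.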
Sorting changed the order of our data, but filtering narrows the data down based on specific criteria. Think of it as removing all of a certain color of candy from a bowl. You can use JavaScript's built-in filter method on arrays to handle this. Like sort, the filter method takes a function that is run against each item, but it returns a new array and leaves the original untouched. Let's look at a few filter scenarios.
In the first, we'll create a filter that only shows repositories that contain a description.
// Filter only repos that have descriptions
const descriptionsOnly = (repos) => repos.filter((repo) => repo.description)
let results = await apiCall()
console.log(descriptionsOnly(results.items))
In this case, we're returning repo.description itself and relying on its truthiness to represent whether the API returned a value or null. If the function returns a truthy value for the current item, that item is included in the new array.
What if we want only repositories that have both a description and homepage URL? You can modify the previous example to achieve this.
// Filter only repos with URL and description
const homeAndDescription = repos => repos.filter(repo => repo.homepage && repo.description)
let results = await apiCall()
console.log(homeAndDescription(results.items))
Using JavaScript's AND operator (&&), you can check that both the description and URL exist. If they do, the whole expression is truthy and the item is added to the new array. If either is falsy, the whole expression is falsy and the item will not be added to the new array.
What about something a little more complex? Let's say you want all repositories that have been updated after a certain date. You can do this by setting a threshold and comparing it to the updated_at value on each repository.
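To see the filter behave on known input, here is a small sketch using made-up sample data (sampleRepos is hypothetical). It also shows that an empty string counts as falsy, so a repository with a blank homepage is dropped along with one whose description is null.

```javascript
// Made-up sample data standing in for results.items
const sampleRepos = [
  { name: 'one', description: 'Has both', homepage: 'https://example.com' },
  { name: 'two', description: null, homepage: 'https://example.com' },
  { name: 'three', description: 'No homepage', homepage: '' }
]

// Keep only repos where both values are truthy
const homeAndDescription = repos =>
  repos.filter(repo => repo.homepage && repo.description)

const kept = homeAndDescription(sampleRepos).map(repo => repo.name)

console.log(kept) // [ 'one' ]
```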
// Set a threshold
let date_threshold = Date.parse('2020-08-01')
// Filter over results and compare the updated date with the cutoff date
const filterByDate = (repos, cutoff_date) => repos.filter(repo => Date.parse(repo.updated_at) > cutoff_date)
let results = await apiCall()
console.log(filterByDate(results.items, date_threshold))
Just as in the previous example, the truthiness of the returned value in the function passed to filter determines if the item is added to the new array.
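The comparison works because Date.parse turns each date string into a number of milliseconds since January 1, 1970 (UTC), and numbers compare directly. A minimal sketch with made-up sample data (sampleRepos, cutoff, and updatedAfter are hypothetical names for this illustration):

```javascript
// Made-up sample data using ISO 8601 timestamps in the style of updated_at
const sampleRepos = [
  { name: 'recent', updated_at: '2020-09-15T10:00:00Z' },
  { name: 'stale', updated_at: '2019-03-02T08:30:00Z' }
]

// A date-only ISO string is parsed as midnight UTC
const cutoff = Date.parse('2020-08-01')

// Keep repos whose timestamp is numerically greater than the cutoff
const updatedAfter = (repos, cutoffDate) =>
  repos.filter(repo => Date.parse(repo.updated_at) > cutoffDate)

const recentNames = updatedAfter(sampleRepos, cutoff).map(repo => repo.name)

console.log(recentNames) // [ 'recent' ]
```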
Sometimes the data you receive isn't what you need for your use case. It can either include too much, or it can be in the wrong format. One way to get around this is by normalizing the data. Data normalization is the process of structuring data to fit a set of criteria.
For example, imagine these API interactions are happening on the server, but the client needs a subset of the data. You can re-shape the data before passing it down to the client.
const normalizeData = repos => repos.map(repo => ({
url: repo.html_url,
name: repo.name,
owner: repo.owner.login,
description: repo.description,
stars: repo.stargazers_count
}))
let results = await apiCall()
console.log(normalizeData(results.items))
In the above code, the map array method is used to iterate over the results. It returns a new array made up of the values you return. In this instance, the data from the repos is simplified to include only a few key/value pairs, and the names of the keys have been made more digestible.
You can even use this time to modify any data. For example, you could wrap repo.stargazers_count in Number() to ensure the count was always a number and never a string.
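As a sketch of that idea, the version below coerces the star count with Number(). The sample data is made up, with the count deliberately given as a string to show the conversion. The fallback to an empty string for a missing description is an extra assumption added for illustration, not something the API requires.

```javascript
// Made-up sample data; stargazers_count is a string here on purpose
const sampleRepos = [
  {
    html_url: 'https://github.com/example/demo',
    name: 'demo',
    owner: { login: 'example' },
    description: null,
    stargazers_count: '42'
  }
]

const normalizeAndCoerce = repos => repos.map(repo => ({
  url: repo.html_url,
  name: repo.name,
  owner: repo.owner.login,
  // Assumption for this sketch: the client prefers '' over null
  description: repo.description || '',
  // Ensure the count is always a number
  stars: Number(repo.stargazers_count)
}))

const normalized = normalizeAndCoerce(sampleRepos)

console.log(normalized[0].stars) // 42
console.log(typeof normalized[0].stars) // number
```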
Managing the data you receive from an API is a critical part of any API integration. Every API returns data in its own shape and format. The exception is GraphQL APIs, which give you more control over the shape, and sometimes over the sort order and filtering options.
Whether you are using the data as part of a larger data-processing effort, or using it to improve the usefulness of your application to your users, you will need to perform some actions to make the data digestible for your app.
These API integrations are integral to your application, but what happens when they fail? We've written here before about some of the actions you can take to protect your integrations from failure.
Previously published at https://blog.bearer.sh/javascript-api-array-data-manipulation/