

As an Ember developer, discovering Yarn package manager by Facebook was the best that could happen to me.
Whenever one of my fellow developers encountered βBroccoli Plugin failedβ¦β I used to say βrm -rf node_modulesβ, thatβs what my seniors told me and I continued the tradition.
That was very frustrating and as a beginner, I use to blame it all on Ember as I didnβt understand much of what was happening. As time passed I realized npm was the culprit. But a few months back I discovered this Yarn package manager. When I found it, I read that it promised to solve most frequent βnpm installβ errors. It claimed it was better because it was:
Β· Deterministic unlike npm.
Β· Network performance by parallel installations.
Β· Same package installation like npm (it uses the same registry).
Β· Yarn uses check-sums to verify the integrity of packages.
Β· Works even Offline for previously installed packages as it caches all the packages.
Its has gained a lot traction in recent times so its features can be found all over the internet. What you wonβt find that easily on internet is how does it achieve these great features. I initially didnβt pay much attention to how did npm installations worked but as I got to learn about the yarn I got curious, so I decided to dive deep and find how did they work. Hope it helps beginners understand things better.
As a beginner in npm world, I wasnβt aware of few of following terms so I decided to add them here so that the newcomers wonβt have to google every word and then jump back and forth.
They are a piece of software that can be downloaded from the internet to accomplish some task. They may depend on other packages.
They specify the packages your application uses. They are specified inside manifest files like package.jsonΒ . Following dependencies might be represented as below:
βdependenciesβ: {
Β βaβΒ : β1.0.0β,
Β βbβΒ : β1.0.0β
Β }
They use semantic versioning i.e.
i. Given a version number MAJOR.MINOR.PATCH.
ii. MAJOR version when you make incompatible API changes,
iii. MINOR version when you add functionality in a backwards-compatible manner, and.
iv. PATCH version when you make backwards-compatible bug fixes.
They follow flat namespace, meaning all the different packages are given same hierarchy unless they have a multiple versions, then it that case theyβll be assigned lower level in the namespace like for βs^2.0.0β in the above example.
So lets start with an example of npm installation and then followed by yarn installation. Iβll keep the example simple but highly detailed.
i. This will be our first time installation of package.json
ii. There is no node_modules folder at the moment.
iii. No pre-cached packages.
i. Dependency resolution
Make requests to the registry and look up dependencies recursively and determine the location of package installation in the dependency folder.
ii. Fetch packages
Fetch packages in compressed format and place them in global cache.
iii. Linking Packages
Copying files from global cache to local node_modules folder.
That said lets have a look at how do above things actually take place when we do βnpm installβ.
Now consider a manifest file(package.json) having following simple hypothetical packages.
βdependenciesβ: {
Β βaβΒ : β1.0.0β,
Β βbβΒ : β1.0.0β
Β }
Weβll make it more simpler by removing the minor and patch versions and just keep the major versions. That leaves us with following:
When we do βnpm installβ for the above package.json file, npm does followingΒ :
a. If node_modules folder is already present (i.e. you had previously used βnpm installβ which in turn creates npm_modules folder), then uses that to make ideal tree by loading from disc.
b. Clone the existing tree to build ideal tree, which serve as our final tree for the app.
c. Using the cloned tree build ideal tree, which will be used to build package folders in node_modules folder. This can be seen in the below snapshot.
First we encounter βa1β package which has another dependeny of βs1β, as this is the first time we saw it so we follow the flat structure and put it as the common dependency along βa1β and βb1β. This structure allows us to reuse this package for another package which maybe dependent on βs1β.
Now when we encounter βs2β which is version 2 of same package βsβ then we cannot follow flat structure as same packages cannot reside in the same directory. So we consider it as a child for βb1β and a folder for βs2β is created inside βb1β.Β
Β
Β This also tells us that when we need to reuse a package, we need to have it at the top of another package with same name but different version, which may not be reused by other packages.
a. Based on the ideal tree npm detects which packages need to be installed and which are already installed. For our case as there is no ideal, as we started installation from scratch.
b. It also resolves the versions used in sub packages and in the flat directory path.
c. At this step, npm knows which packages needs to be installed in which folders.
a. Here, npm actually fetches the packages from npm registry and installs them in the respective folders for packages.
Now, thatβs how npm did it from ages, now lets see what does our new guy Yarn has to do to make things better.
a. Yarn creates a list of package requests it needs to make for the manifest file. It first checks for the package and then it adds the required version to the list. Dependency resolution is done package by package. For this example, it considers βa1β.
b. Now that we have package name and its version number, Yarn goes on to npm repository (Registry) and finds for the same or higher version of the package as mentioned in the manifest file. We can mention in package.json or manifest file about the package that needs to be fetched. Following are the symbols used to indicate what version needs to be fetched.
i. β~βΒ : It fixes both the major and minor version numbers, while matching any patch number. Ex. ~2.1.0 means that fetch anything higher than 2.1.0 but less than 2.2.0.
ii. β^βΒ : It locks the major version and looks for the latest version for minor and patch versions. Ex ^2.1.0 means fetch anything higher than 2.1.0 and less than 3.0.0
iii. β*βΒ : It locks MAJOR or MINOR depending on where its used. Ex 2.* means anything less than version 3.0.0 and 2.0.* means anything less than 2.1.0
c. Now Yarn checks the local cache before making actual request to registry. If the package that needs to installed is already used by some other package then there is no need to request it again as it was already cached by yarn for later reuse.
d. The above process is carried out for evry package and then has the list of packages that needs to fetched as it knows which were previously fetched and are cached.
a. Now the request for the package is made and is fetched from registry. In same fashion, Yarn fetches other packages.
Once all the required packages are fetched Yarn does the job of linking those packages as βnpmβ did. But here linking is a bit complex as Yarn needs to link both the cached and newly fetched packages. It needs to copy the cached packages to the new path for linking.
a. It is used to store exactly which versions of each dependency were installedΒ . Itβs similar to npmβs npm-shrinkwrap.json, however itβs not lossy and it creates reproducible results. In Rails, you have bundler for doing similar thing. It is managed by Yarn and is updated every time you add/remove/update the existing packages.
b. This file always needs to be committed as it is the sole reason Yarn is able to maintain integrity and consistency of packages across the systems. So it is very impostant who commits this file and what is committed with this file. Broken package commit might lead to installation of same broken package on every team memberβs machine.
c. It also sues Checksums for every package which helps ensure that the integrity of your data is maintained.
Now that we know how Yarn worksΒ , I would like to demonstrate one of main shortcomings of npm and show you how yarn solves it.
I would like to show how installing same packages on different machines results in different node_modules folder in npm and how yarn solves this problem by maintaining a lock file.
Now suppose, Yehuda installs the package as follows using npm
Consider he decides to upgrade Package βaβ to βa2βΒ . When βnpm installβ is run it produces following structure.
Now suppose Yehuda wants to share his code with the world and people start using the βnpm installβ for the above package.
So as we can see that npm doesnβt maintain consistent structure across systems but Yarn was built to address this sole issue. Rest of the features were just an addons.
This can be solved by clearing βnode_modulesβ folder but thatβs not the most heavenly way of doing itΒ . In fact reinstalling the packages again is gonna take ages.
Shrinkwrap to rescueβ¦.. but where do I find it???
Its already there with npm but its disabled by default and is lossy. So even npm guys donβt trust it for you.
Now Yehuda discovers Yarn, so he decides to give it a try.
Also there is lock file which maintains this dependency hierarchy even when you run βyarn installβΒ . The result is consistent on every machine.
Now who ever does βyarn installβΒ , he gets the same dependencies and in exact same order, as lock file is transmitted to everyone.
Pro TipΒ : Never forget to commit your lock file as it serves as the single source of truth to all the users.
We know Yahuda is really smart so he decides to make integration between Ember and Yarn tighter.
Ember 2.13 is now Yarn aware meaning it generates lock file and encourages you to use Yarn.
Create your free account to unlock your custom reading experience.