Proof of Concept to ensure consistency between NPM packages and their source code
---------------------------------------------------------------------------------

#### TL;DR

[_SNPM_](https://github.com/wilk/snpm) _is a Proof of Concept built to ensure consistency between what is published on the NPM registry and its open source counterpart on public repositories, like GitHub._

_Image credit:_ [Unsplash](https://unsplash.com/photos/IV--3UEiHlI)

At the end of April, [**Node.js 10 was released**](https://medium.com/the-node-js-collection/the-node-js-project-introduces-latest-release-line-node-js-10-x-bf07abfa9076) and NPM [**announced npm@6**](https://medium.com/npm-inc/announcing-npm-6-5d0b1799a905). One of the major features introduced with version 6 is about [security](https://hackernoon.com/tagged/security): I’m talking about [**npm audit**](https://docs.npmjs.com/getting-started/running-a-security-audit). This new command allows the user to perform an _“assessment of package dependencies for security vulnerabilities helping the user to protect their package’s users from known vulnerabilities that could cause data loss, service outages, unauthorized access to sensitive information and so on”._ That’s a huge improvement that the NPM team has made for the entire community. However, I think **they missed the real problem**.

### A major security flaw

At the beginning of 2018, [**@david.gilbertson**](https://hackernoon.com/@david.gilbertson) published what would become [**a stunning article**](https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5) about the fragility of the NPM ecosystem’s security. In it, David **exposed a major security flaw** in the NPM model.

> In a few words: what is published on the NPM registry may differ from what you see on GitHub.

Usually, an NPM package has its source code published on GitHub (or hosted on another public repository), giving users the ability to inspect the project and contribute. However, when someone publishes a new package on NPM, no one can guarantee that what has been uploaded is exactly what has been published on GitHub. This means that you can upload malicious code to NPM and publish different, harmless code on GitHub, and users won’t notice. Not immediately, at least.

> This weakens the foundation of Open Source ~ Paolo Sinelli

How can we trust a system that allows users to share closed source code without any way to verify its authenticity?

Fun fact: in these very days, [a backdoor has been spotted in a Python package named ssh-decorator](https://amp.reddit.com/r/Python/comments/8hvzja/backdoor_in_sshdecorator_package/?st=jgynd5qc&sh=81855e72&__twitter_impression=true). Someone found it inside the source code and alerted users, so the compromised package was finally taken down. This shows that having access to the source code allows the community to prevent and fix bad situations.

### SNPM to the rescue

After David’s article, I started creating a [**Proof of Concept**](https://github.com/wilk/snpm) to avoid these situations with NPM. It is called **Secure NPM (SNPM)** and I’ll try to explain how it works in a few simple steps.

**_Disclaimer_**_: SNPM is designed for compiled (binary) and transpiled packages, but it can also be adapted for plain packages._

First, when the author wants to publish a new version of their package, they **build the sources** and **calculate the checksum** of the compiled output.

_Figure: Build sources_

Then, they need to **update the package.json** with the new version and the checksum. [CommonJS provides a field](http://wiki.commonjs.org/wiki/Packages/1.1#Catalog_Properties) called **checksums** where the author can store checksums of their code.

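
The two author-side steps above (checksum the compiled output, then record it in package.json) could look roughly like the sketch below. It is written in Python only for brevity and is not how SNPM itself is implemented. The function names, the choice of SHA-256, the way files are combined into one digest, and the `sha256` key inside **checksums** are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def directory_checksum(build_dir: str) -> str:
    """Return one SHA-256 hex digest covering every file under build_dir."""
    root = Path(build_dir)
    digest = hashlib.sha256()
    files = (p for p in root.rglob("*") if p.is_file())
    # Sort by relative path so the result does not depend on filesystem order.
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Hash each file separately so every entry has a fixed-size footprint.
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def record_checksum(package_json: str, version: str, checksum: str) -> dict:
    """Write the new version and checksum into package.json and return it."""
    path = Path(package_json)
    manifest = json.loads(path.read_text(encoding="utf-8"))
    manifest["version"] = version
    # Assumption: the digest is stored under a "sha256" key of "checksums".
    manifest["checksums"] = {"sha256": checksum}
    path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return manifest
```

Assuming the build output lives in a `dist` folder (a hypothetical name), an author would call `record_checksum("package.json", "1.0.1", directory_checksum("dist"))` before tagging the release. Note that this only helps if the build is deterministic, so that anyone rebuilding the same tag gets byte-identical files.
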
_Figure: Update package.json_

Now they can **release** it on GitHub (or another public repo) with a new tag.

_Figure: Release on a public repository_

At this point, they can **publish** it on NPM via SNPM: it sends only a few pieces of data, namely the package version, the public repository URL and the checksum. This differs from the _npm publish_ command because it sends neither the source code nor the compiled files, just the three pieces of information above.

_Figure: Publish on NPM, using SNPM_

NPM is now in charge of **fetching the released version** from the public repo.

_Figure: Fetch released version tar.gz_

NPM can now **validate** the checksum by performing a series of actions:

1. installing all the dependencies, via **npm install**
2. building the sources, via **npm run build**
3. **calculating the checksum** of the compiled files
4. **verifying the checksum** against the one sent by the author

_Figure: Reproduce and verify the checksum_

At this point, NPM has all the information to **reply** with a success or with an error.

_Figure: Reply to the publish request_

That’s it. The whole algorithm, summarized:

_Figure: SNPM algorithm_

The main difference between NPM and SNPM is that (compiled) sources are no longer uploaded by the (untrusted) author but downloaded directly from the (trusted) public repo, where everyone can reproduce the same procedure and verify the checksum.

### Conclusions

Unfortunately, _npm audit_ doesn’t solve this major flaw and the issue is still open, especially for binaries and minified [JavaScript](https://hackernoon.com/tagged/javascript). We (as a community) must get verified and validated packages.

Anyway, I want to thank the NPM team for their awesome work and I hope this article will push them to improve the security of one of the best package managers ever developed.

#### \*\*\* Update \*\*\*

I’ve opened [**a new issue**](https://github.com/npm/npm/issues/20640) on NPM’s repository as a feature request.
After that, I’ve added [**another proposal**](https://github.com/npm/npm/issues/20640#issuecomment-389762201) on the same thread, called **npm verify**: as an NPM command, it should reproduce the download-build-checksum verification process, so users can verify packages on the fly.

#### \*\*\* Update: July \*\*\*

I’ve submitted a [**new RFC for NPM**](https://github.com/npm/rfcs/pull/16) and [**one for Yarn**](https://github.com/yarnpkg/rfcs/pull/94).
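
To make the **npm verify** idea more concrete, here is a minimal sketch of the build-and-checksum part of that process, written in Python for brevity. It is not taken from the proposal or from SNPM. It assumes the tagged release has already been downloaded and unpacked into a local folder, that the build writes to a `dist` directory, that the digest is stored under a `sha256` key of **checksums**, and that the build is deterministic. The names `verify_package` and `directory_checksum` are hypothetical.

```python
import hashlib
import json
import subprocess
from pathlib import Path

# The two commands named in the validation steps of this article.
DEFAULT_COMMANDS = (("npm", "install"), ("npm", "run", "build"))


def directory_checksum(build_dir: Path) -> str:
    """SHA-256 over the sorted relative paths and contents of build_dir."""
    digest = hashlib.sha256()
    files = (p for p in build_dir.rglob("*") if p.is_file())
    for path in sorted(files, key=lambda p: p.relative_to(build_dir).as_posix()):
        digest.update(path.relative_to(build_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def verify_package(source_dir, commands=DEFAULT_COMMANDS, output_dir="dist"):
    """Rebuild an unpacked release and compare it with its declared checksum."""
    root = Path(source_dir)
    manifest = json.loads((root / "package.json").read_text(encoding="utf-8"))
    # Assumption: the author stored the digest under checksums.sha256.
    declared = manifest.get("checksums", {}).get("sha256")
    if declared is None:
        return False  # nothing to verify against
    for command in commands:
        # check=True raises CalledProcessError if install or build fails.
        subprocess.run(list(command), cwd=root, check=True)
    # The package is consistent only if the rebuilt output matches.
    return directory_checksum(root / output_dir) == declared
```

Calling `verify_package("path/to/unpacked/release")` returns `True` only when the rebuilt output matches the checksum declared in package.json. The checksum function must be the same one the author used when publishing, otherwise the comparison is meaningless.
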