Brendan Burns

@brendan.d.burns

Metaparticle/Storage

Simple persistence for distributed NodeJS apps

As excited as I’ve been about the mainstreaming of containerized applications and container orchestration, I’m honestly even more excited about democratizing distributed systems to reach new audiences of developers. I’ve talked about this topic several times, but today I’m excited to discuss Metaparticle/Storage, a new library for implicit persistence in NodeJS.

Metaparticle/Storage makes persistence simple by making it implicit and automatic. Instead of explicitly making calls to storage infrastructure, you simply assign to local variables and the library takes care of the underlying storage management. Instead of worrying about parallel, replicated servers and data races, Metaparticle/Storage automatically handles conflict detection, rollback and resolution.

Rather than dive into the details of the library, I thought instead I would walk you through the reasons for its existence and the problems it’s trying to solve.

An example: a request counting server

Throughout this discussion we’ll focus on a simple server that counts and reports the number of requests it receives. The simplest version of such a server can be written in a few lines of Javascript:

// Simple HTTP Server example, keeps track of the number
// of requests and reports back over HTTP
var http = require('http');
var count = 0;
var server = http.createServer((request, response) => {
    count++;
    var suffix = (count == 1 ? ' request.' : ' requests.');
    response.end('There have been ' + count + suffix);
});
server.listen(8090, (err) => {
    if (err) {
        console.log('error starting server', err);
        return;
    }
    console.log('server is listening on http://localhost:8090');
});

This server is indeed quite simple, and it does keep track of the number of requests served, until the server crashes or needs to be restarted. Since the count is simply stored in memory, it is lost whenever the process dies. To preserve it, we need some sort of persistence.

Adding Persistence with Redis

To solve this, we’ll integrate the Redis key-value store into our application. There’s nothing magical about Redis; other storage layers are equally useful and look similar in code.

// Simple HTTP Server example, keeps track of the number
// of requests and reports back over HTTP
var http = require('http');
var redis = require('node-redis-client');
var opts = {
    host: process.env['REDIS_HOST']
};
var client = new redis(opts);
client.on('connect', function () {
    console.log('connected');
});
var server = http.createServer((request, response) => {
    client.call('GET', 'count', function (err, res) {
        if (err) {
            console.log(err);
            response.end('Internal error.');
            return;
        }
        var count = 0;
        if (res != null) {
            count = parseInt(res, 10);
        }
        count++;
        var suffix = (count == 1 ? ' request.' : ' requests.');
        client.call('SET', 'count', '' + count, function (err) {
            if (err) {
                console.log(err);
            }
            response.end('There have been ' + count + suffix);
        });
    });
});
server.listen(8090, (err) => {
    if (err) {
        console.log('error starting server', err);
        return;
    }
    console.log('server is listening on http://localhost:8090');
});

There are several things to note from adding persistence into our server:

First, the code has grown nearly 2x in size.

Second, and more worryingly, the introduction of a persistent store means that our code now contains explicit function calls to things like SET and GET; it is no longer just implicitly manipulating data using the standard language (e.g. count = count + 1). This means that persistent code looks different from “normal” code, which breaks the flow and introduces barriers to those who are just starting to learn to code.

Third, the asynchronous nature of these explicit calls not only makes the code longer, but harder to understand as well. Again, this is a barrier to developers becoming successful distributed system engineers.

Problems from distributed, replicated servers

Unfortunately, even with the persistence handled, our application is still not safe to scale out to multiple containers. To understand why this is the case, consider that there is a race between the read of a value and the subsequent write of that value. Consider what happens when two different instances of our application both read the same value, increment it by one and then both write the (same) new value back into the persistence layer. We will have serviced two user requests, but only ever incremented the count by one. One of the requests will have been lost. This is a classic example of a read-update-write race.
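The lost update described above is easy to demonstrate. Here is a minimal, self-contained simulation (an in-memory object stands in for Redis, and an artificial delay stands in for network latency to the store):

```javascript
// Minimal simulation of the read-update-write race.
// Two "servers" both read the counter, then both write back, losing one update.
const store = { count: 0 };

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function handleRequest() {
  const current = store.count;   // read
  await delay(10);               // simulate latency to the storage layer
  store.count = current + 1;     // write back (may clobber a concurrent write)
}

Promise.all([handleRequest(), handleRequest()]).then(() => {
  // Two requests were served, but the counter only reflects one of them.
  console.log(store.count);      // prints 1, not 2
});
```

Both handlers read the value 0, so both write back 1: one increment has vanished.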

Of course, we can solve this problem by adding additional code to our simple server that wraps the read and write of the data in an atomic transaction. But again, this transaction adds complexity to the code and reduces the number of developers who can successfully build such a system. (Remember, the goal at the top was to broaden and democratize the set of successful application developers.)
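To give a flavor of what such a transaction involves, here is a sketch of the optimistic-concurrency pattern it implements, shown against an in-memory store so it runs standalone. The version counter and compareAndSet helper are illustrative stand-ins; with Redis you would use WATCH/MULTI/EXEC to the same effect:

```javascript
// Sketch of an optimistic read-update-write loop. A write only succeeds
// if no other writer has touched the value since we read it; otherwise
// the whole read-update-write cycle is retried.
const store = { count: 0, version: 0 };

// Compare-and-set: the write succeeds only if the version is unchanged.
function compareAndSet(expectedVersion, newCount) {
  if (store.version !== expectedVersion) return false; // conflict detected
  store.count = newCount;
  store.version++;
  return true;
}

function incrementWithRetry() {
  for (;;) {
    const seenVersion = store.version;
    const newCount = store.count + 1;                       // read + update
    if (compareAndSet(seenVersion, newCount)) return newCount; // write
    // Conflict: another writer got there first; retry from the read.
  }
}
```

Even this simplified version roughly doubles the amount of bookkeeping around a one-line increment, which is exactly the complexity problem described above.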

Using Metaparticle/Storage

Rather than consider what the full, atomic, multi-container-safe server looks like, let’s consider instead what this example looks like when implemented using Metaparticle/Storage:

// Simple HTTP Server example, keeps track of the number of
// requests and reports back over HTTP
var http = require('http');
var mp = require('@metaparticle/storage');
mp.setStorage('redis');
var server = http.createServer((request, response) => {
    mp.scoped('global', (scope) => {
        if (!scope.count) {
            scope.count = 0;
        }
        scope.count++;
        return scope.count;
    }).then((count) => {
        var suffix = (count == 1 ? ' request.' : ' requests.');
        response.end('There have been ' + count + suffix);
    });
});
server.listen(8090, (err) => {
    if (err) {
        console.log('error starting server', err);
        return;
    }
    console.log('server is listening on http://localhost:8090');
});

You will note that this code is shorter than the explicit persistence example above. It is also easier to read (and to write) due to its implicit rather than explicit persistence. Yet the code above both uses Redis for persistence and ensures that multiple simultaneous reads and writes to storage will not corrupt the data. So what is actually going on?

Implementation details

It turns out that Javascript has some special features. One of the coolest is the ability to create shadow or proxy objects. These proxy objects look like real Javascript objects, but they route all get/set operations through pre-defined handler methods. When you use metaparticle.scoped(scope, fn) to create a new data scope, it returns a proxy object that intercepts every read and write, and these operations form the basis of the transaction which the system will apply or roll back.

The Metaparticle/Storage library observes every call that sets a new value, so ordinary-looking assignments like count = count + 1 are intercepted, and at the end of the scoped operation the changes to the proxy object are persisted to storage. Additionally, because the library controls all access to storage, it can detect conflicts, roll back, and re-apply the same code multiple times to ensure a thread-safe concurrent variable update.
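The interception mechanism itself can be shown in a few lines using the standard Javascript Proxy object. This is just a sketch of the technique, not the actual Metaparticle/Storage implementation; the makeTrackedScope helper is illustrative:

```javascript
// A Proxy that records every property write, so the accumulated changes
// could later be persisted (or rolled back) as a unit.
function makeTrackedScope(initial) {
  const changes = {};
  const scope = new Proxy(initial, {
    get(target, prop) {
      return target[prop];
    },
    set(target, prop, value) {
      target[prop] = value;
      changes[prop] = value; // remember what was written
      return true;
    }
  });
  return { scope, changes };
}

const { scope, changes } = makeTrackedScope({ count: 0 });
scope.count++;            // ordinary-looking code...
scope.count++;            // ...but every write is intercepted
console.log(changes);     // { count: 2 } -- ready to persist
```

Code inside the scoped block never sees the proxy machinery; it just reads and writes properties as usual, which is what makes the persistence implicit.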

No free lunch!

As with everything, of course, there is no free lunch. In order to make your code work, the function in the scoped block _must_ be idempotent. That is, it can be called repeatedly with no additional side effects.
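To see why this matters, here is a toy illustration. The runWithRetry helper is a hypothetical stand-in for the library’s internal conflict-and-retry behavior (not the real Metaparticle API); it simply runs the function twice, the way a conflict retry would:

```javascript
// Stand-in for a conflict retry: run the scoped function twice.
function runWithRetry(fn) {
  fn();        // first attempt -- suppose it hits a write conflict
  return fn(); // retried attempt
}

const emails = [];
const result = runWithRetry(() => {
  // Idempotent work: recomputing the value is harmless on retry.
  const count = 41 + 1;
  // NON-idempotent side effect: this runs once per attempt!
  emails.push('notify admin');
  return count;
});
console.log(result);        // 42
console.log(emails.length); // 2 -- the side effect happened twice
```

The pure computation produces the same answer no matter how many times it runs, but the external side effect is duplicated. Keep side effects like sending email or writing to other systems outside the scoped block.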

Summary

Well, it’s time to wrap this up; this post is getting a little long, and I hope you’ve found it interesting. This is really just the beginning of the journey. If you are interested in helping with Metaparticle/Storage, the code is out there on GitHub today. Please come participate, file issues, or otherwise contact me in the usual ways.

Best!!

Brendan
