Gandalf should have used arcsecond

I recently started making a serious attempt at learning Haskell (anyone who has tried before will probably sympathise that it usually takes a couple of tries to crack it). Amongst the many cool things Haskell has to offer is an amazing parsing library called Parsec, which comes with the standard set of packages and lets you describe how to parse complex grammars in what essentially looks like natural language.

Here is how a CSV parser is implemented using Parsec. Don’t worry if you don’t understand all the syntax; the point is that the whole parser is specified in just four lines.

This post is not about Haskell however, but rather a library I wrote called arcsecond, which is based on Parsec, with the goal of bringing that same expressivity to JavaScript.

Parser combinators

arcsecond is a parser combinator library, in which complex parsers can be built by composing simple parsers. The simplest parsers just match specific strings or characters:

These can then be combined together with the combinators in the library.

Then the new parsers can be used with text:

Combinators

The combinators are where it gets cool. In arcsecond a combinator is a higher-order parser, which takes one or more parsers as its input and gives back a new parser that combines those in some way. If you’ve used higher-order components in React like connect, withRouter, or withStyles, then you’re already familiar with the idea.

As shown above, sequenceOf is a combinator that will parse text using each of the parsers in order, collecting their results into an array. choice, on the other hand, will try each of its parsers in order and use the first one that matches.

The library contains more, like many, which takes a parser as an argument and matches as much as it can using that parser, collecting up the results in an array:

You can use sepBy to create a parser that matches items separated by something matched by another parser.
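To make the combinator idea concrete, here is a minimal, self-contained sketch of how such building blocks could work in plain JavaScript. It is not arcsecond's implementation: the parser shape `(input, index) => { ok, value, index }` is an assumption made for this illustration, and only the names `str`, `sequenceOf`, `choice`, `many` and `sepBy` are borrowed from the text above.

```javascript
// Assumed parser shape for this sketch: (input, index) => { ok, value, index }.
// Hand-rolled illustration only, NOT arcsecond's internals.

// Match an exact string at the current position.
const str = (s) => (input, index) =>
  input.startsWith(s, index)
    ? { ok: true, value: s, index: index + s.length }
    : { ok: false, index };

// Run each parser in order, collecting the results into an array.
const sequenceOf = (parsers) => (input, index) => {
  const values = [];
  let pos = index;
  for (const p of parsers) {
    const r = p(input, pos);
    if (!r.ok) return { ok: false, index };
    values.push(r.value);
    pos = r.index;
  }
  return { ok: true, value: values, index: pos };
};

// Try each parser in order and use the first one that matches.
const choice = (parsers) => (input, index) => {
  for (const p of parsers) {
    const r = p(input, index);
    if (r.ok) return r;
  }
  return { ok: false, index };
};

// Match zero or more times, collecting the results into an array.
const many = (parser) => (input, index) => {
  const values = [];
  let pos = index;
  while (true) {
    const r = parser(input, pos);
    if (!r.ok || r.index === pos) break; // stop on failure or when no input is consumed
    values.push(r.value);
    pos = r.index;
  }
  return { ok: true, value: values, index: pos };
};

// Match values separated by whatever the separator parser matches.
// Separator first, then value parser, one argument per call.
const sepBy = (separator) => (valueParser) => (input, index) => {
  const first = valueParser(input, index);
  if (!first.ok) return { ok: true, value: [], index };
  const values = [first.value];
  let pos = first.index;
  while (true) {
    const sep = separator(input, pos);
    if (!sep.ok) break;
    const next = valueParser(input, sep.index);
    if (!next.ok) break;
    values.push(next.value);
    pos = next.index;
  }
  return { ok: true, value: values, index: pos };
};

const greeting = sequenceOf([str('hello'), str(' '), choice([str('world'), str('there')])]);
const abList = sepBy(str(','))(choice([str('a'), str('b')]));

console.log(greeting('hello there', 0).value); // [ 'hello', ' ', 'there' ]
console.log(abList('a,b,a', 0).value);         // [ 'a', 'b', 'a' ]
```

Every combinator takes parsers and returns a new function of the same shape, which is what lets them nest freely.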
between will let you match items that occur between two other parsers.

Curried Functions

arcsecond makes use of curried functions. If you don’t know what a curried function is, give my article “Making Functional Programming Click” a read. If you really can’t be bothered right now, open that in another tab and read this executive summary.

A curried function is one that, instead of taking several arguments at once, takes its first argument and returns a new function that takes the next one. It’s easier to see with some code:

As you can see above, curriedAdd is first called with 1. Since it then returns a function, we can go ahead and call that with 2, which finally returns the actual result.

We can use curriedAdd to create a new function by calling it with just one argument and then assigning the result to a variable. As a language, JavaScript treats functions as first class citizens, meaning they can be passed around and assigned to variables.

This principle lies at the heart of arcsecond, and every function in the library is defined this way. sepBy takes two parsers: first a separator parser, and second a value parser. Because it is curried, it is easy to create a more specific combinator like commaSeparated by only supplying the first argument.

If you’re not used to it, it will probably seem strange. But a good lesson to learn as a software developer is not to have knee-jerk bad reactions to things that you don’t immediately understand. There is usually a reason, and you’ll find a lot more value by discovering that reason rather than dismissing it.

Error Handling

If you try to parse a string which is not correctly formatted, you would expect to get some kind of error message. arcsecond uses a special data type called an Either, which is either a value or an error. It’s like a Promise, which can be either Rejected or Resolved, but without the implication of being asynchronous. In an Either however, the “resolved” type is called a Right, and the “rejected” type is called a Left.
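Two ideas from the sections above, currying and the Either type, can be sketched in a few lines of plain JavaScript. Only `curriedAdd` is a name taken from the text. `addOne`, `fold`, `parseDigits`, `render` and the tagged-object representation of `Left` and `Right` are assumptions made for this illustration and are not arcsecond's own Either.

```javascript
// Curried function: each call takes one argument and returns the next function.
const curriedAdd = (a) => (b) => a + b;

const three = curriedAdd(1)(2); // call with 1, then call the returned function with 2
const addOne = curriedAdd(1);   // supply only the first argument to get a new function

// Hypothetical Either: a plain tagged object holding an error or a value.
const Left = (error) => ({ isLeft: true, error });
const Right = (value) => ({ isLeft: false, value });

// Run one of two handlers depending on which side the Either holds.
// fold is curried too, so the handlers can be fixed once and reused.
const fold = (onLeft, onRight) => (either) =>
  either.isLeft ? onLeft(either.error) : onRight(either.value);

// A toy "parser" that only accepts non-empty strings of digits.
const parseDigits = (text) =>
  text.length > 0 && [...text].every((c) => c >= '0' && c <= '9')
    ? Right(Number(text))
    : Left(`expected digits but got "${text}"`);

const render = fold((e) => `Error: ${e}`, (v) => `Value: ${v}`);

console.log(three);                      // 3
console.log(addOne(41));                 // 42
console.log(render(parseDigits('42')));  // Value: 42
console.log(render(parseDigits('4x')));  // Error: expected digits but got "4x"
```

Unlike a Promise, nothing here is asynchronous: the Either is an ordinary return value that the caller has to unpack, so the error case cannot be silently ignored.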
The return type of parse is an Either. You can get at the value or error like this:

However, this might not fit well into your codebase if you don’t have more functional code. For that reason there are two other options. The first is to convert the Either into a Promise:

Or you can use toValue, which must be wrapped in a try/catch block:

Something more complex: JSON

Let’s put arcsecond through its paces by using it to create a parser for JSON.

Values

JSON only has 7 possible values:

String
Number
true
false
null
Array
Object

So to write a JSON parser, we just need to write parsers for all these values.

Types

In order for our parser to be useful we need to be able to identify what we’ve parsed, and the best way to do that is to put the results into a data type, which will provide a common interface for us to interact with the JSON tree. Every type has a type name, a value, and a toString function to pretty print the structure.

With our types in hand let’s start with the absolute simplest of parsers: true, false and null. These are just literal strings:

The parsers in arcsecond have a map method, just like arrays, which allows you to transform the value the parser matched. With map we can put the values matched into the data types defined above.

Numbers are a bit more complex. The JSON specification has this railroad diagram showing how a number can be parsed:

Credit: https://json.org

The forks in the railroads show optionality, so the simplest number that can be matched is just 0. Basically, numbers like:

1
-0.2
3.42e2
-0.4352E-235

are all valid in JSON, while something like 03.46 is not, because no path in the railroad would allow for that.

If you take the time to read through the numberParser, you’ll see it lines up pretty much 1:1 with the diagram above.

Let’s try strings next.
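Before moving on to strings, the number rules read off the railroad diagram can be sketched as a small hand-written checker. This is not the numberParser from the article and it does not use arcsecond. `isDigit` and `isJsonNumber` are hypothetical names, and the function only answers yes or no rather than producing a value.

```javascript
const isDigit = (c) => c !== undefined && c >= '0' && c <= '9';

// Returns true when `text` is, in its entirety, a valid JSON number.
const isJsonNumber = (text) => {
  let i = 0;

  // Skip a run of digits starting at i; report whether at least one was seen.
  const digits = () => {
    const start = i;
    while (isDigit(text[i])) i++;
    return i > start;
  };

  if (text[i] === '-') i++;                   // optional minus sign
  if (text[i] === '0') i++;                   // either a single zero...
  else if (!digits()) return false;           // ...or digits starting with 1-9
  if (text[i] === '.') {                      // optional fraction
    i++;
    if (!digits()) return false;
  }
  if (text[i] === 'e' || text[i] === 'E') {   // optional exponent
    i++;
    if (text[i] === '+' || text[i] === '-') i++;
    if (!digits()) return false;
  }
  return i === text.length;                   // nothing may be left over
};

console.log(isJsonNumber('-0.4352E-235')); // true
console.log(isJsonNumber('03.46'));        // false
```

The leading-zero case shows why 03.46 is rejected: after a single 0 the only ways forward are a fraction, an exponent, or the end of the number, so the 3 is left over.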
A string is anything between double quotes, but it can also contain escaped quotes. The anythingExcept parser comes in very handy here, and is especially expressive when compared with the image on the JSON spec website.

Credit: https://json.org

That only leaves Array and Object, which can both be pitfalls because they are basically just containers of jsonValue. To illustrate how this might go wrong, we can write the Array parser the “wrong” way first and then see how to address it.

We can use the whitespace parser, which matches zero or more whitespace characters, to ensure that the array brackets and comma operator allow for any (optional) whitespace that might be there.

Because arrayParser is defined in terms of jsonParser, and jsonParser is defined in terms of arrayParser, we run into a ReferenceError. If we moved the definition of arrayParser below jsonParser, we’d still have the same problem. We can fix this by wrapping the jsonParser in a special parser, aptly named recursiveParser, which will allow us to reference variables that are not yet in scope.

The argument to recursiveParser is a thunk: a function that takes no arguments and returns the parser, so the variable is only looked up when the parser actually runs.

Implementing the arrayParser is actually quite trivial: it is as simple as JSON values, separated by commas, in square brackets.

Object is only marginally more complex. The values in an object are pairs of a string and some other JSON value, with a colon as a separator.

And that’s it. The jsonValue can be used to parse a JSON document in its entirety.

The full parser can be found here as a gist.

Bonus Parser: CSV

Since I opened up with a CSV parser in Haskell, let’s see how that would look in arcsecond. I’ll keep it minimal and forgo creating data types to hold the values, and some extra strengthening that the Parsec version also doesn’t have.

The result should be an array of arrays: the outer array holds “lines” and the inner arrays contain the elements of the line.
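To show the array-of-arrays shape described above, here is a self-contained sketch of a minimal CSV parser built from hand-rolled combinators. It does not use arcsecond and is not the article's version. The parser shape `(input, index) => { ok, value, index }` and the names `char`, `cell`, `line`, `csv` and `parseCsv` are assumptions for illustration, and `sepBy` only mirrors the separator-first argument order described earlier.

```javascript
// Assumed parser shape for this sketch: (input, index) => { ok, value, index }.

// Match a single specific character.
const char = (c) => (input, index) =>
  input[index] === c
    ? { ok: true, value: c, index: index + 1 }
    : { ok: false, index };

// A cell is zero or more characters up to the next comma or newline.
// It always succeeds, so empty cells come back as empty strings.
const cell = (input, index) => {
  let end = index;
  while (end < input.length && input[end] !== ',' && input[end] !== '\n') end++;
  return { ok: true, value: input.slice(index, end), index: end };
};

// Values separated by a separator: separator parser first, then value parser.
const sepBy = (separator) => (valueParser) => (input, index) => {
  const first = valueParser(input, index);
  if (!first.ok) return { ok: true, value: [], index };
  const values = [first.value];
  let pos = first.index;
  while (true) {
    const sep = separator(input, pos);
    if (!sep.ok) break;
    const next = valueParser(input, sep.index);
    if (!next.ok) break;
    values.push(next.value);
    pos = next.index;
  }
  return { ok: true, value: values, index: pos };
};

const line = sepBy(char(','))(cell);   // cells separated by commas
const csv = sepBy(char('\n'))(line);   // lines separated by newlines

const parseCsv = (text) => csv(text, 0).value;

console.log(parseCsv('1,2,3\n4,5,6'));
// [ [ '1', '2', '3' ], [ '4', '5', '6' ] ]
```

Like the minimal version in the text, this skips quoting and escaping, and a trailing newline would produce one extra line containing a single empty cell.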
Conclusion

There are quite a few key features of arcsecond that didn’t get a mention in this article, including the fact that it can parse context sensitive languages, and that the parsing model is based on a Fantasy Land-compliant Monad. My main goal with this project was to bring the same level of expressivity that Parsec has to JavaScript, and I hope I’ve been able to do that.

Please check out the project on GitHub along with all the API docs and examples, and think of arcsecond the next time you find yourself writing an incomprehensible spaghetti regex. You might be using the wrong tool for the job! You can install the latest version with:

npm i arcsecond

Hit me up on Twitter @fstokesman, and give this article a 👏 if you found it interesting! I might write a follow up on how arcsecond works internally, so stay tuned for that.