Sylver is a language agnostic platform for building custom source code analyzers (think eslint for every language). This might be a lot to unpack, so let us explore this tool by solving a real-world problem: our application's configuration is stored in complex JSON documents, and we'd like to build a tool to automatically validate these documents against our business rules. In this series of tutorials, we'll go from having zero knowledge of Sylver or static analysis to building a fully-fledged linter for our configuration files. We will use JSON as an example, but the tools and techniques presented apply to many data formats and even complete programming languages! Also, note that while we will be building everything from scratch using Sylver's domain-specific languages (DSL), a catalog of built-in specifications for the most common languages will be included in future releases of the tool. In part 1, we will discover Sylver's meta language, a DSL used to describe the shape
of programming languages and data formats. After completing this tutorial, we'll have
a complete spec for the JSON language, allowing us to turn JSON documents into
Sylver parse trees. In part 2, we will load the parse trees into Sylver's query engine,
and we will find nodes that violate our business rules using an SQL-like query
language. In part 3, we will learn how to turn our language spec and queries into a set of
linting rules so that we can run them conveniently and share them. Installation Sylver is distributed as a single static binary. Installing it is as simple as: Go to to download the binary for your platform https://sylver.dev Unpack the downloaded archive Move the binary to a location in your sylver $PATH Prelude Let us create a blank workspace for this tutorial!We'll start by creating a new folder and a file to write the Sylver specification for the JSON language. json.syl mkdir sylver_getting_started
cd sylver_getting_started
touch json.syl We will also store our test JSON file in . config.json {
   "variables": [
       {
           "name": "country",
           "description": "Customer's country of residence",
           "values": ["us", "fr", "it"]
       },
       {
           "name": "age",
           "description": "Cusomer's age",
           "type": "number"
       }
   ]
} This file specifies the variables used to describe customer profiles in a fictional customer database. Variables are assigned a type or a list of potential values. Types definition Sylver parses raw text into typed structures called parse trees. The first step in defining a language spec is to define a set of types for our tree nodes. We'll add the following node types declarations to our spec: json.syl node JsonNode { }

node Null: JsonNode { }

node Bool: JsonNode { }

node Number: JsonNode { }

node String: JsonNode { }

node Array: JsonNode { 
    elems: List<JsonNode> 
}

node Object: JsonNode {
    members: List<Member>
}

node Member: JsonNode {
    key: String,
    value: JsonNode
} These declarations resemble object-type declarations in many mainstream languages. The syntax denotes inheritance. : Now that we have a set of types to describe JSON documents, we need to specify how to build a
parse tree from a sequence of characters. This process is done in two steps: Lexical analysis: in this step, individual characters that form an indivisible entity
(such as the digits of a number or the characters of a string) are grouped together into tokens.
Some tokens are only one character-wide (for example, the brackets and semicolons in JSON). Syntactic analysis, in which tree nodes are built for the stream of tokens. Lexical analysis Tokens are described using declarations of the form where is either a literal surrounded by single-quotes ( ) or a regex between backticks ( ). The regexes use a syntax similar to Perl-style regular expressions. Characters in the input string that match one of the terminal literals or regexes will be grouped into a token of the given name. term NAME = <term_content> <term_content> ' ` term COMMA = ','
term COLON = ':'
term L_BRACE = '{'
term R_BRACE = '}'
term L_BRACKET = '['
term R_BRACKET = ']'
term NULL = 'null'

term BOOL_LIT = `true|false`
term NUMBER_LIT = `\-?(0|([1-9][0-9]*))(.[0-9]+)?((e|E)(\+|-)?[0-9]+)?`
term STRING_LIT = `"([^"\\]|(\\[\\/bnfrt"])|(\\u[a-fA-F0-9]{4}))*"`


ignore term WHITESPACE = `\s` Term rules for numbers and strings are slightly involved in accounting for some of JSON's peculiarities. Note that the term declaration (matching a single whitespace character) is prefixed with the keyword. This means that tokens do not affect the structure of the document and can be ignored during syntactic analysis. WHITESPACE ignore WHITESPACE Syntactic analysis In this last part of the language spec, we write rules describing how tree nodes are built by matching tokens from the input stream. For example, a rule specifying: "if the current token is a STRING_LIT, build a String node" can be written as follows: rule string = String { STRING_LIT } Rules can refer to other rules to construct nested nodes. For example, here is a rule specifying that a node (corresponding to an object member
in JSON) can be built by building a node using the rule and then matching a token followed by any valid JSON value: Member string COLON rule member = Member { key@string COLON value@main } Nested nodes are associated with a field using the syntax. @ The rule is the entry point for the parser, so in our case, it designates any valid JSON value. main A valid JSON document can be made of a 'null' literal, a number, a boolean value, a string, an array of JSON values, or a JSON object, which is reflected in the main rule: rule main =
    Null { NULL }
  | Number { NUMBER_LIT }
  | Bool { BOOL_LIT }
  | string
  | Array { L_BRACKET elems@sepBy(COMMA, main) R_BRACKET }
  | Object { L_BRACE members@sepBy(COMMA, member) R_BRACE } The syntax is used to parse nodes using the rule, while matching a token between every parsed node. sepBy(TOKEN, rule_name) main TOKEN Conclusion We now have a complete language spec for the JSON language: node JsonNode { }

node Null: JsonNode { }

node Bool: JsonNode { }

node Number: JsonNode { }

node String: JsonNode { }

node Array: JsonNode { 
    elems: List<JsonNode> 
}

node Object: JsonNode {
    members: List<Member>
}

node Member: JsonNode {
    key: String,
    value: JsonNode
}

term COMMA = ','
term COLON = ':'
term L_BRACE = '{'
term R_BRACE = '}'
term L_BRACKET = '['
term R_BRACKET = ']'
term NULL = 'null'

term BOOL_LIT = `true|false`
term NUMBER_LIT = `\-?(0|([1-9][0-9]*))(.[0-9]+)?((e|E)(\+|-)?[0-9]+)?`
term STRING_LIT = `"([^"\\]|(\\[\\/bnfrt"])|(\\u[a-fA-F0-9]{4}))*"`


ignore term WHITESPACE = `\s`

rule string = String { STRING_LIT }

rule member = Member { key@string COLON value@main }

rule main =
    Null { NULL }
  | Number { NUMBER_LIT }
  | Bool { BOOL_LIT }
  | string
  | Array { L_BRACKET elems@sepBy(COMMA, main) R_BRACKET }
  | Object { L_BRACE members@sepBy(COMMA, member) R_BRACE } The last step is to test it on our test file! This is done by invoking the following command: sylver parse --spec=json.syl --file=config.json Which yields the following parse tree: Object {
. ● members: List<Member> {
. . Member {
. . . ● key: String { "variables" }
. . . ● value: Array {
. . . . ● elems: List<JsonNode> {
. . . . . Object {
. . . . . . ● members: List<Member> {
. . . . . . . Member {
. . . . . . . . ● key: String { "name" }
. . . . . . . . ● value: String { "country" }
. . . . . . . }
. . . . . . . Member {
. . . . . . . . ● key: String { "description" }
. . . . . . . . ● value: String { "Customer's country of residence" }
. . . . . . . }
. . . . . . . Member {
. . . . . . . . ● key: String { "values" }
. . . . . . . . ● value: Array {
. . . . . . . . . ● elems: List<JsonNode> {
. . . . . . . . . . String { "us" }
. . . . . . . . . . String { "fr" }
. . . . . . . . . . String { "it" }
. . . . . . . . . }
. . . . . . . . }
. . . . . . . }
. . . . . . }
. . . . . }
. . . . . Object {
. . . . . . ● members: List<Member> {
. . . . . . . Member {
. . . . . . . . ● key: String { "name" }
. . . . . . . . ● value: String { "age" }
. . . . . . . }
. . . . . . . Member {
. . . . . . . . ● key: String { "description" }
. . . . . . . . ● value: String { "Customer's age" }
. . . . . . . }
. . . . . . . Member {
. . . . . . . . ● key: String { "type" }
. . . . . . . . ● value: String { "number" }
. . . . . . . }
. . . . . . }
. . . . . }
. . . . }
. . . }
. . }
. }
} In the next part, we'll define business rules to validate our JSON configuration (for example, the possible values for each variable must be of the same type), and we will use a query DSL to identify the tree nodes that violate these rules. Also Published Here

Get Started with Sylver - Part1 : Build a JSON Parser in 49 lines of Code

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

104 Stories To Learn About Go

105 Stories To Learn About Functional Programming

100+ Free Pluralsight Courses to learn Python, Java, and Spring Boot

10 Websites to Learn JavaScript for Beginners

104 Stories To Learn About Programming Top Story

10 Ways to Prevent and Manage Technical Debt—Tips from Developers

104 Stories To Learn About Go

105 Stories To Learn About Functional Programming

100+ Free Pluralsight Courses to learn Python, Java, and Spring Boot

10 Websites to Learn JavaScript for Beginners

104 Stories To Learn About Programming Top Story

10 Ways to Prevent and Manage Technical Debt—Tips from Developers

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps