paint-brush
Introducing Wrek — A Miniature Erlang Graph Engineby@ErlangSolutions
208 reads

Introducing Wrek — A Miniature Erlang Graph Engine

by Erlang SolutionsFebruary 26th, 2018
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

by <a href="https://github.com/rkallos" target="_blank"><strong>Richard Kallos</strong></a>

Company Mentioned

Mention Thumbnail

Coin Mentioned

Mention Thumbnail
featured image - Introducing Wrek — A Miniature Erlang Graph Engine
Erlang Solutions HackerNoon profile picture

by Richard Kallos

Wrek is an Erlang library I wrote for concurrently executing task dependency graphs. It’s intended purpose is to run a set of pre-defined tasks that have a partial ordering between them. In this post, I explain why I wrote wrek, what it can be used for, and how to use it.

Motivation

Wrek emerged as the result of two distinct forces. First, my amateurish enjoyment of graph theory made me try to see graphs wherever I could, laying the conceptual foundation for this library. Second, I realized that a project I was working on at Adgear would benefit from such a library, causing me to finally start writing Wrek.

Conceptual

The graph is a pervasive data structure in computing. When I learned about graph algorithms in school, I was amazed by their broad applications. Graphs play essential roles in compilers and build systems. Various graph algorithms form the backbone of how we communicate with each other over the internet.

Our everyday lives tend to be filled with lists. We have to-do lists, shopping lists, recipes, checklists, instructions, and much, much more. One day, I realized that some of these lists are deceiving. Some of these lists are actually graphs hiding in list’s clothing; these lists are topological orderings of directed acyclic graphs. Two of the more obvious cases of this are to-do lists and recipes. You can’t send a letter that you haven’t written yet, nor can you cook spaghetti if you haven’t boiled a pot of water.

Once the sneaky graphs were discovered, I spent some time thinking about how to benefit from treating various lists like the DAGs that they secretly are. The clearest analogies between these lists and DAGs were to-do lists and recipes. These very closely resembled dependency graphs. Unlike in their list representation, these dependency graphs show which vertices (individual tasks) can be executed concurrently. Sometimes this is obvious (e.g: I can chop vegetables while I wait for the pot of water to boil! I can begin preparing a second baking sheet of cookies while the first batch is in the oven!), but I had my hopes that the act of reformulating lists as dependency graphs could expose more opportunities for concurrency.

            Boil water -- Add pasta -- Cook pasta --.  
                                                     \\  





Purée tomatoes --. \\ \Chop vegetables -- Combine in saucepan -- Simmer sauce -- Combine pasta and sauce -- Serve/Add spices --'

Edges between vertices in the above graph express a partial ordering. Vertices with no path between them can be executed concurrently. The relationship between vertices in the graph reflects the truth in the kitchen. We can cook the pasta while our sauce is simmering. It doesn’t matter whether we chop vegetables or purée tomatoes first; they both need to be done if we’re going to make any sauce.

Despite my best efforts, I am still a disaster in the kitchen. As a means of changing this inconvenient fact, I thought it would be fun to try representing recipes as directed graphs, with extra data like how long each step can be expected to take. This sat in my list of ‘someday projects’ for a long time, and I was only reminded of it when I begin to think about a project I was working on at $JOB.

Practical

One of the projects I accomplished last year at Adgear was removing an expensive computation that was taking place on each of our already maxed-out edge servers, and replaced it with a system that performed the expensive computation on a single machine, then distributed the result. The system worked when nobody touched it, but it was the programming equivalent of twigs and duct tape; cron jobs and shell scripts. This system continued to be painful to use, and more examples of expensive computations being done on CPU-starved machines were cropping up. At this point, it made sense start writing a more robust system.

These expensive computations decomposed nicely into lists of steps to be performed. Fetch this data, transform it, transform it some more, send some of the data to this group of servers, send this other data to some other group of servers. Shell scripts are pretty good at encoding these pipelines, so it was an acceptable choice at the time of implementation. After a while, it dawned on me that these lists of steps were not really lists; they were dependency graphs. I tried to expose the latent concurrency in the shell scripts by spicing them up with some background jobs and waitpid , but it was decided that it would make more sense to switch to Erlang and benefit from all that OTP has to offer.

Let’s Get ‘Wrek’ ed

(I’m sorry. I couldn’t resist.)

Wrek accepts dependency graphs like the above spaghetti recipe as input, and executes each vertex. The nature of concurrently executing actions as soon as they are able to execute is generic to any problem that can be represented by a dependency graph. The structure of a dependency graph, as well as the tasks involved in each of the graph’s vertices are specific to the user’s wishes. Following the same generic/specific split as OTP; the generic portion of this class of problems is solved by wrek and wrek_vert modules. The specific portion is provided by the user in the form of an Erlang map describing each vertex in the graph, and a set of callback modules that implement the wrek_vert behaviour.

The wrek_vert behaviour consists of a single callback run/2, where the first argument is a list of arguments to be sent to the callback function, and the second argument is the ID of a process which can provide information generated by other vertices. The expected result of this callback function is either

{ok, Any :: any()} or {error, Reason :: any()}. If the callback function succeeds, Any will be taken by wrek and made available to other wrek_vert processes. If the callback function crashes or returns an error, the entire graph will be shut down.

Making Erlang Pasta

Following the contrived example above, let’s set about using wrek to make pasta. Of course, our program won’t really make pasta, but it’s output ought to fool somebody.

Looking at the above graph, each step seems to do one of three things:

  1. Add an ingredient
  2. Combine ingredients in a vessel
  3. Do something to ingredients in a vessel

Let’s go ahead write some wrek_verts for each of those actions. If you don’t feel like following along in a text editor, the full code can be found here.

-module(cook_add).


-behaviour(wrek_vert).-export([run/2]).



run([Ingredient, Quantity], _Pid) ->io:format(“adding ~s. amount: ~s.~n”, [Ingredient, Quantity]),{ok, #{added => [{Ingredient, Quantity}]}}.

That’s all for cook_add. It prints a message, then produces a map with a key added whose value is a proplist with a single pair.

-module(cook_heat).


-behaviour(wrek_vert).-export([run/2]).



run([Verb, Noun], _Pid) ->io:format("~ping ~p.~n", [Verb, Noun]),{ok, #{}}.

cook_heat is also pretty short. It’s also very abstract. It could be used to print a message about Verbing any Noun, not just cooking ingredients!

Our final callback module is a little longer, because it does a little bit more than printing a message.

-module(cook_combine).


-behaviour(wrek_vert).-export([run/2]).









run([Ingredients, Vessel], Pid) ->Fun = fun(Step, Acc) ->Stuff = wrek_vert:get(Pid, Step, added),io:format("combining ~p with ~p in ~p.~n", [Stuff, Acc, Vessel]),Stuff ++ Accend,Stuff = lists:foldl(Fun, [], Ingredients),io:format("~p now contains: ~p.~n", [Vessel, Stuff]),{ok, #{added => Stuff}}.

Ingredients is expected to be a list of vertex names. We finally use our parent process’s Pid as an argument to wrek_vert:get/3. This lets us consume the data produced by the cook_add callback module. After combining everything, we returns a new collection of ingredients.

Alright! We’re nearly done describing the specific portion of our problem! The last step is to represent our dependency graph in terms of these callback modules and the arguments we want to pass to them.

-module(wrek_example).

-export([make_pasta/0]).


















































make_pasta() ->application:ensure_all_started(wrek),Graph = #{tomatoes => #{module => cook_add,args => ["pureed tomatoes", "1 can"],deps => []},vegetables => #{module => cook_add,args => ["chopped vegetables", "lots"],deps => []},spices => #{module => cook_add,args => ["spices", "to taste"],deps => []},saucepan => #{module => cook_combine,args => [[tomatoes, vegetables, spices], saucepan],deps => [tomatoes, vegetables, spices]},simmer_sauce => #{module => cook_heat,args => [simmer, sauce],deps => [saucepan]},boil_water => #{module => cook_heat,args => [boil, water],deps => []},add_pasta => #{module => cook_add,args => ["pasta", "1 handful"],deps => [boil_water]},cook_pasta => #{module => cook_heat,args => [cook, pasta],deps => [add_pasta]},mix_pasta_with_sauce => #{module => cook_combine,args => [[saucepan, add_pasta], saucepan],deps => [simmer_sauce, cook_pasta]}},wrek:start(Graph).

That’s an eyeful! What we’ve done here is created an Erlang map whose keys represent the names of vertices in our original dependency graph, and whose values are maps that specify a callback module, arguments to pass to the callback module, as well as any dependencies the vertex may have. I congratulate those of you who noticed that we are never straining the pasta; I did confess to being a disaster in the kitchen. I promise I learned my lesson.

Alright, we’re done coding! Let’s start up a shell and make some pasta!

Read the rest of this blog post on codesync.global

Code BEAM SF 2018: 15–16 March 2018

Join Richard Kallos, developer at AdGear, at CodeBEAM SF where he will be giving a talk on wrek.

Originally published at codesync.global.