Just use a programming language.

(Btw, I recently released Vidrio for Mac, Tony Stark’s presentation app. See https://vidr.io for more.)

Every configuration file introduces a morass of unknown syntax, unknown semantics, poor debuggability, poor documentation, poor maintainability, insufficient abstraction, and insufficient generalization.

But there are multiple alternatives to the configuration file, and they might work better. Compilation options move configuration from runtime to compile time. ‘Librification’ is a more general version of compilation flags. Runtime ‘configuration programs’ are a more general version of run-time configuration files.

What is a configuration file? Conceptually, it’s an argument to a generalized service, which yields a specific program useful to you. For example:

    apache : Configuration -> HttpServer

Apache isn’t an HTTP server; it’s a function that yields HTTP servers.

Let’s say we own a website that sells magic beans. The configuration file might sit at /etc/apache2/magicbeans.conf and have content like

    Listen 80
    NameVirtualHost *:80

    <VirtualHost *:80>
    DocumentRoot /var/www/magicbeans
    ServerName www.magicbeans.com
    …
    </VirtualHost>

When we want to launch our site, we would then run something like apachectl -f /etc/apache2/magicbeans.conf. Apache would start, read the configuration file, and see that it’s meant to serve files from /var/www/magicbeans when it receives requests for resources at http://www.magicbeans.com/.

The Apache project chose to represent the Configuration argument as a set of files, the apache function as a start-up phase which is given the path to those files to parse and compile, and the yielded HttpServer as the main phase of the program, which the start-up phase transitions to.

But in computing, we have many ways to represent functions and arguments, and this ‘start-up configuration file’ scheme is just one of them.
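One way to read the signature apache : Configuration -> HttpServer is as ordinary code. The following is a minimal sketch in C, assuming hypothetical Configuration and HttpServer structs and a helper http_resolve that are invented here for illustration; none of it is Apache’s real API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration only; not Apache's real API. */
typedef struct {
    int port;
    const char *server_name;   /* e.g. "www.magicbeans.com" */
    const char *document_root; /* e.g. "/var/www/magicbeans" */
} Configuration;

/* The "specific program": here just a value that remembers what to serve. */
typedef struct {
    Configuration conf;
} HttpServer;

/* apache : Configuration -> HttpServer */
HttpServer apache(Configuration conf) {
    assert(conf.server_name != NULL && conf.document_root != NULL);
    HttpServer server;
    server.conf = conf;
    return server;
}

/* Map a request (host plus URL path) to a file path under the document root.
   Returns 0 on success, -1 if the host is not served or the buffer is too small. */
int http_resolve(const HttpServer *server, const char *host,
                 const char *url_path, char *out, size_t out_size) {
    if (strcmp(host, server->conf.server_name) != 0) {
        return -1;
    }
    int n = snprintf(out, out_size, "%s%s", server->conf.document_root, url_path);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Calling apache with two different Configuration values gives two independent HttpServer values, which is the sense in which the service is a function rather than a single server.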
I will outline the alternatives and suggest that they can be simpler and more powerful.

1. Compile-time configuration

Apache also has many compile-time options. These are set using a script called ./configure. These options overlap with the options that can be set using configuration files at run-time.

In this scheme, the Configuration is represented as arguments to a ./configure command, the function is the make command, and the yielded HttpServer is the compiled binary.

Theoretically, it should be possible to move all Apache .conf files to the compile phase, so they are passed to the build system, which yields a binary which starts up immediately in the ‘main phase’ as an HTTP server. Rather than running apachectl -f /etc/apache2/magicbeans.conf at run-time, we would run

    ./configure -f /etc/apache2/magicbeans.conf -o magicbeans && make && ./magicbeans

One advantage of moving from run-time to compile-time configuration is simplicity: where the distinction between multiple configuration phases was a bit arbitrary, you now have just a single configuration phase.

Another advantage is that the run-time binary does not have to carry around baggage for its configuration. Where the apache binary on every application server used to carry around logic for parsing configuration files, compiling configuration files, loading and unloading modules, etc., you now just have a single magicbeans binary which does exactly what you want and no more.

This also means that configuring your application server is easier. Where you once had to ensure the presence of all your necessary .conf files on every application server, and that you restart Apache every time these change, you only have to ensure that the magicbeans binary is present.

This shift from run-time to compile-time configuration also has a benefit similar to static typing. It’s better to catch errors in your configuration files at compile time than wait until you’re deploying.

We could go further.
If you squint, your static files under /var/www are more configuration files. The file /var/www/index.htm is a configuration rule that says ‘when you get a request for /index.htm or for /, serve this content’. This rule could be bundled into your binary at compile time, too.

With all these configuration options, the question is how ‘static’ you want things to be: bundling your static content into the binary makes it more static, in the sense that you can no longer hot-swap that file without restarting the application.

So why does Apache lean towards run-time configuration files? I suspect it’s historical. In times past, our application servers were multi-user UNIX systems running many different websites, where separation of users’ configuration was critical, and ‘graceful’ restarting of Apache was frequent. Nowadays, we run servers dedicated to single applications, where these concerns are less important.

2. Librification

Apache is better viewed not as a program but as a library. The file apache2.conf is its API.

This view reveals another approach to configuration. Instead of Apache being a program which reads your configuration file and then transmutes into your end-user program, Apache could be a function that you call in your end-user program. What I am suggesting is that you would instead write /etc/mywebserver.c as

    /* apache2.h and the apache2_* names are the API proposed here,
       not an interface that Apache provides. */
    #include <apache2.h>

    int main(void) {
        Apache2Config* conf = apache2_defaults();

        Apache2VHost* vhost1 = apache2_vhost();
        vhost1->address = "*";
        vhost1->port = "80";
        vhost1->document_root = "/www/example1";
        vhost1->server_name = "www.example.com";
        apache2_add_vhost(conf, vhost1);

        return apache2(conf);
    }

and then use gcc to compile it, link it against Apache, and run it. Or rather, your /etc/init.d/apache2 compiles the configuration when you start the service.

Substitute for Apache whatever service you wish to use, and substitute for C whatever language is used to implement that service.
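The apache2.h listing is a proposal, so it cannot be compiled as it stands. As a rough, self-contained sketch of the same shape, here is a toy library with made-up names (ToyConfig, ToyVHost, toy_defaults, toy_add_vhost, toy_document_root, my_web_server_config). It only looks up a document root for a host name and does not serve HTTP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_VHOSTS 8

/* One virtual host, with the same fields as the vhost in the listing above. */
typedef struct {
    const char *address;
    const char *port;
    const char *document_root;
    const char *server_name;
} ToyVHost;

/* The whole configuration is a plain value that the user's program builds. */
typedef struct {
    ToyVHost vhosts[TOY_MAX_VHOSTS];
    size_t vhost_count;
} ToyConfig;

/* Start from an empty configuration. */
ToyConfig toy_defaults(void) {
    ToyConfig conf;
    memset(&conf, 0, sizeof conf);
    return conf;
}

/* Append a virtual host; returns 0 on success, -1 if the table is full. */
int toy_add_vhost(ToyConfig *conf, ToyVHost vhost) {
    assert(conf != NULL);
    if (conf->vhost_count >= TOY_MAX_VHOSTS) {
        return -1;
    }
    conf->vhosts[conf->vhost_count++] = vhost;
    return 0;
}

/* Find the document root for a host name, or NULL if no vhost matches. */
const char *toy_document_root(const ToyConfig *conf, const char *host) {
    for (size_t i = 0; i < conf->vhost_count; i++) {
        if (strcmp(conf->vhosts[i].server_name, host) == 0) {
            return conf->vhosts[i].document_root;
        }
    }
    return NULL;
}

/* What used to be a .conf file is now an ordinary function. */
ToyConfig my_web_server_config(void) {
    ToyConfig conf = toy_defaults();
    ToyVHost vhost1 = { "*", "80", "/www/example1", "www.example.com" };
    toy_add_vhost(&conf, vhost1);
    return conf;
}
```

A main function would call my_web_server_config() and hand the result to whatever entry point the library exposes, in the way the listing above passes conf to apache2.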
If using a Haskell web server library, you could instead write:

    module MyWebServer where

    import HttpServer

    main = http [defaultVHost { address = "*", port = "80", documentRoot = "/www/example1" }]

I’m talking about ‘compiled languages’, but of course the same applies to ‘interpreted languages’ (which just bundle the ‘compile’ step into the execution phase).

3. Configuration program

3.1. Prolog

Gerrit Code Review takes a really interesting approach to configuration. After a change request is submitted for review, Gerrit needs to know whether that change is allowed to be merged. Does the change need to be reviewed? Does the change require tests to pass? How many +1s does the change require? Can someone review+ their own change? Does a -1 cancel out a +1, or count as a veto? Do the rules differ depending on what branch you are merging to? Do some users have privileged rights?

One way they could have allowed users to configure this is to try to split up all these questions into orthogonal parameters which administrators can provide values for. For example:

    [branch=master]
    numberOfRequiredPositiveReviews=2
    negativeReviewIsVeto=true
    requiresTestSuitePass=true
    …

… and so on.

They didn’t take this approach. Instead, they took a far simpler and more powerful approach. User-provided configuration is provided as a Prolog file, rules.pl. In this file, the user effectively defines a function from ‘facts about the change’ to a boolean that says whether it is submittable. Those facts, such as ‘this change is authored by James’ or ‘Richard says +1’, are provided by Gerrit to the Prolog program whenever it needs to know whether a change is submittable.

So far, that’s the same as if they were to have taken the ‘configuration program’ approach, with the user providing a callback in their program. But since this is Prolog, …

Advantages

You already know the syntax. You don’t have to learn the syntax rules for every new program you have to configure. And how many projects bother giving you a syntax definition for their config files? Next to none. There isn’t even a syntax definition for Apache config files. If you’re lucky, the program uses this month’s structured data syntax (XML, JSON, YAML, TOML, …), all of which are obviously isomorphic.

No programs bloated with parsers. That little program that uses XML for its configuration file probably spends 99% of its program size on a bundled XML parser. The UNIX philosophy is that each program should do one thing, and do it well. The job of Apache is to parse HTTP requests, not configuration files.

More obvious semantics. Or at least, more easily defined, and more guessable. You don’t have to use trial and error to figure out what the scope of that variable is. You know the scoping rules for the programming language you’re using.

Debuggability. If you want to find out if your addition to that configuration is being read by the program, just set a breakpoint.

Tooling. It’s a lot easier to use a standard documentation generator for your API than to write custom documentation for your ad hoc configuration file format.

Manageable configuration. The latest fashion in ‘devops’ is to use Puppet with fragile search-and-replace rules and templates to ensure the configuration files on your application servers are correct. Why? Just use a real programming language to construct the configuration data, compile the exact program you want, and put it on the application server.

Compositionality. Want to run two different MySQL instances? No problem: write two programs and run them. By the way, those files under /etc/mysql are global variables. I won’t repeat the sermons on global variables that you’ve heard already.

Abstraction. The power of a general-purpose programming language. If you want to define 1,000 near-identical virtual hosts, use a loop. If you want the program to behave differently on Debian, use an if/else. The thing is, all configuration file languages suffer from expressiveness creep through the lifetime of the project until they become a terrible, ad hoc, inconsistent programming language of their own. It might start off as a key-value syntax. Then someone wants to use the same value in several places, so variables are introduced. Then someone wants to change a variable, so a notion of assignment is introduced. Then someone wants to split the configuration file over several files, so a notion of importation is introduced. Then someone notices that variables defined in imported files clobber those in the parent file, so some kind of variable scope and shadowing is introduced. Then someone wants a value to depend on which version of the software is being run, so a branching construct is introduced, and a boolean expression language over versions. Then people want to branch based on more general properties of the environment, so the expression language is expanded.

Generalized APIs. This is the most important advantage in my opinion, since this one really can’t be captured by a configuration file at all. Instead of giving me a million-and-one configuration options for URL rewriting with the limited power of regular expressions, just let me provide a function of type String -> String and be done with it. Instead of implementing lots of cute ways for me to define a predicate over filepaths, just let me provide a function of type String -> Bool, and be done with it. Configuration files are chock-a-block with configuration options which should just be replaced with higher-order functions.

Programs that have seen the light

Vagrant. The Vagrantfile used to configure it is just Ruby.

XMonad. The xmonad.hs file is just Haskell.

Shake. The build system is a Haskell library, not a program.

MirageOS. Mirage takes this idea to its logical conclusion, removing not just configuration and associated files, but the entire surrounding operating system.

… know of more? Let me know!

Logback.xml config: conditionals in XML, ugh

JSP???

My Perfect Build System (ofb.net): “This does work. But I don’t like it. The big problem is that these languages are too powerful. Didn’t I just say that I…”
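The ‘Generalized APIs’ point can be sketched with function pointers in C. This assumes a hypothetical server library that accepts two user-supplied hooks: a String -> String URL rewriter and a String -> Bool path predicate. The names ServerHooks, handle_request, add_index and is_dotfile are invented for this example, and the status codes stand in for whatever the real library would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* String -> String: write the rewritten URL into out. */
typedef void (*RewriteFn)(const char *url, char *out, size_t out_size);

/* String -> Bool: decide something about a file path. */
typedef bool (*PathPredicate)(const char *path);

/* The whole "configuration" for rewriting and access control is two functions. */
typedef struct {
    RewriteFn rewrite;
    PathPredicate is_forbidden;
} ServerHooks;

/* User code: requests for a directory are served its index.htm. */
void add_index(const char *url, char *out, size_t out_size) {
    size_t len = strlen(url);
    if (len > 0 && url[len - 1] == '/') {
        snprintf(out, out_size, "%sindex.htm", url);
    } else {
        snprintf(out, out_size, "%s", url);
    }
}

/* User code: refuse any file whose name starts with a dot. */
bool is_dotfile(const char *path) {
    const char *base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;
    return base[0] == '.';
}

/* Library code: apply the hooks and return an HTTP-like status (200 or 403). */
int handle_request(const ServerHooks *hooks, const char *url,
                   char *resolved, size_t resolved_size) {
    assert(hooks != NULL && hooks->rewrite != NULL && hooks->is_forbidden != NULL);
    hooks->rewrite(url, resolved, resolved_size);
    return hooks->is_forbidden(resolved) ? 403 : 200;
}
```

A caller fills in a ServerHooks value such as { add_index, is_dotfile } and passes it to handle_request. Swapping in different behaviour means writing a different function, not learning another configuration directive.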

Configuration files suck.
