How would you change the behavior of a software project based on a set of parameters?
You could use environment variables. But what if you want complex structures in the parameters?
Can you use JSON? Yes, you can. But JSON files don't allow comments, so how would you describe your parameters to readers?
The answer is to use a TOML configuration file.
Config files are a great way to pull project parameters out of the code. When you share your code with others, they know exactly where to make changes to tweak the behavior of the software.
For instance, say yours is a website codebase that supports several themes. Others who set up your code can go to the config file and change the theme variable to a different one. It's more convenient than going to the codebase and editing the theme in various files.
But to let your users know what themes are available, you need comments in the config file. You can say your website supports the 'dark', 'light', 'pop', and 'grayscale' themes. JSON files fall short, as they don't allow comments.
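The theme scenario above can be sketched as follows. This is only an illustration: the key name `theme` and the `SUPPORTED_THEMES` set are assumptions, and the parsing uses the tomllib module that this post introduces later.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

# Hypothetical config file contents; the key name "theme" is an assumption.
config_text = """
# Available themes: "dark", "light", "pop", "grayscale"
theme = "dark"
"""

SUPPORTED_THEMES = {"dark", "light", "pop", "grayscale"}

config = tomllib.loads(config_text)
theme = config["theme"]

# Fail early if someone sets a theme the site does not support.
if theme not in SUPPORTED_THEMES:
    raise ValueError(f"Unsupported theme: {theme!r}")

print(theme)
```

The comment travels with the file, so whoever edits the theme sees the valid options right next to the value.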
It's a misconception that environment files are config files. They are not.
While you can also store configuration values in an environment file, its purpose is to keep secrets out of the shared codebase.
Say you have database credentials and API keys. Publishing your code with these values to a public hosting service like GitHub can be harmful. Hence, we use environment variables to keep them separate.
Config files are for values that you can happily share with other developers. The themes option we discussed is a good example.
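One way to picture this split is the sketch below. The variable name `DB_PASSWORD`, the `[site]` table, and its keys are all hypothetical; the point is only that secrets come from the environment while shareable values come from TOML.

```python
import os

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

# Shareable settings: safe to commit. Table and key names are illustrative.
shareable_config = tomllib.loads("""
[site]
theme = "light"
items_per_page = 20
""")

# Secret: read from the environment, never written into the TOML file.
# DB_PASSWORD is a hypothetical variable name.
db_password = os.environ.get("DB_PASSWORD")

settings = {
    "theme": shareable_config["site"]["theme"],
    "items_per_page": shareable_config["site"]["items_per_page"],
    # Only record whether the secret is present; never print the secret itself.
    "db_password_set": db_password is not None,
}
print(settings)
```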
TOML is an excellent option because it's straightforward and many popular platforms accept it.
For instance, the popular web hosting and deployment platform Netlify uses a TOML file (netlify.toml) to define build configurations. We can configure the Node.js environment, the build command, the output directory, etc., in this config file.
Another good example is GitLab Runner, which is part of the GitLab CI/CD pipeline. It runs the jobs in a pipeline during continuous integration and deployment. How do you configure it? You use a TOML file.
In a TOML file, you can tell the runner how many concurrent jobs to execute, the log level, listening port, and many other options.
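As a rough illustration, the snippet below parses a small runner-style configuration in Python. The key names `concurrent`, `log_level`, and `listen_address` reflect my understanding of GitLab Runner's global settings, and the values are made up, so check the Runner documentation before relying on them.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

# Illustrative values only; verify key names against the GitLab Runner docs.
runner_config_text = """
concurrent = 4                     # how many jobs may run at the same time
log_level = "info"                 # verbosity of the runner's logs
listen_address = "localhost:9252"  # address and port to listen on
"""

runner_config = tomllib.loads(runner_config_text)

print(runner_config["concurrent"])
print(runner_config["log_level"])
print(runner_config["listen_address"])
```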
TOML files also have syntax highlighting support in many code editors. For instance, in VS Code, I've installed the "Even Better TOML" extension.
A TOML file is a great feature of a modern Python project structure.
TOML (Tom's Obvious Minimal Language) is a configuration file format that is easy to read and write. It's minimal, and even people with no programming experience can easily understand it.
TOML supports many data structures, such as key-value pairs, arrays, and tables.
Parser libraries can convert them to native data structures in different programming languages.
Most mainstream programming languages can parse it. As of this post, over 40 languages have TOML parsers. This list includes JavaScript, Java, C#, PHP, and C++.
This post focuses on TOML file usage in Python.
A TOML file can live anywhere in a project. The ideal location is the project's root folder because we want to edit the configuration for the project. If we only care about a module, we can move it to a subfolder.
TOML files use the .toml extension. Here's an example config file.
# This is a TOML document
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00
[database]
enabled = true
ports = [ 8000, 8001, 8002 ]
data = [ ["delta", "phi"], [3.14] ]
temp_targets = { cpu = 79.5, case = 72.0 }
[servers]
[servers.alpha]
ip = "10.0.0.1"
role = "frontend"
[servers.beta]
ip = "10.0.0.2"
role = "backend"
There's a lot to learn from this example.
First, the file starts with a comment. You can create a comment anywhere in the config file by adding a # sign.
Next is value assignment. It's pretty straightforward, as you can see. You can name a key and assign a value to it using the = sign.
The values can be of many types. In the example, title is assigned a string, while dob is a date-time with a UTC offset. The ports key is an array, data is an array of arrays, and temp_targets is an inline table holding another set of key-value pairs.
An important aspect of a TOML file is the headers ([owner], [database], etc.). In TOML, these headers define tables. A table is a hash table: a collection of key-value pairs.
Parser libraries will use this information to create high-level keys in their native types. More on this later.
Tables can also be nested. As you can see in the above example, alpha and beta are two tables created inside the servers table.
Python parser libraries will convert these into nested dictionaries.
TOML supports many other data types that parser libraries convert to native varieties of the programming language.
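To make the type mapping concrete, here is a small sketch with a few types the main example doesn't cover. The key names are invented for illustration; the parser is the tomllib module described in the next section.

```python
import datetime

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

# All key names below are made up for this example.
types_text = """
count = 42                  # integer
ratio = 0.75                # float
enabled = false             # boolean
release_date = 2022-10-24   # local date (no time, no offset)
backup_time = 02:30:00      # local time
description = '''
Multi-line
string'''
"""

parsed = tomllib.loads(types_text)

# Show which Python type each TOML value was converted to.
for key, value in parsed.items():
    print(key, type(value).__name__, repr(value))
```

Dates and times arrive as objects from Python's datetime module, so there is no need to parse them by hand.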
We're going to use a Python TOML parser library called tomli.
Python 3.11 adds a standard library module called tomllib. This module parses TOML from a string or from a file opened in binary mode.
Yet, for all other Python versions (3.6+), we can use an external package called tomli. We can install tomli from the PyPI repository as follows.
pip install tomli
First, we need to import the module. Thankfully, both the standard module in 3.11 and the external package follow the same API. Thus we can import them conditionally as follows:
try:
    import tomllib  # standard library module in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API
Now, we can convert any TOML string into Python native variables.
toml_str = """
# This is a TOML document
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
dob = 1979-05-27T07:32:00-08:00
[database]
enabled = true
ports = [ 8000, 8001, 8002 ]
data = [ ["delta", "phi"], [3.14] ]
temp_targets = { cpu = 79.5, case = 72.0 }
[servers]
[servers.alpha]
ip = "10.0.0.1"
role = "frontend"
[servers.beta]
ip = "10.0.0.2"
role = "backend"
"""
data = tomllib.loads(toml_str)
data
# {'title': 'TOML Example', 'owner': {'name': 'Tom Preston-Werner', 'dob': datetime.datetime(1979, 5, 27, 7, 32, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))}, 'database': {'enabled': True, 'ports': [8000, 8001, 8002], 'data': [['delta', 'phi'], [3.14]], 'temp_targets': {'cpu': 79.5, 'case': 72.0}}, 'servers': {'alpha': {'ip': '10.0.0.1', 'role': 'frontend'}, 'beta': {'ip': '10.0.0.2', 'role': 'backend'}}}
In the converted example, we can see our tables (the headers) become top-level keys. Sub-tables are converted to nested dictionaries.
We can also see the parser library did an amazing job converting TOML configurations to native data types.
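Because the result is an ordinary dictionary, reading a setting is plain indexing. The sketch below reuses part of the example config; the `logging` table it looks up is hypothetical and deliberately absent, to show one way of falling back to a default.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

config = tomllib.loads("""
[database]
ports = [ 8000, 8001, 8002 ]

[servers.alpha]
ip = "10.0.0.1"
role = "frontend"

[servers.beta]
ip = "10.0.0.2"
role = "backend"
""")

# Nested tables are nested dictionaries; arrays are lists.
alpha_ip = config["servers"]["alpha"]["ip"]
first_port = config["database"]["ports"][0]

# A missing optional table ("logging" is hypothetical) falls back to a default.
log_level = config.get("logging", {}).get("level", "info")

# Iterate over sub-tables like any other dictionary.
backend_servers = [
    name for name, server in config["servers"].items()
    if server["role"] == "backend"
]

print(alpha_ip, first_port, log_level, backend_servers)
```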
We can also read TOML from a file. We need to open it as a binary file and load it with tomllib.
import tomli as tomllib
with open("./config.toml", "rb") as f:
    data = tomllib.load(f)
data
# {'title': 'TOML Example', 'owner': {'name': 'Tom Preston-Werner', 'dob': datetime.datetime(1979, 5, 27, 7, 32, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))}, 'database': {'enabled': True, 'ports': [8000, 8001, 8002], 'data': [['delta', 'phi'], [3.14]], 'temp_targets': {'cpu': 79.5, 'case': 72.0}}, 'servers': {'alpha': {'ip': '10.0.0.1', 'role': 'frontend'}, 'beta': {'ip': '10.0.0.2', 'role': 'backend'}}}
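In practice, a config file may contain a typo. The sketch below wraps the file read in a small helper, `load_config`, which is a hypothetical name and not part of tomllib. It writes two temporary files so the example runs on its own, then shows how a syntax error surfaces as `tomllib.TOMLDecodeError`.

```python
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API


def load_config(path):
    """Read a TOML file and return a dict; name the file if it is invalid."""
    with open(path, "rb") as f:  # binary mode, as the parser expects
        try:
            return tomllib.load(f)
        except tomllib.TOMLDecodeError as exc:
            raise ValueError(f"Invalid TOML in {path}: {exc}") from exc


with tempfile.TemporaryDirectory() as tmp_dir:
    # A valid file.
    good_path = Path(tmp_dir) / "config.toml"
    good_path.write_text('title = "TOML Example"\n', encoding="utf-8")
    good_config = load_config(good_path)

    # A broken file: the key has no value.
    bad_path = Path(tmp_dir) / "broken.toml"
    bad_path.write_text("title = \n", encoding="utf-8")
    try:
        load_config(bad_path)
        bad_error = None
    except ValueError as exc:
        bad_error = str(exc)

print(good_config)
print(bad_error)
```

Including the file path in the error message is a design choice: the parser's own message reports the position of the problem, but not which file it came from.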
YAML is another widely used, feature-rich config file format. Both file formats have pretty straightforward syntax. Yet, the notable difference in YAML is the significance of whitespace.
In YAML, you specify a list by typing the list items in separate lines with an indentation. You can create nested lists with more indentation.
This can confuse a non-technical reader.
In TOML, however, indentation plays no role. You create lists with square brackets.
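A quick way to convince yourself is to parse the same settings twice, once flat and once with arbitrary indentation. This is a minimal sketch using the tomllib import pattern from earlier; the values are borrowed from the example config.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # external package with the same API

flat = tomllib.loads("""
ports = [8000, 8001, 8002]
[servers.alpha]
ip = "10.0.0.1"
""")

# Same content with irregular indentation and the array split across lines.
indented = tomllib.loads("""
ports = [
    8000,
        8001,
  8002,
]
[servers.alpha]
    ip = "10.0.0.1"
""")

# Both documents produce identical dictionaries.
same = flat == indented
print(same)
```

Indentation is therefore free to use for readability, without any risk of changing what the file means.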
If you think your config file users are somewhat technical, you can choose one based on other factors. For instance, you can think of the platform on which you will deploy your project. If it's Netlify or GitLab Runner, you can choose a TOML file.
Abstracting config parameters out transforms a project from good to great. Whenever you need a new deployment, you change the configurations in one place, and your app responds.
A configuration file is what you need to do this. This post introduced TOML files and why they are fantastic. You could also use the environment file, a JSON file, or any text document. But TOML files have significant advantages over these other formats.
We've discussed using the Python library tomli to read from TOML files. Going forward, we might not need an external library to work with TOML. Python 3.11 ships with a standard module called tomllib.
Lastly, we discussed how TOML compares with YAML. YAML is also a popular format for storing configurations. While YAML can suit complex configuration options, TOML is more beginner-friendly.