In C++, nothing prevents the programmer from including a header-file multiple times. This can cause a duplication of definitions, which is an error. Since it is difficult to ensure that a header-file is only included once, a common strategy is to make only the first include count. This can be done using an “include guard”, a small piece of preprocessor logic that looks like this:
How Does it Work?
On the first include, HEADER_HAS_BEEN_INCLUDED is not defined, so we define foo. On subsequent includes, HEADER_HAS_BEEN_INCLUDED has been defined, so we just skip the content.
For example, if we have this C++ file:
Then it will expand to this:
And after the preprocessor has finished, we are left with this:
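To make the mechanism concrete, here is a minimal sketch of a toy preprocessor written in Python. It is not how a real compiler works: it understands only #include "...", #ifndef, #define and #endif, and the file names and contents (foo.hpp, main.cpp) are made up for illustration. It is just enough to show the second include of a guarded header being skipped.

```python
def preprocess(name, files, defined=None):
    """Expand a toy subset of the C preprocessor.

    `files` maps file names to their text. `defined` is the set of macro
    names seen so far, shared across nested includes.
    """
    if defined is None:
        defined = set()
    out = []
    skip_depth = 0  # > 0 while inside an #ifndef block that is being skipped
    for line in files[name].splitlines():
        words = line.split()
        directive = words[0] if words else ""
        if directive == "#ifndef":
            # Start skipping if the macro is already defined
            # (or we are already inside a skipped block).
            if skip_depth or words[1] in defined:
                skip_depth += 1
        elif directive == "#endif":
            if skip_depth:
                skip_depth -= 1
        elif skip_depth:
            continue  # inside a skipped block: drop the line
        elif directive == "#define":
            defined.add(words[1])
        elif directive == "#include":
            out.extend(preprocess(words[1].strip('"'), files, defined))
        else:
            out.append(line)
    return out


# Hypothetical files: a guarded header that is included twice.
files = {
    "foo.hpp": (
        "#ifndef HEADER_HAS_BEEN_INCLUDED\n"
        "#define HEADER_HAS_BEEN_INCLUDED\n"
        "int foo() { return 1; }\n"
        "#endif"
    ),
    "main.cpp": (
        '#include "foo.hpp"\n'
        '#include "foo.hpp"\n'
        "int main() { return foo(); }"
    ),
}

result = preprocess("main.cpp", files)
```

Here result contains the definition of foo exactly once, followed by main, even though the header was included twice.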
This is the idiomatic approach, but it has some limitations:
- Three lines of boilerplate code are required
- The macro name in the #ifndef and #define lines must match exactly
- The same macro name must not be used in multiple files
- We have to remember the #endif, which is located at the opposite end of the file from the #ifndef
What About pragma once?
#pragma once was designed to overcome these issues. It is a non-standard, but widely supported, feature of C++ compilers. The concept is simple: any file containing #pragma once will only actually be included once, even if the programmer includes it multiple times.
Using #pragma once, our example becomes:
Looks good, right? Sadly, #pragma once brings a host of problems.
The root cause is that #pragma once is concerned with where some code lives, rather than its content. If you have two copies of the same file accessible via multiple paths, then it will get included twice. And, if you have two paths that appear different, but are actually the same, then the compiler may not spot this. To top things off, it is not standard, so compiler implementations do not have to respect its semantics.
A Possible Workaround
The problems with #pragma once stem from the fact that it works from a file’s location, rather than its content. What if we just used the content instead? (Of course, recording the full contents of each header would be slow, but we can optimize by recording a hash of the content instead.)
The process would be:
1. When a header-file is included, hash it
2. If the hash has been seen before, then ignore the include
3. Otherwise, include the header as normal
This would be a robust solution because it is not at all concerned about the path a file is found at, only its content.
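The difference between the two strategies can be sketched in a few lines of Python. This is only a model: the path-keyed function is a deliberate simplification of what compilers do for #pragma once, and the paths and header contents are invented for the example.

```python
import hashlib


def include_once_by_path(includes):
    """Keep an include only if its path has not been seen before."""
    seen, kept = set(), []
    for path, _text in includes:
        if path in seen:
            continue
        seen.add(path)
        kept.append(path)
    return kept


def include_once_by_content(includes):
    """Keep an include only if the hash of its content has not been seen."""
    seen, kept = set(), []
    for path, text in includes:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # step 2: hash seen before, ignore the include
        seen.add(digest)
        kept.append(path)  # step 3: include the header as normal
    return kept


# Hypothetical include sequence: the same header is vendored in two places.
ADD = "int add(int a, int b);\n"
SUB = "int sub(int a, int b);\n"
includes = [
    ("vendor/a/add.hpp", ADD),
    ("vendor/b/add.hpp", ADD),  # identical copy under a different path
    ("sub.hpp", SUB),
    ("vendor/a/add.hpp", ADD),
]

by_path = include_once_by_path(includes)
by_content = include_once_by_content(includes)
```

The path-keyed version lets the second copy of add.hpp through, while the content-keyed version includes it only once.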
Implementing the Workaround
Adding a new command to the C++ standard would take considerable time, but luckily we can implement this logic using a script and the preprocessor.
The basic idea is this:
So, for example this header:
Has a SHA-256 hash of:
So the generated header might be:
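A minimal sketch of such a transformation in Python might look like the following. The function name add_content_guard and the GUARD_ prefix are assumptions made for this example, not necessarily what the original script uses; the prefix is only there so that the macro name starts with a letter.

```python
import hashlib


def add_content_guard(source):
    """Wrap header text in an include guard named after its SHA-256 hash."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    macro = "GUARD_" + digest.upper()  # assumed naming scheme
    return "#ifndef {0}\n#define {0}\n{1}\n#endif  // {0}\n".format(
        macro, source.rstrip("\n")
    )


# Hypothetical header content.
header = "int add(int a, int b);\n"
guarded = add_content_guard(header)
```

Two files with identical content get the same macro name, so whichever is included second is skipped, no matter which path it was found at. A real script would read the header path from the command line and print the result to standard output, so that a build rule can redirect it into the generated file.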
Whilst the transformation for an individual file is simple (a short Python script), we still need to manage the transformation process. We need to ensure that:
- The transformation is run for every file
- New files are automatically transformed
- The transformations of deleted files are automatically removed
- The transformation is only re-run when the file has changed
- Bonus: Transformations can be safely put into a shared network cache
Using Buck build, we can encode this logic into a project’s build script easily.
Let’s start with a build-rule for a single file and then generalize:
genrule in Buck is much like a target in Make. We define the input files, the output file name and the command to execute. This target takes our Python script for generating an include guard and runs it on add.hpp. Unlike Make, Buck will isolate the process and cache its output, keyed on the hashes of its inputs.
Now that we have a single file working, we can generalize the process to n files. To do this, we make a Python function that creates a genrule for a given file:
To get the set of header files, we run a glob expression. For example:
And to bring it all together:
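As a rough sketch of the overall shape, the helper below builds the arguments for one genrule per header. It is written as plain Python so that it can run outside Buck; the script name include_guard.py, the rule-naming scheme and the header paths are all hypothetical. The keys name, srcs, out and cmd are genrule parameters, and $SRCDIR and $OUT are variables that Buck makes available to the command.

```python
import posixpath

GUARD_SCRIPT = "include_guard.py"  # hypothetical name of the guard script


def guarded_header_rule(header):
    """Return keyword arguments for a genrule that guards one header."""
    rule_name = "guarded-" + header.replace("/", "-").replace(".", "-")
    return {
        "name": rule_name,
        "srcs": [GUARD_SCRIPT, header],
        "out": posixpath.basename(header),
        # Run the script on the header and capture its output as the result.
        "cmd": "python $SRCDIR/{} $SRCDIR/{} > $OUT".format(GUARD_SCRIPT, header),
    }


def guarded_header_rules(headers):
    """One rule per header, in a stable order."""
    return [guarded_header_rule(h) for h in sorted(headers)]


# Hypothetical header list; in a BUCK file this would come from glob().
rules = guarded_header_rules(["include/sub.hpp", "include/add.hpp"])
```

Inside a BUCK file, each dictionary would be passed to genrule(**kwargs), with the header list coming from a glob expression rather than being written out by hand.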
You can find a complete working example on GitHub.
Now, our header files can be written without include guards or #pragma once.
This setup in Buck is really nice to work with:
- Zero boilerplate in the header files
- Buck will automatically check for new header files, so that builds are always up-to-date
- Buck will remove stale generated headers
- Because it understands the target graph, Buck will generate headers in parallel
- Buck will cache generated headers so that they are only computed when required
- We are no longer relying on human accuracy (include guards) or non-standard features (#pragma once)
Since You’re Here…
We created Buckaroo to make it easier to integrate C++ libraries. If you would like to try it out, the best place to start is the documentation. You can browse the existing packages on Buckaroo.pm or request more over on the wishlist.