In C++, nothing prevents the programmer from including a header-file multiple times. This can cause a duplication of definitions, which is an error. Since it is difficult to ensure that a header-file is only included once, a common strategy is to make only the first include count. This can be done using an “include guard”, a small piece of preprocessor logic that looks like this:
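A minimal sketch of such a guard, assuming the header's only content is a function named foo (the function body is made up for illustration):

```cpp
#ifndef HEADER_HAS_BEEN_INCLUDED
#define HEADER_HAS_BEEN_INCLUDED

// Illustrative content only; the body of foo is an assumption.
inline int foo() {
  return 42;
}

#endif
```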
On the first include, HEADER_HAS_BEEN_INCLUDED is not defined, so we define foo. On subsequent includes, HEADER_HAS_BEEN_INCLUDED has been defined, so we just skip the content.
For example, if we have this C++ file:
Then it will expand to this:
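As a sketch of what the expanded form could look like, assume the file includes a hypothetical guarded header twice and then calls foo from a function named use_foo (both names and bodies are illustrative). The preprocessor pastes the header in at each include, so the guard appears twice in one translation unit:

```cpp
// First #include of the hypothetical header: the macro is not yet defined,
// so the guarded content is kept.
#ifndef HEADER_HAS_BEEN_INCLUDED
#define HEADER_HAS_BEEN_INCLUDED
inline int foo() { return 42; }
#endif

// Second #include: the macro is now defined, so this copy is skipped.
#ifndef HEADER_HAS_BEEN_INCLUDED
#define HEADER_HAS_BEEN_INCLUDED
inline int foo() { return 42; }
#endif

// Code from the including file (assumed for illustration).
inline int use_foo() { return foo(); }
```

Because the second copy is skipped, foo is defined exactly once and the file compiles.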
And after the preprocessor has finished, we are left with this:
This is the idiomatic approach, but it has some limitations:
- The guard must be closed with an #endif, which is located at the other end of the file to the #ifndef
#pragma once was designed to overcome these issues. It is a non-standard, but widely supported, feature of C++ compilers. The concept is simple: any file containing #pragma once will only actually be included once, even if the programmer includes it multiple times.
Using #pragma once, our example becomes:
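A minimal sketch, assuming the same illustrative function foo as the header's only content:

```cpp
#pragma once

// Illustrative content only; the body of foo is an assumption.
inline int foo() {
  return 42;
}
```

The directive is meant to live in a header; some compilers warn if they see it in a file that is compiled directly.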
Looks good, right? Sadly, #pragma once brings a host of problems.
The root cause is that #pragma once is concerned with where some code lives, rather than its content. If you have two copies of the same file accessible via different paths, then the content will get included twice. And if you have two paths that appear different but actually refer to the same file, then the compiler may not spot this. To top things off, it is not standard, so compiler implementations do not have to respect its semantics.
The problems with #pragma once stem from the fact that it works off a file’s location, rather than its content. What if we just used the content instead? (Of course, recording the full contents of each header would be slow, but we can optimize by recording a hash of the content instead.)
The process would be:
1. When a header-file is included, hash it
2. If the hash has been seen before, then ignore the include
3. Otherwise, include the header as normal
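The three steps above can be sketched in a few lines of C++. This is only an illustration of the bookkeeping, not how any compiler implements it. The helper names hash_content and should_include are hypothetical, and FNV-1a is swapped in for a cryptographic hash so that the sketch needs nothing beyond the standard library:

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>

// FNV-1a (64-bit) stands in for a cryptographic hash such as SHA-256.
// It keeps the sketch dependency-free but is not collision-resistant.
inline std::uint64_t hash_content(const std::string& content) {
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : content) {
    h ^= c;
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Hypothetical helper: returns true only the first time a given
// content is seen, regardless of the path it was read from.
inline bool should_include(const std::string& content,
                           std::unordered_set<std::uint64_t>& seen) {
  return seen.insert(hash_content(content)).second;
}
```

Note that the path never appears in the decision: two copies of the same header at different locations hash to the same value, so only the first is included.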
This would be a robust solution because it is not at all concerned about the path a file is found at, only its content.
Adding a new directive to the C++ standard would take considerable time, but luckily we can implement this logic using a script and the preprocessor.
The basic idea is this:
So, for example this header:
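As an illustration, a header written with no guard at all might look like the following. The file name add.hpp is taken from the build rule discussed later; its contents here are an assumption:

```cpp
// add.hpp (contents assumed for illustration)
// Note: no include guard and no #pragma once.
inline int add(int a, int b) {
  return a + b;
}
```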
Has a SHA-256 hash of:
So the generated header might be:
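A rough sketch of how such a guarded header could be produced from its content. The article's actual script is written in Python and uses SHA-256; this C++ version is a stand-in that uses FNV-1a instead, and the helper names fnv1a and add_guard, as well as the GUARD_ prefix, are hypothetical:

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// FNV-1a (64-bit), used here in place of SHA-256 to avoid dependencies.
inline std::uint64_t fnv1a(const std::string& s) {
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Hypothetical helper: wraps header text in an include guard whose
// macro name is derived from a hash of that text.
inline std::string add_guard(const std::string& content) {
  std::ostringstream name;
  name << "GUARD_" << std::hex << std::uppercase << fnv1a(content);

  std::ostringstream out;
  out << "#ifndef " << name.str() << "\n"
      << "#define " << name.str() << "\n"
      << content << "\n"
      << "#endif\n";
  return out.str();
}
```

Since the macro name depends only on the content, identical headers found at different paths end up with the same guard, and headers with different content get different guards.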
Whilst the transformation for individual files is simple (Python script), we still need to manage the transformation process. We need to ensure that:
Using Buck build, we can encode this logic into a project’s build script easily.
Let’s start with a build-rule for a single file and then generalize:
A genrule in Buck is much like a target in Make. We define the input files, the output file name and the command to execute. This target takes our Python script for generating an include guard and runs it on add.hpp. Unlike Make, Buck will isolate the process and cache its output, keyed on the hashes of its inputs.
Now that we have a single file working, we can generalize the process to n files. To do this, we make a Python function that creates a genrule for a given file:
To get the set of header files, we run a glob expression. For example:
And to bring it all together:
You can find a complete working example on GitHub.
Now, our header files can be written without include guards or #pragma once:
This setup in Buck is really nice to work with:
We created Buckaroo to make it easier to integrate C++ libraries. If you would like to try it out, the best place to start is the documentation. You can browse the existing packages on Buckaroo.pm or request more over on the wishlist.