Appendix A. An Introduction to Preprocessor Metaprogramming
A.1. Motivation
Even with the full power of template metaprogramming and the Boost Metaprogramming Library at our disposal, some C++ coding jobs still require a great deal of boilerplate code repetition. We saw one example earlier, when we implemented tiny_size:
template <class T0, class T1, class T2>
struct tiny_size : mpl::int_<3> {};
Aside from the repeated pattern in the parameter list of the primary template above, there are three partial specializations below, which also follow a predictable pattern:
template <class T0, class T1>
struct tiny_size<T0,T1,none> : mpl::int_<2> {};

template <class T0>
struct tiny_size<T0,none,none> : mpl::int_<1> {};

template <>
struct tiny_size<none,none,none> : mpl::int_<0> {};
In this case there is only a small amount of code with such a "mechanical" flavor, but had we been implementing large instead of tiny, there might easily have been a great deal more. When the number of instances of a pattern grows beyond two or three, writing them by hand tends to become error-prone. Perhaps more importantly, the code gets hard to read, because the important abstraction in the code is really the pattern, not the individual instances.
A.1.1. Code Generation
Rather than being written out by hand, mechanical-looking code should really be generated mechanically. Having written a program to spit out instances of the code pattern, a library author has two choices: She can either ship pre-generated source code files, or she can ship the generator itself. Either approach has drawbacks. If clients only get the generated source, they are stuck with whatever the library author generated, and experience shows that if they are happy with three instances of a pattern today, someone will need four tomorrow. If clients get the generator program, on the other hand, they also need the resources to execute it (e.g., interpreters), and they must integrate the generator into their build processes...
A.1.2. Enter the Preprocessor
...unless the generator is a preprocessor metaprogram. Though not designed for that purpose, the C and C++ preprocessors can be made to execute sophisticated programs during the preprocessing phase of compilation. Users can control the code generation process with preprocessor #defines in code or -D options on the compiler's command line, making build integration trivial. For example, we might parameterize the primary tiny_size template above as follows:
#include <boost/preprocessor/repetition/enum_params.hpp>

#ifndef TINY_MAX_SIZE
# define TINY_MAX_SIZE 3 // default maximum size is 3
#endif

template <BOOST_PP_ENUM_PARAMS(TINY_MAX_SIZE, class T)>
struct tiny_size : mpl::int_<TINY_MAX_SIZE> {};
To test the metaprogram, run your compiler in its "preprocessing" mode (usually the -E option), with the Boost root directory in your #include path. For instance:[1]
g++ -P -E -Ipath/to/boost_1_32_0 -I. test.cpp
[1] GCC's -P option inhibits the generation of source file and line number markers in preprocessed output.
Given the appropriate metaprograms, users would be able to adjust not only the number of parameters to tiny_size, but the maximum size of the entire tiny implementation just by #define-ing TINY_MAX_SIZE.
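To make the effect of TINY_MAX_SIZE concrete without requiring Boost, the sketch below uses a hand-rolled stand-in for BOOST_PP_ENUM_PARAMS. Everything prefixed SKETCH_, the sketch namespace, and its int_ template are hypothetical names introduced for illustration; this is not how the Preprocessor library is implemented, and the stand-in only supports counts from 1 to 4.

```cpp
#include <cassert>

// Two-step concatenation, so that macro arguments are expanded before pasting.
#define SKETCH_CAT(a, b) SKETCH_CAT_I(a, b)
#define SKETCH_CAT_I(a, b) a##b

// Hypothetical stand-in for BOOST_PP_ENUM_PARAMS (assumption: counts 1..4 only).
// SKETCH_ENUM_PARAMS(3, class T) expands to: class T0, class T1, class T2
#define SKETCH_ENUM_PARAMS_1(p) p##0
#define SKETCH_ENUM_PARAMS_2(p) SKETCH_ENUM_PARAMS_1(p), p##1
#define SKETCH_ENUM_PARAMS_3(p) SKETCH_ENUM_PARAMS_2(p), p##2
#define SKETCH_ENUM_PARAMS_4(p) SKETCH_ENUM_PARAMS_3(p), p##3
#define SKETCH_ENUM_PARAMS(n, p) SKETCH_CAT(SKETCH_ENUM_PARAMS_, n)(p)

#ifndef TINY_MAX_SIZE
#  define TINY_MAX_SIZE 3 // default; override with -DTINY_MAX_SIZE=N
#endif

namespace sketch {

// Minimal stand-in for mpl::int_, so the sketch needs no Boost headers.
template <int N>
struct int_ { static const int value = N; };

// The parameter list is generated from TINY_MAX_SIZE.
template <SKETCH_ENUM_PARAMS(TINY_MAX_SIZE, class T)>
struct tiny_size : int_<TINY_MAX_SIZE> {};

} // namespace sketch

#if TINY_MAX_SIZE == 3
// With the default size, tiny_size takes exactly three type arguments.
// The extra parentheses stop the commas from splitting assert's own argument.
inline void check_default_tiny_size() {
    assert((sketch::tiny_size<int, long, char>::value == 3));
}
#endif
```

Under these assumptions, compiling with -DTINY_MAX_SIZE=4 should give tiny_size four parameters, T0 through T3, with no change to the source.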
The Boost Preprocessor library plays a role in preprocessor metaprogramming similar to the one played by the MPL in template metaprogramming: It supplies a framework of high-level components (like BOOST_PP_ENUM_PARAMS) that make otherwise-painful metaprogramming jobs approachable. In this appendix we won't attempt to cover nitty-gritty details of how the preprocessor works, nor principles of preprocessor metaprogramming in general, nor even many details of how the Preprocessor library works. We will show you enough at a high level that you'll be able to use the library productively and learn the rest on your own.
A.2. Fundamental Abstractions of the Preprocessor
We began our discussion of template metaprogramming by describing its metadata (potential template arguments) and metafunctions (class templates). On the basis of those two fundamental abstractions, we built up the entire picture of compile-time computation covered in the rest of this book. In this section we'll lay a similar foundation for the preprocessor metaprogrammer. Some of what we cover here may be a review for you, but it's important to identify the basic concepts before going into detail.
A.2.1. Preprocessing Tokens
The fundamental unit of data in the preprocessor is the preprocessing token. Preprocessing tokens correspond roughly to the tokens you're used to working with in C++, such as identifiers, operator symbols, and literals. Technically, there are some differences between preprocessing tokens and regular tokens (see section 2 of the C++ standard for details), but they can be ignored for the purposes of this discussion. In fact, we'll be using the terms interchangeably here.
A.2.2. Macros
Preprocessor macros come in two flavors. Object-like macros can be defined this way:
#define identifier replacement-list
where the identifier names the macro being defined, and replacement-list is a sequence of zero or more tokens. Where the identifier appears in subsequent program text, it is expanded by the preprocessor into its replacement list.
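As a small illustration of object-like macros, consider the sketch below. The names BUFFER_SIZE, NOTHING, buffer, and answer are hypothetical, chosen only for this example.

```cpp
#include <cassert>

// Object-like macros: an identifier followed by a replacement list.
#define BUFFER_SIZE 16 * 4 // replacement list is three tokens: 16 * 4
#define NOTHING            // replacement list of zero tokens

namespace sketch {

// BUFFER_SIZE is replaced by its tokens, giving: char buffer[16 * 4];
char buffer[BUFFER_SIZE];

// NOTHING disappears entirely, leaving: int answer = 42;
NOTHING int answer = 42;

inline void check_object_like_macros() {
    assert(sizeof(buffer) == 64);
    assert(answer == 42);
    // Replacement works on tokens, not values:
    // 100 / BUFFER_SIZE becomes 100 / 16 * 4, which is (100 / 16) * 4 == 24.
    assert(100 / BUFFER_SIZE == 24);
}

} // namespace sketch
```

The last check shows why replacement lists that contain operators are usually wrapped in parentheses: the preprocessor substitutes tokens and knows nothing about expressions.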
Function-like macros, which act as the "metafunctions of the preprocessing phase," are defined as follows:
#define identifier(a1, a2, ... an) replacement-list
where each ai is an identifier naming a macro parameter. When the macro name appears in subsequent program text followed by a suitable argument list, it is expanded into its replacement-list, except that each argument is substituted for the corresponding parameter where it appears in the replacement-list.[2]
[2] We have omitted many details of how macro expansion works. We encourage you to take a few minutes to study section 16.3 of the C++ standard, which describes that process in straightforward terms.
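The following sketch shows a function-like macro, and what "followed by a suitable argument list" means in practice. TWICE is a hypothetical name; the function of the same name deliberately returns a different result so that the two can be told apart.

```cpp
#include <cassert>

// Function-like macro with one parameter, x.
#define TWICE(x) ((x) + (x))

// Here TWICE is followed by ')' rather than '(', so it is not a macro
// invocation; this line defines an ordinary function named TWICE.
inline int (TWICE)(int v) { return 3 * v; }

inline void check_function_like_macro() {
    assert(TWICE(2) == 4);   // macro: expands to ((2) + (2))
    assert((TWICE)(2) == 6); // function: the name is not followed by '('
}
```

The argument 2 is substituted for the parameter x at each place where x appears in the replacement list.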
A.2.3. Macro Arguments
Definition
A macro argument is a nonempty sequence of:
- Preprocessing tokens other than commas or parentheses, and/or
- Preprocessing tokens surrounded by matched pairs of parentheses.
This definition has consequences for preprocessor metaprogramming that must not be underestimated. Note, first of all, that the following tokens have special status:
, ( )
As a result, a macro argument can never contain an unmatched parenthesis, or a comma that is not surrounded by matched parentheses. For example, both lines following the definition of FOO below are ill-formed:
#define FOO(X) X // unary identity macro
FOO(,)           // un-parenthesized comma or two empty arguments
FOO())           // unmatched parenthesis or missing argument
Note also that the following tokens do not have special status; the preprocessor knows nothing about matched pairs of braces, brackets, or angle brackets:
{ } [ ] < >
As a result, these lines are also ill-formed:
FOO(std::pair<int, long>)               // two arguments
FOO({ int x = 1, y = 2; return x+y; })  // two arguments
It is possible to pass either string of tokens above as part of a single macro argument, provided it is parenthesized:
FOO((std::pair<int, long>))               // one argument
FOO(({ int x = 1, y = 2; return x+y; }))  // one argument
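A compilable sketch of the same rule follows, reusing the FOO macro from the text. The helper function names are hypothetical, and std::max and std::pair<int, long> were chosen here only because they contain commas.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

#define FOO(X) X // unary identity macro, as in the text

// The comma sits inside the call's own parentheses, so this is ONE argument.
inline int one_argument_call() { return FOO(std::max(3, 7)); }

// Parenthesizing the type hides its comma from the preprocessor. The
// parentheses survive expansion, giving: sizeof (std::pair<int, long>)
inline std::size_t parenthesized_type_size() {
    return sizeof FOO((std::pair<int, long>));
}

inline void check_parenthesized_arguments() {
    assert(one_argument_call() == 7);
    // assert is itself a macro, hence the extra parentheses around the commas.
    assert((parenthesized_type_size() == sizeof(std::pair<int, long>)));
}
```

The sizeof case happens to work because a parenthesized type is exactly what sizeof expects; in most other contexts the leftover parentheses would get in the way.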
However, because of the special status of commas, it is impossible to strip parentheses from a macro argument without knowing the number of comma-separated token sequences it contains. If you are writing a macro that needs to be able to accept an argument containing a variable number of commas, your users will either have to parenthesize that argument and pass you the number of comma-separated token sequences as an additional argument, or they will have to encode the same information in one of the preprocessor data structures covered later in this appendix.
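The first option, passing the count alongside the parenthesized argument, can be sketched as follows. All SKETCH_ names are hypothetical and are not the Preprocessor library's own macros; this version assumes at most three comma-separated token sequences.

```cpp
#include <cassert>
#include <utility>

// Two-step concatenation, so that n is expanded before pasting.
#define SKETCH_PASTE(a, b) SKETCH_PASTE_I(a, b)
#define SKETCH_PASTE_I(a, b) a##b

// One helper per supported count; each re-emits its arguments bare.
#define SKETCH_REM_1(a) a
#define SKETCH_REM_2(a, b) a, b
#define SKETCH_REM_3(a, b, c) a, b, c

// n is the number of comma-separated token sequences inside the
// parenthesized argument, whose own parentheses then serve as the
// argument list of SKETCH_REM_n.
#define SKETCH_STRIP_PARENS(n, parenthesized) \
    SKETCH_PASTE(SKETCH_REM_, n) parenthesized

inline long sum_of_pair() {
    // Expands to: std::pair<int, long> p(1, 2L);
    SKETCH_STRIP_PARENS(2, (std::pair<int, long>)) p(1, 2L);
    return p.first + p.second;
}

inline void check_strip_parens() {
    assert(sum_of_pair() == 3);
}
```

Without the count, the macro would have no way to choose among SKETCH_REM_1, SKETCH_REM_2, and SKETCH_REM_3, which is the limitation the paragraph above describes.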