Tuesday, May 20, 2008

Functional characterization of C++ template metaprogramming

C++ template metaprogramming is effectively a tiny programming sublanguage of a functional nature. Let us characterize it from the point of view of functional programming:

Purity. C++ TMP is pure, i.e. without side effects, except for one detail noted below. The unit of computation is template instantiation, which can only result in the definition of embedded types and numerical constants and in further instantiations of other templates: none of these processes depends on the moment at which it is triggered or on its order of execution relative to other computations. The only observable side effects are the warning messages the compiler may issue during compilation, but these do not affect the computations of the metaprogram itself.
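As a minimal sketch of this point (the factorial metafunction below is an illustrative assumption, not something taken from a particular library, and static_assert requires C++11), an instantiation only defines an embedded constant and triggers further instantiations, so requesting results in any order, or more than once, yields the same values:

```cpp
// Illustrative metafunction: the only "effects" of instantiating
// factorial<N> are the definition of the embedded constant `value`
// and the instantiation of factorial<N - 1>.
template<unsigned N>
struct factorial
{
  static const unsigned value = N * factorial<N - 1>::value;
};

// Base case, selected by explicit specialization.
template<>
struct factorial<0>
{
  static const unsigned value = 1;
};

// The order in which results are requested is irrelevant, and asking
// twice gives the same answer: there is no mutable state to observe.
static_assert(factorial<5>::value == 120, "5! == 120");
static_assert(factorial<3>::value == 6, "3! == 6");
static_assert(factorial<5>::value == 120, "same result when requested again");
```

Compiling the translation unit is what runs the checks; no executable code is involved.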

Strictness. Strict evaluation is associated with call-by-value semantics (predominant in imperative languages and in many functional languages such as ML or Scheme), whereas non-strict or lazy languages evaluate arguments only as needed (the main representative of this class being Haskell). C++ TMP can be used either way, or even with a mix of both: the de facto standard library for C++ TMP, Boost.MPL, chooses call-by-value as the default (most metafunctions expect their arguments to be fully evaluated at the point of invocation), but some lazy constructs are provided as well. We enjoy this freedom of choice at the language level because computation (instantiation of some given type T) must be invoked explicitly (in Boost.MPL, by referring to the embedded T::type type): merely mentioning T does not cause T to be instantiated.
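The difference can be sketched without any library. The names wrap, needs_nested and lazy_if below are hypothetical helpers written for this illustration (they are not Boost.MPL's own constructs), and static_assert and std::is_same assume C++11. The conditional receives its branches as unevaluated metafunctions and requests ::type only from the selected one:

```cpp
#include <type_traits>

// Trivial nullary metafunction: wrap<T>::type is T.
template<typename T>
struct wrap
{
  typedef T type;
};

// Metafunction whose evaluation fails for any T lacking a nested
// type named `nested` (for instance, int).
template<typename T>
struct needs_nested
{
  typedef typename T::nested type;
};

// Lazy conditional: Then and Else are passed unevaluated, and only the
// selected branch has its ::type requested.
template<bool C, typename Then, typename Else>
struct lazy_if
{
  typedef typename Then::type type;
};

template<typename Then, typename Else>
struct lazy_if<false, Then, Else>
{
  typedef typename Else::type type;
};

// Merely mentioning needs_nested<int> does not instantiate it, so this
// compiles even though needs_nested<int>::type would be an error.
typedef lazy_if<true, wrap<int>, needs_nested<int> >::type result;
static_assert(std::is_same<result, int>::value, "only the Then branch is evaluated");
```

A call-by-value formulation would instead take already evaluated arguments, so the caller would have to write needs_nested<int>::type at the point of invocation, and that alone would make compilation fail regardless of the condition.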

Typing. In C++ TMP, the run-time objects of computation, or values, are C++ types. What are the types of C++ TMP then? It turns out there are no types in C++ TMP, if by type we mean a static set of values along with the functions compatible with them. A C++ TMP function is a class template dependent on n arguments like this:

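template<Arg1,...,Argn> struct F;

where each Argi is any of the following:

a type parameter (typename T),
a non-type parameter (for instance, int N),
a template template parameter (template<...> class T).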

Non-type template parameters are seldom used in C++ TMP, and template template parameters are usually dropped in favor of equivalent representations based solely on type parameters (for instance, by preferring metafunction classes over metafunctions), so the language is virtually untyped. C++ TMP implements a form of dynamic duck typing: metafunctions assume that their arguments have some specific embedded types and values, and if this is not the case compilation (which is the run-time phase of C++ TMP) fails. For instance, the arithmetic metafunctions of Boost.MPL only work with types representing numerical values; in general, types, type classes etc. exist only in the programmer's mind and are not enforced by the language. The absence of types in C++ TMP is only a minor nuisance, since static typing is mostly about anticipating run-time errors during the compilation phase, and C++ TMP has no actual compilation phase (compilation is C++ TMP run time). With no types in C++ TMP, questions about polymorphism in the language mostly do not apply, although a form of metafunction overloading can be exercised via pattern matching based on partial specialization.
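A rough sketch of both ideas follows. The names int_, increment, increment_f and apply_twice are assumptions made for this illustration; they loosely follow Boost.MPL conventions but are not its actual definitions, and static_assert and std::is_same assume C++11. The metafunction increment is duck-typed, since it works with any argument exposing a nested value, and increment_f is a metafunction class, which lets a higher-order metafunction take it as a plain type parameter:

```cpp
#include <type_traits>

// Stand-in for an integral constant wrapper: a type carrying a value.
template<int N>
struct int_
{
  static const int value = N;
  typedef int_ type;
};

// Duck-typed metafunction: nothing declares what T must be, the body
// simply assumes that T::value exists and is an integral constant.
template<typename T>
struct increment
{
  typedef int_<T::value + 1> type;
};

// Metafunction class: an ordinary type with a nested apply template,
// usable wherever a type parameter is expected.
struct increment_f
{
  template<typename T>
  struct apply : increment<T> {};
};

// Higher-order metafunction with type parameters only; it assumes
// that F has a nested apply template.
template<typename F, typename T>
struct apply_twice
{
  typedef typename F::template apply<T>::type once;
  typedef typename F::template apply<once>::type type;
};

static_assert(apply_twice<increment_f, int_<0> >::type::value == 2, "0 + 1 + 1 == 2");

// increment<void>::type would fail to compile: void has no nested
// value, and the error only shows up when the metaprogram "runs".
```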

Pattern matching. This is one of the hallmarks of functional languages like Haskell. Pattern matching works over algebraic types whose structure is defined recursively from a set of syntactic constructors. It turns out that C++ TMP is not only able to implement pattern matching: pattern matching is in fact its basic computational device. Recursive template instantiations play the role of algebraic values, and their internal structure can be taken apart by means of partial template specialization. For instance:

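// Peano numerals: zero and the successor constructor suc<N>
struct zero;

template<typename N>
struct suc;

// sum<N,M>: defined only through the partial specializations below
template<typename N,typename M>
struct sum;

// 0 + M = M
template<typename M>
struct sum<zero,M>
{
  typedef M type;
};

// suc(N) + M = suc(N + M)
template<typename N,typename M>
struct sum<suc<N>,M>
{
  typedef suc<typename sum<N,M>::type> type;
};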
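To see the pattern matching at work, the definitions can be exercised as follows. This is a sketch that repeats the definitions so that it compiles on its own; to_int and the typedef names are hypothetical additions for illustration, and static_assert and std::is_same assume C++11.

```cpp
#include <type_traits>

struct zero;
template<typename N> struct suc;

template<typename N, typename M> struct sum;

template<typename M>
struct sum<zero, M>
{
  typedef M type;
};

template<typename N, typename M>
struct sum<suc<N>, M>
{
  typedef suc<typename sum<N, M>::type> type;
};

// Hypothetical helper: converts a numeral to an int by matching on
// the same two constructors.
template<typename N> struct to_int;

template<>
struct to_int<zero>
{
  static const int value = 0;
};

template<typename N>
struct to_int<suc<N> >
{
  static const int value = 1 + to_int<N>::value;
};

typedef suc<zero> one;
typedef suc<suc<zero> > two;
typedef sum<two, one>::type three;

// The compiler picks the suc<N> pattern twice, then the zero pattern.
static_assert(std::is_same<three, suc<suc<suc<zero> > > >::value, "2 + 1 == 3");
static_assert(to_int<three>::value == 3, "numeral converts to 3");
```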

So, C++ template metaprogramming can be regarded as a pure, non-strict, untyped functional language with pattern matching.

2 comments:

Brandon said...

C++ is such a great language, despite being so complicated. Great article by the way. Can anyone give me an example of this technique being used?

Joaquín M López Muñoz said...

"Can anyone give me an example of this technique being used?"

Take a look at the Boost MPL Library.