Tuesday, June 17, 2008

C preprocessor bignums

Consider the following Boost.Preprocessor-based snippet:

‎// ARRAY defines a stack-based array or a dynamically
‎// allocated array when the size goes beyond some threshold

‎#define ARRAY(name,type,n) \
‎BOOST_PP_IF( \
‎ n<=128, \
‎ type name[n], \
‎ type* name=(type *)malloc(n*sizeof(type)))

‎ARRAY(x1,char,64);
‎ARRAY(x2,char,200);
This does not produce the desired result, but instead expands to
‎BOOST_PP_IIF_1<=128(char x1[64], char* x1=(char *)malloc(64
‎*sizeof(char)));
‎BOOST_PP_IIF_1<=128(char x2[200], char* x2=(char *)malloc(2
‎00*sizeof(char)));

which does not seem to make much sense. What happened? The error comes from the fact that the condition in the BOOST_PP_IF clause must evaluate to 0 or 1 in the preprocessor realm: the text 64<=128, for instance, is a compile-time constant in C, but certainly does not coincide with the preprocessor tokens 0 or 1.

To remedy this, Boost.Preprocessor provides arithmetic macros that perform numerical manipulation at the preprocessor level, so that we can write:

‎‎#define ARRAY(name,type,n) \
‎BOOST_PP_IF( \
‎ BOOST_PP_LESS_EQUAL(n,128), \
‎ type name[n], \
‎ type* name=(type *)malloc(n*sizeof(type)))

‎ARRAY(x1,char,64);
‎ARRAY(x2,char,200);
which expands to
‎char x1[64];
‎char* x2=(char *)malloc(200*sizeof(char));
as intended. But now, if we decide to set the threshold in ARRAY to 1024 we get an error again:
‎‎‎BOOST_PP_IIF_BOOST_PP_BOOL_BOOST_PP_COMPL_BOOST_PP_BOOL_BOO
‎ST_PP_TUPLE_ELEM_2_0BOOST_PP_IIF_BOOST_PP_BOOL_1024(BOOST_P
‎P_WHILE_2, (64, 1024) BOOST_PP_TUPLE_EAT_3)(BOOST_PP_SUB_P,
‎BOOST_PP_SUB_O, BOOST_PP_IIF_BOOST_PP_BOOL_1024(BOOST_PP_SU
‎B_O, BOOST_PP_NIL BOOST_PP_TUPLE_EAT_2)(2, (64, 1024)))(cha
‎r x1[64], char* x1=(char *)malloc(64*sizeof(char)));
‎BOOST_PP_IIF_BOOST_PP_BOOL_BOOST_PP_COMPL_BOOST_PP_BOOL_BOO
‎ST_PP_TUPLE_ELEM_2_0BOOST_PP_IIF_BOOST_PP_BOOL_1024(BOOST_P
‎P_WHILE_2, (200, 1024) BOOST_PP_TUPLE_EAT_3)(BOOST_PP_SUB_P,
‎ BOOST_PP_SUB_O, BOOST_PP_IIF_BOOST_PP_BOOL_1024(BOOST_PP_S
‎UB_O, BOOST_PP_NIL BOOST_PP_TUPLE_EAT_2)(2, (200, 1024)))(c
‎har x2[200], char* x2=(char *)malloc(200*sizeof(char)));

We have hit a limitation of preprocessor-based arithmetics: the only way to implement BOOST_PP_LESS_EQUAL(x,y) consists in precalculating the result for all pairs (x,y) with x and y between 0 and some internal limit, which Boost.Preprocessor sets at 256; so, BOOST_PP_LESS_EQUAL(x,1024) simply does not compute. Rising the internal limit does not scale, as the size of the implementation header size grows linearly with this limit.

A way to overcome these limitations in preprocessor-based arithmetic is to replace simple numerical literals with some other representation more amenable to preprocessor handling. This is the way arbitrary-precision arithmetic support is provided in Paul Mensonides's Chaos Library, which can be regarded as an evolution of Boost.Preprocessor for very conformant C preprocessors. For the fun of it, we will develop our solution within Boost.Preprocessor. This library makes it very easy to work with so-called sequences, lists of adjacent parenthesized elements:

// sequence a, b, c
(a)(b)(c)

We define a preprocessor bignum simply as a sequence whose elements are digits between 0 and 9:

// bignum 1024
(1)(0)(2)(4)

Elements are assumed to be laid out from the most to the least significant digit, though the reverse order would probably be a little more natural for computational purposes. The largest bignum we can handle with Boost.Preprocessor is 10257 − 1, as BOOST_PP_LIMIT_MAG = 256 is the maximum sequence length supported by the library. This magnitude is much larger than what is usually representable by integer types in C; when a bignum is sufficiently small to be represented by a C integer, we can devise a macro to convert the bignum to a literal:

‎#define BN_TO_LITERAL(bn) \
‎BOOST_PP_IF( \
‎ BOOST_PP_GREATER(BOOST_PP_SEQ_SIZE(bn),1), \
‎ BN_TO_LITERAL_CASE1, \
‎ BN_TO_LITERAL_CASE2)(bn)

‎#define BN_TO_LITERAL_CASE1(bn) \
‎BOOST_PP_SEQ_FOLD_LEFT( \
‎ BN_TO_LITERAL_CASE1_OP, \
‎ BOOST_PP_SEQ_HEAD(bn), \
‎ BOOST_PP_SEQ_TAIL(bn))

‎#define BN_TO_LITERAL_CASE1_OP(s,state,x) BOOST_PP_CAT(state,x)

‎#define BN_TO_LITERAL_CASE2(bn) BOOST_PP_SEQ_HEAD(bn)

‎BN_TO_LITERAL((1)(0)(2)(4)) // expands to 1024
Note that we had to treat bignums of just one digit separately because the general branch would produce intermediate "empty" sequences, which are not allowed.

Now, if we had an appropriate BN_LESS_EQUAL macro to compare bignums, we could rewrite our original example like this and forget about the BOOST_PP_LIMIT_MAG limit of Boost.Preprocessor:

‎‎#define ARRAY(name,type,bn) \
‎BOOST_PP_IF( \
‎ BN_LESS_EQUAL(bn,(1)(0)(2)(4)), \
‎ type name[BN_TO_LITERAL(bn)], \
‎ type* name=(type *)malloc(BN_TO_LITERAL(bn)*sizeof(type)))

‎ARRAY(x1,char,(6)(4)); // stack-based
‎ARRAY(x2,char,(2)(0)(0)); // stack-based
‎ARRAY(x3,char,(1)(5)(0)(0)); // dynamic
We will show in later entries how to implement bignum comparison macros as well as common arithmetic operations.

No comments:

Post a Comment