GotW #77: #Definition

This is the original GotW problem and solution substantially as posted to Usenet. See the book More Exceptional C++ (Addison-Wesley, 2002) for the most current solution to this GotW issue. The solutions in the book have been revised and expanded since their initial appearance in GotW. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

#Definition
Difficulty: 4 / 10

What can and can't macros do? Not all compilers agree.

Problem

JG Question

1. Demonstrate how to write a simple max() preprocessor macro that takes two arguments and evaluates to the one that is greater, using normal < comparison. What are the usual pitfalls in writing such a macro?

Guru Question

2. What can a preprocessor macro not create? Why not?

Solution

Common Macro Pitfalls

There are four major pitfalls, besides several further drawbacks. Focusing on the pitfalls first, here are common ways to go wrong when writing a macro.

1. Don't forget to put parentheses around arguments.

  // Example 1(a): Paren pitfall #1: arguments
  //
  #define max(a,b) a < b ? b : a

The problem here is that the uses of the parameters a and b are not fully parenthesized. Macros only do straight textual substitution, so this can cause some unexpected results. For example:

  max( i += 3, j )

expands to the following:

  i += 3 < j ? j : i += 3

which, because of operator precedence and language rules, actually means:

  i += ((3 < j) ? j : i += 3)

This could cause some long debugging sessions.

2. Don't forget to put parentheses around the whole expression.

Fixing the first problem, we still fall prey to another subtlety:

  // Example 1(b): Paren pitfall #2: expansion
  //
  #define max(a,b) (a) < (b) ? (b) : (a)

The problem now is that the entire expansion is not correctly parenthesized. For example:

  k = max( i, j ) + 42;

expands to the following:

  k = (i) < (j) ? (j) : (i) + 42;

which, because of operator precedence, actually means:

  k = (((i) < (j)) ? (j) : ((i) + 42));

If i >= j, k is assigned the value of i+42, as intended. But if i < j, k is assigned the value of j.

3. Be aware of the results of any possible multiple evaluation.

We can fix problem #2 by putting parentheses around the entire macro expansion, but this leaves us with yet another problem:

  // Example 1(c): Multiple argument evaluation
  //
  #define max(a,b) ((a) < (b) ? (b) : (a))

Now, consider what happens if one or both of the expressions has side effects:

  max( ++i, j )

If the result of ++i >= j, i gets incremented twice, which is probably not what the programmer intended:

  ((++i) < (j) ? (j) : (++i))

Similarly, consider the code:

  max( f(), pi )

which expands to:

  ((f()) < (pi) ? (pi) : (f()))

If the result of f() >= pi, f() gets executed twice, which is almost certainly inefficient and often actually wrong.

Alas, although we could work around the first two problems, this one is a corker -- there is no solution as long as max is a macro.

4. Beware scope.

Finally, macros don't care about scope. (They don't care about much of anything; see GotW #63.) They just perform textual substitution no matter where the text may be. This means that, if we use macros at all, we have to be very careful about what we name them. In particular, the biggest problem with the max macro is that it is highly likely to interfere with the standard max() function template:

  // Example 1(d): Name tromping
  //
  #define max(a,b) ((a) < (b) ? (b) : (a))

  #include <algorithm> // oops!

The problem is that, inside header <algorithm>, there will be something like the following:

  template<class T> const T& 
  max(const T& a, const T& b);

which the macro "helpfully" turns into an uncompilable mess:

  template<class T> const T& 
  ((const T& a) < (const T& b) ? (const T& b) : (const T& a));

If you think that's easy to avoid by putting your macro definition after all #included header files (which really is a good idea in any case), just imagine what the macro does to all your other code that happens to have variables or other things that just happen to be named max.

If you have to write a macro, try to give it an unusual and hard-to-spell name that will be less likely to tromp on other names.

Other Macro Drawbacks

There are a few other major things a macro can't do:

5. Macros can't recurse.

We can write a recursive function, but it's impossible to write a recursive macro. As the C++ standard says, in 16.3.4/2:

If the name of the macro being replaced is found during this scan of the replacement list (not including the rest of the source file's preprocessing tokens), it is not replaced. Further, if any nested replacements encounter the name of the macro being replaced, it is not replaced. These nonreplaced macro name preprocessing tokens are no longer available for further replacement even if they are later (re)examined in contexts in which that macro name preprocessing token would otherwise have been replaced.

6. Macros don't have addresses.

It's possible to form a pointer to any free or member function (for example, to use it as a predicate), but it's not possible to form a pointer to a macro because a macro has no address. Why not should be obvious: Macros aren't code. A macro doesn't have any existence of its own, because all it is is a glorified (and not particularly glorious) text substitution rule.

7. Macros are debugger-unfriendly.

Besides the fact that macros change the underlying code before the compiler gets a chance to see it, and therefore can wreak havoc with variable and other names, a macro can't be stepped into during debugging.

Have you heard the one about the scientists who started experimenting on lawyers instead of on laboratory macros? It was because...

There Are Some Things Even a Macro Just Won't Do

There are valid reasons to use macros (see GotW #32), but there are limits. This brings us to the final question:

2. What can a preprocessor macro not create? Why not?

In the standard, clause 2.1 defines the phases of translation. Preprocessing directives and macro expansions take place in phase 4. Thus, on a compliant compiler, it is not possible for a macro to create:

- a trigraph (trigraphs are replaced in phase 1);
- a universal character name (\uXXXX, replaced in phase 1);
- an end-of-line line-splicing backslash (replaced in phase 2);
- a comment (replaced in phase 3);
- another macro; or
- changes to a character literal or string literal via macro names inside the strings.

For this last point, as noted in 16.3/8 footnote 7:

Since, by macro-replacement time, all character literals and string literals are preprocessing tokens, not sequences possibly containing identifier-like subsequences (see 2.1.1.2, translation phases), they are never scanned for macro names or parameters.

A recent CUJ article [1] claimed that it's possible for a macro to create a comment:

  #define COMMENT SLASH(/)
  #define SLASH(s) /##s

This is nonstandard and not portable, but it actually works on some compilers. Why? Because those compilers don't implement the phases of translation exactly correctly. Here are the results from four compilers I tried:

Compiler                                                      Accepts Comment Macro?
Microsoft Visual Studio.NET (Visual C++ version 7), beta 1    Yes (wrong)
Borland BCC 5.5.1                                             Yes (wrong)
GCC 2.95.2                                                    No (correct)
Comeau 4.2.44                                                 No (correct)

Notes

1. M. Timperley. "A C/C++ Comment Macro" (C/C++ Users Journal, 19(1), January 2001).
