GotW #63

Home Blog Talks Books & Articles Training & Consulting

On the
blog
RSS feed November 4: Other Concurrency Sessions at PDC
November 3
: PDC'09: Tutorial & Panel
October 26: Hoare on Testing
October 23
: Deprecating export Considered for ISO C++0x

This is the original GotW problem and solution substantially as posted to Usenet. See the book Exceptional C++ Style (Addison-Wesley, 2004) for the most current solution to this GotW issue. The solutions in the book have been revised and expanded since their initial appearance in GotW. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard (1998) and its Technical Corrigendum (2003).

Amok Code 
Difficulty: 4 / 10

Sometimes life hands you some debugging situations that seem just plain deeply weird. Try this one on for size, and see if you can reason about possible causes for the problem.

Problem

1. One programmer has written the following code:

//--- file biology.h
//

// ... appropriate includes and other stuff ...

class Animal
{
public:
  // Functions that operate on this object:
  //
  virtual int Eat    ( int ) { /*...*/ }
  virtual int Excrete( int ) { /*...*/ }
  virtual int Sleep  ( int ) { /*...*/ }
  virtual int Wake   ( int ) { /*...*/ }

  // For animals that were once married, and
  // sometimes dislike their ex-spouses, we provide:
  //
  int EatEx    ( Animal* a ) { /*...*/ };
  int ExcreteEx( Animal* a ) { /*...*/ };
  int SleepEx  ( Animal* a ) { /*...*/ };
  int WakeEx   ( Animal* a ) { /*...*/ };

  // ...
};

// Concrete classes.
//
class Cat    : public Animal { /*...*/ };
class Dog    : public Animal { /*...*/ };
class Weevil : public Animal { /*...*/ };
// ... more cute critters ...

// Convenient, if redundant, helper functions.
//
int Eat    ( Animal* a ) { return a->Eat( 1 );     }
int Excrete( Animal* a ) { return a->Excrete( 1 ); }
int Sleep  ( Animal* a ) { return a->Sleep( 1 );   }
int Wake   ( Animal* a ) { return a->Wake( 1 );    }

Unfortunately, the code fails to compile. The compiler rejects the definition of at least one of the ...Ex() functions with an error message saying the function has already been defined.

To get around the compile error, the programmer comments out the ...Ex() functions, and now the program compiles and he starts testing the sleeping functions. Unfortunately, the Animal::Sleep() member function doesn't seem to always work correctly; when he tries to call the member function Animal::Sleep() directly, all is well. But when he tries to call it through the Sleep() free function wrapper, which does nothing but call the member function version, sometimes nothing happens... not all the time, only in some cases. Finally, when the programmer goes into the debugger or the linker- generated symbol map in an attempt to diagnose the problem, he can't seem to even find the code for Animal::Sleep() at all.

Is the compiler on the fritz? Should the programmer send the compiler vendor an angry flame e-mail and submit an irate letter to the New York Times? Is it a Year 2000 problem? Or is it just due to a rogue virus caught from the Internet?

What's going on?

Solution

1. One programmer has written the following code:

[...]

What's going on?

In short, there are several things that might be going on to cause these symptoms, but there's one outstanding possibility that would account for all of the observed behavior: Yes, you guessed it, an ugly combination of macros running amok and a side dish of mixed intentional and unintentional overloading.

Motivation

Certain popular C++ programming environments give you macros that are deliberately designed to change function names. Usually they do this for "good" or "innocent" reasons, namely backward and forward API compatibility; for example, if a Sleep() function in one version of an operating system is replaced by a SleepEx(), the vendor supplying the header in which the functions are declared might decide to "helpfully" provide a macro that automatically changes Sleep to SleepEx.

This is not a good idea. Macros are the antithesis of encapsulation, because their actual range of effect cannot be controlled, not even by the macro writer.

Macros Don't Care

Macros are obnoxious, smelly, sheet-hogging bedfellows for several reasons, most of which are related to the fact that they are a glorified text-substitution facility whose effects are applied during preprocessing, before any C++ syntax and semantic rules can even begin to apply. The following are some of macros' less charming habits.

1. Macros change names -- more often than not to harm, not protect, the innocent.

It is an understatement to say that this silent renaming can make debugging somewhat confusing. Such macro renaming means that your functions aren't actually called what you think they're called.

For example, consider our nonmember function Sleep():

int Sleep  ( Animal* a ) { return a->Sleep( 1 );   }

You won't find Sleep() anywhere in the object code or the linker map file because there's really no Sleep() function at all. It's really called SleepEx(). At first, when you're wondering where your Sleep() went, you might think, "ah, maybe the compiler is automatically inlining Sleep() for me," because that would explain why the short function doesn't seem to exist -- although of course the compiler shouldn't be inlining things without first being so directed. If you jump to conclusions and fire off an angry email to your compiler vendor complaining about aggressive optimizations, though, you're blaming the wrong company (or, at least, the wrong department).

Some of you may already have encountered this unfortunate and demonstrably bad effect. If you're like me, which is to say easily irritated by compiler weirdnesses and not satisfied with simple explanations, your curiosity bump might get the better of you. Then, curious, you may fire up the debugger and deliberately step into the function... only to be taken to the correct source line where the phantom function (which still looks like it has its original name, in the source code) lives, stepping into the phantom function that indeed works and is indeed getting called, but which by all other accounts doesn't seem to exist. Usually at this point it's a short step to figuring out what's really going on and a sotto voce grumbling at stupid macro tricks.

But wait, it gets better:

1(b). C++ already has features to deal with names. This causes what might be best termed an "unhealthy interaction."

You might think that it's not such a big deal to change a function's name. All right, fine, often it's not. But say you change a function's name to be the same as another function that also exists... what does C++ do if it finds two functions with the same name? It overloads them. That's not quite so fine when you don't realize it's happening silently.

This, alas, seems to be the case with Sleep(). The whole reason the library vendor decided to "helpfully" provide a Sleep macro to automatically rename things to SleepEx is because both such functions in fact do already exist in the vendor's library. Consider that the functions may have different signatures; then when we write our own Sleep() function, we might well be aware of the overloading on the library-supplied Sleep() and take care to avoid ambiguities or other errors that might cause us problems. We might even rely on the overloading because we want to provide library-Sleep()-like behavior intentionally. If, instead, our function is getting silently overloaded with some other function, not only is the overload going to behave differently than we expect, but if our original overloading was intentional it's not going to happen at all, at least not in the way we thought.

In the context of our question, such a Sleep-renaming macro can partly explain why different functions could end up being called in different circumstances; which function gets called can depend on how the overload resolution happened to work out for the specific types used at different call sites. Sometimes it's ours, and sometimes it's the library's. It just depends, perhaps in nonobvious ways.

If this sordid tale ended with the above lamentable effects on nonmember functions, that would be bad enough. Unfortunately, like shrapnel from a grenade, there's dangerous stuff flying in several other directions too:

2. Macros don't care about type.

The original intent for the Sleep macro described above was to change a global nonmember function's name. Unfortunately, the macro will change the name Sleep wherever it finds it; if we happen to have a global variable named Sleep, that member's name will silently change too. This is altogether a bad thing.

3. Macros don't care about scope.

Worse still, a macro designed to change a global nonmember function name will happily change any matching function (or other) names that happen to be members of your classes or nicely encapsulated within your own namespaces. In this case, we wrote a class with both Sleep() and SleepEx() functions; many of the described problems could be accounted for at least in part by a Sleep-renaming macro that makes our own functions invisibly overload -- with each other. Indeed, as with the invisible overloading mentioned above under point #1, this can explain why sometimes an unexpected member function can be called, depending on how the overload resolution happened to work out for the specific types used at different call sites.

If you think this is a Bad Thing, you're right. It's like having some ungloved uncertified doctor (the injudicious library header writer) with dirty hands (unsterilized macros) slice open your torso (class or namespace) and reach into your body cavity to rearrange things (members and other code)... while sleepwalking (not even realizing they're doing it).

Summary

In short, macros don't care about much of anything.

Guideline: Avoid macros.

Your default response to macros should be something like "Macros! Ew, yuck!" unless there's a compelling reason to use them in special cases where they are not a hack. Macros are not type-safe, they're not scope-safe... they're just not safe, period. If you must write macros, avoid putting them in header files, and try to give them long and personalized names that are highly unlikely to ever tromp upon things that they aren't intentionally meant to beat into unwilling submission and grind under their heels into the dust.

Guideline: Prefer to use namespaces.

True, as noted above, macros don't respect scope, and that includes namespace scope. However, had the free functions in the question been put inside a namespace, it would have prevented at least some of the problems; specifically, it would have avoided the global overloading with the vendor-supplied functions.

In short, practice good encapsulation. Not only does good encapsulation make for better designs, but it can defend you against unexpected threats you might not even see coming. Macros are the antithesis of encapsulation, because their actual range of effect cannot be controlled, not even by the macro writer. Classes and namespaces are among C++'s useful tools to help manage and minimize interdependencies between parts of your program that should be unrelated, and judicious use of these and other C++ facilities to promote encapsulation will not just make for superior designs, but will at the same time offer a measure of protection against the shrapnel that ill-considered code from fellow programmers, however well-intentioned, might occasionally send your way.

Copyright © 2009 Herb Sutter