When Is a Container Not...?

 On theblog November 4: Other Concurrency Sessions at PDC November 3 October 26: Hoare on Testing October 23

 This is the original article substantially as first published. See the book More Exceptional C++ (Addison-Wesley, 2002) for the most current version of this article. The versions in the book have been revised and expanded since their initial appearance in print. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

## When Is a Container Not a Container?

Little Stanislaus watched his mother as she rummaged busily in the refrigerator. "Be a dear, Stan," she said to him, without looking around, "and get me a container I can pour this juice into."

Stanislaus was a good boy, and he was pretty sure he knew what a container was. He looked about, and saw just the thing: a bright and shiny plastic bowl with a mesh bottom. Well, little Stanislaus knew that a bowl was a container, so happily he fetched it and held it out for his mother. She, busy, started to pour the juice before she'd quite got a good look at what she was pouring it into, and as Stanislaus held the sieve with a helpful smile the juice ran merrily into the sieve... and through it, and onto the floor.

Stanislaus got a scolding that day. He didn't really think it was his fault, though. The bowl with the mesh bottom looked a lot like the other bowls; how was he to know that it didn't meet his mother's container requirements?

In this column, we'll take a look at some of the C++ standard's container requirements, some reasons to use the technique of taking pointers into a container, and--to your possible consternation and likely surprise--a standard container that isn't really a container at all.

#### Why Take Pointers or References Into Containers?[1]

Consider the following code:

// Example 1: Is this code valid? safe? good?
//
vector<char> v;

...

char* p = &v[0];

... do something with *p ...

Is this code valid? Well, the standard guarantees that if a conforming sequence (such as vector<char>) provides operator[], that function must return an lvalue of type char, which in turn can have its address taken. So the answer to the question is: Yes, as long as v is not empty, the code in Example 1 is perfectly valid.

Is this code safe? Many programmers are initially surprised at the idea of taking a pointer into a container, but the answer is: Yes, it's safe, as long as we remain aware of when the pointer might be invalidated, which is pretty much whenever an equivalent iterator would be invalidated. For example, clearly, if we decide to start inserting or erasing elements of the container, not only iterators but also any pointers into the container will be invalidated as the underlying memory moves.

But does this code make sense? Again, yes; it can make perfect sense to have pointers or references into a container. One common case is when you read a data structure into memory once, say on program startup, and thereafter you never modify it but you do need to access it in different ways. In that case it can make sense to have additional data structures that contain pointers into the main container to optimize different access methods. I'll give a simple example in the next section.

#### How Could Example 1 Be Improved?

Example 1 could be improved in one way:

// Example 1(b): An improvement
//               (when it's possible)
//
vector<char> v;

...

vector<char>::iterator i = v[0];

... do something with *i ...

In general, it's not a bad guideline to prefer using iterators instead of pointers when you want to point at an object that's inside a container. After all, iterators are invalidated at mostly the same times and the same ways as pointers, and one reason that iterators exist is to provide a way to "point" at a contained object. So, if you have a choice, prefer to use iterators into containers.

Unfortunately, you can't always get the same effect with iterators that you can with pointers into a container. There are two main potential drawbacks to the iterator method, and when either applies we have to continue to use pointers:

1.       You can't always conveniently use an iterator where you can use a pointer. (See example below.)

2.       Using iterators might incur extra space and performance overhead, in cases where the iterator is an object and not just a bald pointer.

For example, say that you have a map<Name,PhoneNumber> that is loaded once at program startup and thereafter is only queried. The idea is that, given a name, this makes it easy to look up the corresponding phone number in the fixed dictionary. But what if you need to do the reverse lookup too? A clean solution could be to build a second structure, perhaps a map<PhoneNumber*,Name*,DerefFunctor> that enables the reverse lookup but avoids doubling the storage overhead--because, using pointers, there's no need to store each name and phone number twice; the second structure simply has pointers into the first.

That's fine, and it will work well, but note that this effect would be more difficult to achieve using iterators instead of pointers. Why? Because the natural iterator candidate, map<Name,PhoneNumber>::iterator, points to a pair<Name,PhoneNumber>, and there's no handy way to get an iterator to just the name or phone number part individually. (We could store the entire iterator and always explicitly say ->first and ->second everywhere, but that's inconvenient to code and it would mean that the reverse-lookup map would need to be redesigned or replaced by a different structure.)

This brings us to the next (and in my opinion most interesting) point of today's theme:

// Example 2: Is this code valid?
//
template<class T>
void f( T& t )
{
typename T::value_type* p1 = &t[0];
typename T::value_type* p2 = &*t.begin();
// ... do something with *p1 and *p2 ...
}

Before reading on, think about these questions: Is this code valid? If yes, under what conditions is it valid? Specifically, what kinds of things can we say about a T that makes this code valid? (Ignore runtime considerations, such as whether t happens to be in a suitable state to have t[0] called on it; we're interested in program legality here.)

#### When Is a Container Not a Container?

Well, is Example 2 valid? In short, yes, it can be valid. The longer answer involves thinking about what kinds of T would make the code valid: What characteristics and abilities must a suitable T have? Let's do some detective work:

a)   To make the expression &t[0] valid, T::operator[] must exist and must return something that understands operator&, which in turn must return a valid T::value_type* (or something that can be meaningfully converted to a valid T::value_type*).

In particular, this is true of containers that meet the standard's container and sequence requirements and implement the optional operator[], because that operator must return a reference to the contained object. By definition, you can then take the contained object's address.

b)   To make the expression &*t.begin() valid, T::begin() must exist, and it must return something that understands operator*, which in turn must return something that understands operator&, which in turn must return a valid T::value_type* (or something that can be meaningfully converted to a valid T::value_type*).

In particular, this is true of containers whose iterators meet the standard's iterator requirements, because the iterator returned by begin() must, when dereferenced using operator*, return a reference to the contained object. By definition, you can then take the contained object's address.

#### Which Brings Us to the Awkward Part

So here's the thing: The code in Example 2 will work for any container in the standard library that supports operator[]--and, if you take away the line containing "&t[0]", it will work for every container in the standard library--except for std::vector<bool>.

To see why, note that the following template works for every type T except bool.

// Example 3: Works for every T except bool
//
template<class T>
void g( vector<T>& v )
{
T* p = &v.front();
// ... do something with *p ...
}

Do you see why? At first (or even second and third) glance, it may seem odd that this is valid for all types but one. What makes vector<bool> so special?

The reason is as simple as it is unfortunate: Not all of the templates defined in the C++ Standard Library that look like containers actually are containers. In particular, the standard library requires a specialization of vector for bool, and the specialization

std::vector<bool> is not a container

in that it does not meet the standard library's requirements for containers. True, vector<bool> does appear in the "containers and sequences" part of the standard; and, true, there's no note to indicate that it's really neither a container nor a sequence. But the fact is that vector<bool> is indeed not a container, and so it should not be surprising that it cannot always be used like one.[2]

If this state of affairs seems a little strange to you, you're not alone, but there is a logical explanation.

#### The Life and Times of vector<bool>

The vector<bool> specialization was intentionally put into the standard to provide an example of how to write a proxied container. A "proxied container" is a container whose objects you don't get at directly; instead of giving you pointers or references to a contained object, a proxied container gives you proxy objects that can be used to indirectly access or manipulate a contained object. Proxied collections can be useful in cases where the objects within the collection cannot always be reliably accessed directly as though they were in memory, as for example with a disk-based collection that automatically pages pieces of itself in and out of memory under the covers as needed. So the idea was to show how to make such a proxied collection meet the requirements of a "container" in the sense defined by the standard library.

This would all have been well and good, except that the container and iterator requirements were never updated to allow for proxied containers. In fact, proxied containers are categorically disallowed; a container<T>::reference must be a true reference (T&), never a proxy. Iterators have similar requirements; dereferencing a forward, bidirectional, or random-access iterator must yield a true reference (T&), never a proxy. These points preclude any proxy-based containers from meeting the standard container requirements.

The reason vector<bool> is of necessity nonconforming is that it attempts to do extra work invisibly under the covers in an attempt to optimize for space. A normal bool object is at least as big as a char; sizeof(bool) is implementation-defined and will vary from compiler to compiler, but it must be at least 1. Does that seem extravagant? Some people thought so, so vector<bool> tries to be more efficient about the amount of storage it uses: Instead of storing a full char or int for every contained "bool", it packs the bools and stores them as individual bits (inside, say, chars or ints) in its internal representation. This gives vector<bool> a space advantage of at least 8:1, on platforms with 8-bit "normal" bools, and possibly more. (By this point, some of you may already have noticed that this optimization makes the name vector<bool> rather misleading: The things inside aren't really normal bools at all.)

One consequence of this packed representation is that vector<bool> clearly can't just return a normal bool& from its operator[] or its dereferenced iterators;[3] instead, it has to return a "reference-like" proxy object that in many ways looks and smells like a bool& but which is not a bool&. Unfortunately, that can also make access into a vector<bool> slower, because we have to deal with proxies instead of direct pointers and references (not to mention the extra bit-fiddling). So vector<bool> is not a pure optimization, but a tradeoff that favors "less space" at the expense of "potentially slower speed." More about this in a moment.

For example, whereas the functions vector<T>::operator[]() and vector<T>::front() normally return simply a T& (a reference to the contained T), vector<bool>::operator[]() and vector<bool>::front() actually return a "reference-like" proxy object of type vector<bool>::reference. To make this proxy object look and feel as much like a real bool& as possible, reference provides an implicit conversion to bool, operator bool(), so that a reference can be used most anywhere a bool can be used, at least as a value. It also provides member functions operator=(bool), operator=(reference&), and flip() that can be used to indirectly change the value of the bool that's actually inside the container, enabling natural coding styles like "v[0] = v[1];". (For greater amusement, note that the one service that vector<bool>::reference can't ever provide is a "bool* operator&();". That is, it can't provide a suitable address-of operator because there's no way to take the address of the individual bit inside the vector<bool> for which the reference object is serving as proxy.)

#### Proxied Containers and STL Assumptions

Bottom line: std::vector<bool> may be standard, but it's not a container.

Further, both the original STL's container requirements and the C++ standard's container requirements are based on the implicit assumption (among others) that dereferencing an iterator both is a constant-time operation and requires negligible time compared to other operations. As James Kanze correctly pointed out on the comp.std.c++ newsgroup a couple of years ago,[4] neither of these assumptions is true for a disk-based container or a packed-representation container. For a disk-based container, access through the proxy object may require a disk seek, which can be orders of magnitude slower than main memory access. To illustrate, consider that you would be unlikely to apply a standard algorithm like std::find() to a disk-based container; the performance would be abysmal compared to a special-purpose replacement, largely because the fundamental performance assumptions for in-memory containers do not apply to disk-based containers. (For similar reasons, even in-memory containers like std::map provide their own find as a member function.) For a packed-representation container like vector<bool>, access through the proxy object requires bitwise operations, which are generally much slower than manipulation of a native type like an int. Further, whenever the proxy object construction and destruction can't be completely optimized away by the compiler, managing the proxies themselves adds further overhead.

Although vector<bool> was intended to be an example of how to write a proxied container, so that other programmers could follow its example when writing disk-based or other containers whose contained objects cannot reliably be accessed directly, it also serves to demonstrate that the standard's container requirements do not allow proxied containers.

Proxied collections are a useful tool and can often be appropriate, especially for collections that can become very large. Every programmer should know about them. They just don't fit as well into STL as many people thought, that's all. Even vector<bool> is still a perfectly good model for how to write a proxied collection; it's just not a container, that's all, and it should be called something else (some people asked for "bitvector") so that people wouldn't think that it has anything to do with a conforming container.

#### Beware Premature Optimization

If you're a regular reader of this column, you'll already be familiar with my regular harangues against premature optimization.[5] The rules boil down to: "1. Don't optimize early. 2. Don't optimize until you know that it's needed. 3. Even then, don't optimize until you know what needed, and where."

By and large, programmers--that includes you and me--are notoriously bad at guessing the actual space/time performance bottlenecks in their own code. If you don't have performance profiles or other empirical evidence to guide you, you can easily spend days optimizing something that doesn't need optimizing and that won't measurably affect runtime space or time performance. What's even worse, however, is that when you don't understand what needs optimizing you may actually end up pessimizing (degrading your program) by of saving a small cost while unintentionally incurring a large cost.[6] Once you've run performance profiles and other tests, and you actually know that a particular optimization will help you in your particular situation, then it's the right time to optimize.

By now, you can probably see where I'm going with this: The most premature optimization of all is an optimization that's enshrined in the standard. In particular, vector<bool> intentionally favors "less space" at the expense of "slower speed"--and forces this optimization choice on all programs. This optimization choice implicitly assumes that virtually all users of a vector of bools will prefer "less space" at the expense of "potentially slower speed," that they will be more space-constrained than performance-constrained, and so on. This is clearly untrue; on many popular implementations a bool is the same size as an 8-bit char, and a vector of 1,000 bools consumes about 1K of memory. Saving that 1K is unimportant in many applications. (Yes, clearly a 1K space saving is important in some environments, such as embedded systems, and clearly some applications may manipulate vectors containing millions of bools where the saved megabytes are real and significant.) The point is that the correct optimization depends on the application: If you are writing an application that manipulates a vector<bool> with 1,000 entries in an inner loop, it is much more likely that you would benefit from potentially faster raw performance due to reduced CPU workload (without the overhead of proxy objects and bit-fiddling) than from the marginal space savings, even if the space savings would reduce cache misses.

#### So What Should You Do?

If you are using vector<bool> in your code, you may be using it with algorithms that expect a real container (and that hence may or may not work portably), and you are using an optimization that favors space over time (possibly imperceptibly). Either point might be inappropriate for your application. The easy way to discover the first is to use the code until it fails to compile; that day may never come. The only way to discover the second is by running empirical tests to measure your code's performance the way it uses vector<bool>; in many cases, the performance difference compared to something like vector<int>, if there is one, may well be negligible.

If you are being affected by vector<bool>'s non-container-ness, or if the measured performance difference is not negligible in your environment and the difference is material in your application, then don't use vector<bool>. The easiest and most portable solution is to use a workaround to get past the standard's requirements: Use a vector<char> or vector<int> instead (that is, store a bool as a char or an int), and use casts to bool when accessing values in the container.

In summary:

1.       vector<bool> is not a container.

2.       vector<bool> attempts to illustrate how to write standard-conforming proxied containers that do extra work invisibly under the covers. Unfortunately, that's not a sound idea, because although a proxied collection can be an important and useful tool, by definition it must violate the standard's container requirements and therefore can never be a conforming container. (See #1.)

Corollary: Go ahead and write proxied collections, but don't attempt to write ones that still meet the standard's container or sequence requirements. First, it's not possible. Second, the main reason to conform to the standard container requirements is to be used with the standard algorithms, yet the standard algorithms are typically inappropriate for proxied containers because proxied containers have different performance characteristics than plain-and-in-memory containers.

3.       vector<bool>'s name is a trifle misleading because the things inside aren't even standard bools. A standard bool is at least as big as a char, so that it can be used "normally." So, in fact, vector<bool> does not even really store bools, despite the name.

4.       vector<bool> forces a specific optimization on all users by enshrining it in the standard. That's probably not a good idea, even if the actual performance overhead turns out to be negligible for a given compiler for most applications; different users have different requirements.

A last word of advice: Optimize wisely. Never succumb to the temptation to perform premature optimizations, until you have empirical evidence that the optimization is beneficial in your situation.

#### Notes

1. In this article, I'm using different phrases for this term--for example, "pointer into a container," "pointer to an object inside a container," and "pointer to a contained object"--but they are all intended to mean the same thing.

2. If anyone else had written vector<bool>, it would have been called "nonconforming" and "nonstandard."  Well, it's in the standard, so that makes it a little bit harder to call it those names at this point. One correct solution to this discrepancy would be to remove clause 23.2.5, the vector<bool> specialization requirement, so that vector<bool> would be just another instantiation of the primary vector<> template and so would really be what it claims to be: a vector of plain old bools. (Besides, many aspects of vector<bool> are redundant; std::bitset was designed for this kind of packed-representation work.)

3. This is because currently there is no standard way to express a pointer or a reference to a bit.

4. For all the gory details, surf to DejaNews and do a power search for Subject="vector and bool" and Forum="*c++*". The discussions took place in Jan/Feb 1997. You will also find more recent discussions from people asking how to turn off the vector<bool> specialization; see the end of this article for my advice.

5. See also the solution to Guru of the Week #33 (http://www.gotw.ca).

6. For an interesting example of this, see also the article coming this summer in C/C++ Users Journal about "Optimizations That Aren't (In a Multithreaded World)" for a case in point. That article dissects a popular optimization that turns out to be, quite unintentionally, an optimization principally for single-threaded environments, and which can actually be a distinct pessimization for (possibly-)multithreaded code.