GotW #25: SPECIAL EDITION: auto_ptr

This is the original GotW problem and solution substantially as posted to Usenet. See the book Exceptional C++ (Addison-Wesley, 2000) for the most current solutions to GotW issues #1-30. The solutions in the book have been revised and expanded since their initial appearance in GotW. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

SPECIAL EDITION: auto_ptr
Difficulty: 8 / 10

This GotW covers basics about how you can use the standard auto_ptr safely and effectively. (This GotW Special Edition was written in honor of the voting out of the Final Draft International Standard for Programming Language C++, which included a last-minute auto_ptr change.)

Problem

Comment on the following code: What's good, what's safe, what's legal, and what's not?

    auto_ptr<T> source() { return new T(1); }
    void sink( auto_ptr<T> pt ) { }

    void f() {
        auto_ptr<T> a( source() );
        sink( source() );
        sink( auto_ptr<T>( new T(1) ) );

        vector< auto_ptr<T> > v;
        v.push_back( new T(3) );
        v.push_back( new T(4) );
        v.push_back( new T(1) );
        v.push_back( a );
        v.push_back( new T(2) );
        sort( v.begin(), v.end() );

        cout << a->Value();
    }

    class C {
    public:    /*...*/
    protected: /*...*/
    private:
        auto_ptr<CImpl> pimpl_;
    };

Solution

Comment on the following code: What's good, what's safe, what's legal, and what's not?

STANDARDS UPDATE: This week [the week this GotW was posted], at the WG21/J16 meeting in Morristown NJ USA, the Final Draft International Standard (FDIS) for Programming Language C++ was voted out for balloting by national bodies. We expect to know by the next meeting (Nice, March 1998) whether it has passed and will become an official ISO Standard.

This GotW was posted knowing that auto_ptr was going to be refined at the New Jersey meeting in order to satisfy national body comments. This Special Edition of GotW covers the final auto_ptr, how and why it has been made safer and easier to use, and how to use it best.

In summary:

1. All legitimate uses of auto_ptr work as before, except that you can't use (i.e., dereference) a non-owning auto_ptr.

2. The dangerous abuses of auto_ptr have been made illegal.

SOME WELL-DESERVED ACKNOWLEDGMENTS: Many thanks from all of us to Bill Gibbons, Greg Colvin, Steve Rumsby, and others who worked hard on the final refinement of auto_ptr. Greg in particular has laboured over auto_ptr and related classes for many years to satisfy various committee concerns and requirements, and deserves public recognition for that work.

Background

The original motivation for auto_ptr was to make code like the following safer:

    void f() {
        T* pt( new T );
        /*...more code...*/
        delete pt;
    }

If f() never executes the delete statement (either because of an early return or by an exception thrown in the function body), the allocated object is not deleted and we have a classic memory leak.

A simple way to make this safe is to wrap the pointer in a "smarter" pointer-like object which owns the pointer and which, when destroyed, deletes the pointer automatically:

    void f() {
        auto_ptr<T> pt( new T );
        /*...more code...*/
    } // cool: pt's dtor is called as it goes out of
      // scope, and the allocated object is deleted

Now the code will not leak the T object, no matter whether the function exits normally or by means of an exception, because pt's destructor will always be called during stack unwinding. Similarly, auto_ptr can be used to safely wrap pointer data members [note: there are important safety details not mentioned in this GotW; see later GotW issues including GotW #62, and the book Exceptional C++]:

    // file c.h
    class C {
    public:
        C();
        /*...*/
    private:
        auto_ptr<CImpl> pimpl_;
    };

    // file c.cpp
    C::C() : pimpl_( new CImpl ) { }

Now the destructor need not delete the pimpl_ pointer, since the auto_ptr will handle it automatically. We'll revisit this example at the end.
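To see the Background argument in running form, here is a minimal sketch. It deliberately avoids the library's own auto_ptr: scoped_owner, Widget, f_safe() and live_after_f() are hypothetical names made up for this illustration, and scoped_owner models only the "delete on scope exit" behaviour described above, nothing else.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical payload type that counts how many instances are alive.
struct Widget {
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Hypothetical stand-in for the owning behaviour of auto_ptr:
// it deletes the held object when the owner itself is destroyed.
template <class T>
class scoped_owner {
public:
    explicit scoped_owner(T* p) : p_(p) {}
    ~scoped_owner() { delete p_; }
private:
    scoped_owner(const scoped_owner&);             // not copyable in this sketch
    scoped_owner& operator=(const scoped_owner&);
    T* p_;
};

// Same shape as f() in the text, with an optional early exit by exception.
void f_safe(bool fail) {
    scoped_owner<Widget> pt(new Widget);
    assert(Widget::live == 1);
    if (fail) throw std::runtime_error("early exit");  // pt's dtor still runs
}

// Runs f_safe(), swallows its exception, and reports how many Widgets
// are still alive afterwards (0 means nothing leaked).
int live_after_f(bool fail) {
    try { f_safe(fail); } catch (const std::runtime_error&) { }
    return Widget::live;
}
```

Calling live_after_f(false) takes the normal path and live_after_f(true) takes the exception path; in both cases the count of live Widget objects is back to zero afterwards.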

Sources and Sinks

This is cool stuff all by itself, but it gets better. Based on Greg Colvin's work and experience at Taligent, people noticed that if you defined copying for auto_ptrs then it would be very useful to pass them to and from functions, as function parameters and return values.

This is in fact the way auto_ptr worked in the second committee draft (Dec 1996), with the semantics that the act of copying an auto_ptr transfers ownership from the source to the target. After the copy, only the target auto_ptr "owned" the pointer and would delete it in due time, while the source also still contained the same pointer but did not "own" it and therefore would not delete it (else we'd have a double delete). You could still use the pointer through either an owning or a non-owning auto_ptr object.

For example:

    void f() {
        auto_ptr<T> pt1( new T );
        auto_ptr<T> pt2;

        pt2 = pt1;  // now pt2 owns the pointer, and
                    // pt1 does not

        pt1->DoSomething(); // ok (before last week)
        pt2->DoSomething(); // ok

    } // as we go out of scope, pt2's dtor deletes the
      // pointer, but pt1's does nothing

This gets us to the first part of the GotW code [1]:

    auto_ptr<T> source() { return auto_ptr<T>( new T(1) ); }
    void sink( auto_ptr<T> pt ) { }

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes         Yes
  | Safe?      Yes         Yes

This demonstrates exactly what the people at Taligent had in mind:

1. source() allocates a new object and returns it to the caller in a completely safe way, by letting the caller assume ownership of the pointer. Even if the caller ignores the return value (of course, you would never write code that ignores return values, right?), the allocated object will always be safely deleted.

See also GotW #21, which demonstrates why this is an important idiom, since returning a result by wrapping it in an auto_ptr is sometimes the only way to make a function strongly exception-safe.

2. sink() takes an auto_ptr by value and therefore assumes ownership of it. When sink() is done, the deletion is done (as long as sink() itself hasn't handed off ownership to someone else). Since the sink() function as written above has an empty body, calling "sink( a );" is a fancy way of writing "a.reset(0);": the object is deleted and a no longer holds it.

The next piece of code shows source() and sink() in action:

    void f() {
        auto_ptr<T> a( source() );

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes         Yes
  | Safe?      Yes         Yes

Here f() takes ownership of the pointer received from source(), and (ignoring some problems later in f()) it will delete it automatically when the automatic variable a goes out of scope. This is fine, and it's exactly how passing back an auto_ptr by value is meant to work.

        sink( source() );

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes         Yes
  | Safe?      Yes         Yes

Given the trivial (i.e., empty) definitions of source() and sink() here, this is just a fancy way of writing "delete new T(1);". So is it really useful? Well, if you imagine source() as a nontrivial factory function and sink() as a nontrivial consumer, then yes, it makes a lot of sense and crops up regularly in real-world programming.

        sink( auto_ptr<T>( new T(1) ) );

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes         Yes
  | Safe?      Yes         Yes

Again, a fancy way of writing "delete new T(1);", and a useful idiom when sink() is a nontrivial consumer function that takes ownership of the pointed-to object.
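The source/sink idiom with a nontrivial factory and consumer might look like the sketch below. It is an illustration only: transfer_ptr is a hypothetical stand-in that mimics the final auto_ptr copy rules (copying hands over ownership and nulls the source), and Widget, the int parameter of source(), the int result of sink(), and live_after_pipeline() are likewise invented here. The constructor that accepts a temporary is an assumption of this sketch and needs a compiler with rvalue references; the real auto_ptr gets the same effect through a helper conversion type.

```cpp
#include <cassert>

// Hypothetical payload type that counts how many instances are alive.
struct Widget {
    static int live;
    int id;
    explicit Widget(int i) : id(i) { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Hypothetical stand-in: copying transfers ownership and nulls the source.
template <class T>
class transfer_ptr {
public:
    explicit transfer_ptr(T* p = 0) : p_(p) {}
    transfer_ptr(transfer_ptr& other) : p_(other.release()) {}
    // Accepts temporaries such as function return values (sketch-only shortcut).
    transfer_ptr(transfer_ptr&& other) : p_(other.release()) {}
    ~transfer_ptr() { delete p_; }
    T* operator->() const { return p_; }
    T* release() { T* p = p_; p_ = 0; return p; }
private:
    transfer_ptr& operator=(const transfer_ptr&);  // omitted in this sketch
    T* p_;
};

// Factory: the caller receives ownership of the new object.
transfer_ptr<Widget> source(int id) {
    return transfer_ptr<Widget>(new Widget(id));
}

// Consumer: takes ownership; the object dies when pt goes out of scope.
int sink(transfer_ptr<Widget> pt) {
    return pt->id;
}

// Runs the pipeline and reports how many Widgets are still alive.
int live_after_pipeline() {
    int id = sink(source(7));   // created, handed over, destroyed
    assert(id == 7);
    source(8);                  // an ignored return value is cleaned up too
    return Widget::live;
}
```

No delete appears anywhere in the calling code, yet live_after_pipeline() finds no Widget left alive, including the one whose return value was ignored.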

Things Not To Do, and Why Not To Do Them

"So," you say, "that's cool, and obviously supporting auto_ptr copying is a Good Thing." Well, yes, it is, but it turns out that it can also get you into hot water where you least expect it, and that's why the national body comments objected to leaving auto_ptr in the CD2 form. Here's the fundamental issue, and I'll highlight it to make sure it stands out:

For auto_ptr, copies are NOT equivalent.

It turns out that this has important effects when you try to use auto_ptr with generic code that does make copies and isn't necessarily aware that copies aren't equivalent (after all, usually copies are!). Consider:

        vector< auto_ptr<T> > v;

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes          No
  | Safe?       No          No

This is the first indication of trouble, and one of the things the national body comments wanted fixed. In short, even though a compiler wouldn't burp a single warning here, auto_ptrs are NOT safe to put in containers. This is because we have no way of warning the container that copying auto_ptrs has unusual semantics (transferring ownership, changing the right-hand side's state). True, today most implementations I know about will let you get away with this, and code nearly identical to this even appears as a "good example" in the documentation of certain popular compilers. Nevertheless, it was actually unsafe (and is now illegal).

The problem is that auto_ptr does not quite meet the requirements of a type you can put into containers, for copies of auto_ptrs are not equivalent. For one thing, there's nothing that says a vector can't just decide to up and make an "extra" internal copy of some object it contains. Sure, normally you can expect vector not to do this (simply because making extra copies happens to be unnecessary and inefficient, and for competitive reasons a vendor is unlikely to ship a library that's needlessly inefficient), but it's not guaranteed and so you can't rely on it.

But hold on, because it's about to get worse:

        v.push_back( auto_ptr<T>( new T(3) ) );
        v.push_back( auto_ptr<T>( new T(4) ) );
        v.push_back( auto_ptr<T>( new T(1) ) );
        v.push_back( a );

(Aside: Note that copying a into the vector means that the 'a' object no longer owns the pointer it's carrying. More on that in a moment.)

        v.push_back( auto_ptr<T>( new T(2) ) );
        sort( v.begin(), v.end() );

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes          No
  | Safe?       No          No

Here's the real devil, and another reason why the national body comment was more than just a suggestion (the body in question actually voted No on CD2 largely because of this problem). When you call generic functions that will copy elements, like sort() does, the functions have to be able to assume that copies are going to be equivalent. For example, at least one popular sort internally takes a copy of a "pivot" element, and if you try to make it work on auto_ptrs it will merrily take a copy of the pivot auto_ptr object (thereby taking ownership and putting it in a temporary auto_ptr on the side), do the rest of its work on the sequence (including taking further copies of the now-non-owning auto_ptr that was picked as a pivot value), and when the sort is over the pivot is destroyed and you have a problem: at least one auto_ptr in the sequence (the one that was a copy of the pivot value) no longer owns the pointer it holds, and in fact the pointer it holds has already been deleted!

The problem with the auto_ptr in CD2 is that it gave you no protection -- no warning, nothing -- against innocently writing code like this. The national body comment required that auto_ptr be refined to either get rid of the unusual copy semantics or else make such dangerous code uncompilable, so that the compiler itself could stop you from doing the dangerous things, like making a vector of auto_ptrs or trying to sort it.
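The pivot scenario can be reproduced in miniature. The sketch below is not the CD2 library code: cd2_ptr is a hypothetical class that assumes the CD2 behaviour described earlier (a copy takes the ownership flag but the source keeps the raw pointer), and Tracked and element_dangles_after_pivot_copy() are invented for the illustration. It never dereferences the stale pointer; it only checks the bookkeeping.

```cpp
#include <cassert>
#include <vector>

// Hypothetical payload type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Hypothetical model of the CD2 rules: copying moves the ownership flag,
// but the source still carries the same raw pointer.
template <class T>
class cd2_ptr {
public:
    explicit cd2_ptr(T* p = 0) : p_(p), owns_(p != 0) {}
    cd2_ptr(const cd2_ptr& other) : p_(other.p_), owns_(other.owns_) {
        other.owns_ = false;
    }
    cd2_ptr& operator=(const cd2_ptr& other) {
        if (this != &other) {
            if (owns_) delete p_;
            p_ = other.p_;
            owns_ = other.owns_;
            other.owns_ = false;
        }
        return *this;
    }
    ~cd2_ptr() { if (owns_) delete p_; }
    bool owns() const { return owns_; }
private:
    T* p_;
    mutable bool owns_;   // mutable so that a const source can give it up
};

// Mimics a sort that copies a pivot by value and later lets the copy die.
bool element_dangles_after_pivot_copy() {
    std::vector< cd2_ptr<Tracked> > v;
    v.push_back(cd2_ptr<Tracked>(new Tracked));
    assert(v[0].owns() && Tracked::live == 1);
    {
        cd2_ptr<Tracked> pivot(v[0]);   // ownership silently leaves the vector
    }                                   // pivot dies and deletes the object
    // The element still holds its pointer, but the object behind it is gone.
    return !v[0].owns() && Tracked::live == 0;
}
```

The function returns true: after the "pivot" copy is destroyed, the vector element is left holding a pointer to an object that no longer exists, which is exactly the trap the national body comment wanted closed.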

The Scoop on Non-Owning auto_ptrs

        // (after having copied a to another auto_ptr)
        cout << a->Value();
    }

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes          No
  | Safe?     (Yes)         No

(We'll assume that a was copied, but that its pointer wasn't deleted by the vector or the sort.) Under CD2 this was fine, since even though a no longer owns the pointer, it would still contain a copy of it; a just wouldn't call delete on its pointer when a itself goes out of scope, that's all, because it would know that it doesn't own the pointer.

Now, however, copying an auto_ptr not only transfers ownership but resets the source auto_ptr to null. This is done specifically to avoid letting anyone do anything through a non-owning auto_ptr. Under the final rules, then, using a non-owning auto_ptr like this is not legal and will result in undefined behaviour (typically a core dump on most systems).

In short:

    void f() {
        auto_ptr<T> pt1( new T );
        auto_ptr<T> pt2( pt1 );
        pt1->Value(); // using a non-owning auto_ptr...
                      //  this used to be legal, but is
                      //  now undefined behaviour (pt1
                      //  has been reset to null)
        pt2->Value(); // ok
    }
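If code has to cope with an auto_ptr that may already have been copied from, the defensive move under the final rules is to test for null before dereferencing. The following is a sketch of that check against a hypothetical stand-in, not against the library class: transfer_ptr, holds_object() and value_after_transfer() are names invented here, and transfer_ptr assumes only the "copy transfers ownership and nulls the source" rule stated above.

```cpp
#include <cassert>

// Hypothetical stand-in: copying transfers ownership and nulls the source.
template <class T>
class transfer_ptr {
public:
    explicit transfer_ptr(T* p = 0) : p_(p) {}
    transfer_ptr(transfer_ptr& other) : p_(other.p_) { other.p_ = 0; }
    ~transfer_ptr() { delete p_; }
    T* get() const { return p_; }
    T& operator*() const { return *p_; }
private:
    transfer_ptr& operator=(const transfer_ptr&);  // omitted in this sketch
    T* p_;
};

// True when p still holds an object, i.e. when dereferencing it is safe.
template <class T>
bool holds_object(const transfer_ptr<T>& p) {
    return p.get() != 0;
}

int value_after_transfer() {
    transfer_ptr<int> pt1(new int(42));
    transfer_ptr<int> pt2(pt1);      // ownership moves; pt1 becomes null
    assert(!holds_object(pt1));      // *pt1 here would be undefined behaviour
    assert(holds_object(pt2));
    return *pt2;                     // the new owner is fine to use
}
```

value_after_transfer() returns 42 through pt2, while pt1 reports that it holds nothing.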

This brings us to the last common usage of auto_ptr:

Wrapping Pointer Members

    class C {
    public:    /*...*/
    protected: /*...*/
    private:
        auto_ptr<CImpl> pimpl_;
    };

[Note: there are important safety details not mentioned in this GotW; see later GotW issues including GotW #62, and the book Exceptional C++.]

  SUMMARY
  |         Before NJ   After NJ
  | Legal?     Yes         Yes
  | Safe?      Yes         Yes

auto_ptrs always were and still are useful for encapsulating pointer member variables. This works very much like our motivating example in the "Background" section at the beginning, except that instead of saving us the trouble of doing cleanup at the end of a function, it now saves us the trouble of doing cleanup in C's destructor.

There is still a caveat, of course: just as if you were using a bald pointer data member instead of an auto_ptr member, you MUST supply your own copy constructor and copy assignment operator for the class (even if you disable them by making them private and undefined), because the default ones will do the wrong thing.
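One way to satisfy that caveat is to give C deep-copy semantics, so that each C object keeps its own CImpl. The sketch below is one possible shape, not the only one: transfer_ptr is a hypothetical stand-in for an ownership-transferring pointer, and the int payload in CImpl, the value()/set() members, and copies_are_independent() are all invented for the illustration.

```cpp
#include <cassert>

// Hypothetical stand-in: copying transfers ownership and nulls the source.
template <class T>
class transfer_ptr {
public:
    explicit transfer_ptr(T* p = 0) : p_(p) {}
    transfer_ptr(transfer_ptr& other) : p_(other.p_) { other.p_ = 0; }
    ~transfer_ptr() { delete p_; }
    T* get() const { return p_; }
    void reset(T* p) { if (p != p_) { delete p_; p_ = p; } }
private:
    transfer_ptr& operator=(transfer_ptr&);  // omitted in this sketch
    T* p_;
};

// Hypothetical implementation class with a single data member.
struct CImpl {
    int value;
    explicit CImpl(int v) : value(v) {}
};

class C {
public:
    explicit C(int v) : pimpl_(new CImpl(v)) {}

    // Deep copy: the new C gets its own CImpl instead of taking other's.
    C(const C& other) : pimpl_(new CImpl(*other.pimpl_.get())) {}

    // The new CImpl is built first, so a failed allocation leaves *this intact.
    C& operator=(const C& other) {
        if (this != &other) pimpl_.reset(new CImpl(*other.pimpl_.get()));
        return *this;
    }

    int  value() const { return pimpl_.get()->value; }
    void set(int v)    { pimpl_.get()->value = v; }
private:
    transfer_ptr<CImpl> pimpl_;
};

// Encodes the three values as digits: a, then b, then c.
int copies_are_independent() {
    C a(1);
    C b(a);       // a keeps its CImpl
    b.set(2);     // changing b must not affect a
    C c(3);
    c = a;
    return a.value() * 100 + b.value() * 10 + c.value();
}
```

copies_are_independent() returns 121: a is still 1 after being copied from twice, b carries its own modified value 2, and c received a's value 1.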

News Flash: The "const auto_ptr" Idiom

Now that we've waded through the deeper stuff, here's a technique you'll find interesting. Among its other benefits, the refinement to auto_ptr also makes copying const auto_ptrs illegal. That is:

    const auto_ptr<T> pt1( new T );
        // making pt1 const guarantees that pt1 can
        // never be copied to another auto_ptr, and
        // so is guaranteed to never lose ownership

    auto_ptr<T> pt2( pt1 ); // illegal
    auto_ptr<T> pt3;
    pt3 = pt1;              // illegal

This "const auto_ptr" idiom is one of those things that's likely to become a commonly used technique, and now you can say that you knew about it since the beginning.
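The reason a const object cannot be copied from is simply that the copy operations take their argument by reference to non-const. The sketch below checks that at compile time on a hypothetical stand-in (transfer_ptr and read_through_const_owner() are invented names, not the library class), and it assumes a compiler that provides <type_traits> and static_assert.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in: both copy operations take a non-const reference,
// because they have to modify (null out) their source.
template <class T>
class transfer_ptr {
public:
    explicit transfer_ptr(T* p = 0) : p_(p) {}
    transfer_ptr(transfer_ptr& other) : p_(other.p_) { other.p_ = 0; }
    transfer_ptr& operator=(transfer_ptr& other) {
        if (this != &other) { delete p_; p_ = other.p_; other.p_ = 0; }
        return *this;
    }
    ~transfer_ptr() { delete p_; }
    T& operator*() const { return *p_; }
private:
    T* p_;
};

typedef transfer_ptr<int> IntPtr;

// A non-const object can be copied from (and loses ownership)...
static_assert(std::is_constructible<IntPtr, IntPtr&>::value,
              "copy from non-const should compile");
// ...but a const one cannot be the source of a copy or an assignment.
static_assert(!std::is_constructible<IntPtr, const IntPtr&>::value,
              "copy from const must be rejected");
static_assert(!std::is_assignable<IntPtr&, const IntPtr&>::value,
              "assignment from const must be rejected");

// A const owner can still be used normally; it just can't give the object away.
int read_through_const_owner() {
    const IntPtr pt(new int(5));
    return *pt;
}
```

The static_asserts pass at compile time, and read_through_const_owner() returns 5, showing that constness blocks the transfer of ownership but not ordinary use of the pointed-to object.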

I hope you enjoyed this Special Edition of GotW, posted in honour of the voting out of ISO Final Draft International Standard C++ [in November 1997].

Notes

1. In the original question, I forgot that there is no conversion from T* to auto_ptr<T> because the constructor is "explicit". The code quoted in the solution is fixed. (That's what I get for dashing this off near midnight on Friday before rushing to New Jersey!)
