Move It With Rvalue References

This entry is part of a series, RValue References: Moving Forward»

This is the second article in a series about efficient value types in C++. In the previous installment, we discussed how copy elision can be leveraged to eliminate many copies we might otherwise make. Copy elision is transparent, happens automatically in natural-looking code, and has almost no downside. So much for the good news; here’s the bad:

  1. Copy elision isn’t mandated by standard, so you can’t write portable code with the assurance that it will take effect.

  2. Sometimes it can’t be done. For example, in

    return q ? var1 : var2;
    the callee can use the memory area passed from the caller for at most one of var1 or var2. If it chooses to store var1 in that area and q turns out to be false, var2 still needs to be copied (and vice-versa).
  3. There are still many opportunities for copy elimination that lie beyond the reach of compilers’ stack allocation tricks.

Slow Shuffle

Many of these other opportunities for optimization occur up when an operation is fundamentally concerned with rearranging data. Take for example a simple generic insertion sort:1

template <class Iter>                                                  
void insertion_sort(Iter first, Iter last)                              
{                                                                       
    if (first == last) return;                                          
 
    Iter i = first;                                                     
    while (++i != last)     // Invariant: elements preceding i are sorted
    {                                                                   
        Iter next = i, prev = i;                                        
        if (*--prev > *i)                                               
        {                                                               
            typename std::iterator_traits<Iter>::value_type x(*next);  
            do *next = *prev;
            while(--next != first && *--prev > x);                      
            *next = x;
        }                                                               
    }                                                                   
}                                                                       
Line 7: Outer Loop Invariant

Line 7: Outer Loop Invariant

Line 12: copy first unsorted element to temp location

Line 12: copy first unsorted element to temp location

Line 13: copy last sorted element forward

Line 13: copy last sorted element forward

Line 13: keep copying forward until position found

Line 13: keep copying forward until position found

Line 15: assign temp into position

Line 15: assign temp into position

Now consider what happens when sorting a sequence of whose elements are std::vector<std::string>s: in lines 12, 13, and 15, we potentially copy a vector of strings, which involves lots of memory allocations and data copying.

Because a sort is a fundamentally data-conserving operation, though, these costs should be avoidable: all we really need to do in principle is to shuffle the objects around in the sequence.

The key thing to notice about these expensive copies is that, in all cases, the value of the source object will never be used again. Sound familiar? Yep, that’s the case when the source is an rvalue, too. In this case, however, the sources are lvalues: objects with known addresses.

What About Reference-Counting?

One popular way to address these inefficiencies is to allocate the elements on the heap and use sequences of reference-counted smart pointers to elements, rather than storing the elements directly. A reference-counted smart pointer is just like a regular pointer, except that it keeps track of how many other reference-counted smart pointers are referencing the object, and deletes the object when the last one goes away. Copying a reference-counted pointer just increments a reference count, and is very fast. Assigning a reference counted increments one reference count and decrements another. It is also very fast.

So, what could be faster? Not counting at all, of course! Also, reference counting has other drawbacks we’d like to avoid:

  1. It can actually be expensive in a multithreaded environment, since the count itself may be shared across threads, thus requiring synchronization.
  2. The approach breaks down in generic code, where the element type might end up being a lightweight type like int. In that case, the addition of reference counting can actually be a significant efficiency cost. You either end up paying that cost, or you have to introduce a complicated framework for deciding which types are lightweight enough to store directly and for accessing the values in a uniform manner.
  3. Reference semantics make code harder to understand. For example:

    typedef std::vector<std::shared_ptr<std::string> > svec;
    …
    svec s2 = s1;
    std::for_each( s2.begin(), s2.end(), to_uppercase() );

    The upper-casing of s2 also modifies the apparent value of s1. This is a bigger subject than we can treat fully here, but in short, when data sharing is hidden, seemingly-local modifications don’t necessarily have purely local effects.

Introducing C++0x Rvalue References

To help us solve these problems, C++0x introduces a new kind of reference, the rvalue reference. An rvalue reference to T is spelled T&& (pronounced “tee ref-ref”), and we now call regular T& references “lvalue references.” The main difference between lvalue and rvalue references for our purposes is that non-const rvalue references can bind to rvalues. Most C++ programmers have encountered an error message like this one at some point:

invalid initialization of non-const reference of type 'X&' 
from a temporary of type 'X'

Such a message usually results from code like

X f();            // call to f yields an rvalue
int g(X&);
int x = g( f() ); // error

The rules say that a non-const (lvalue) reference will bind to an lvalue, but not to a temporary (i.e. to an rvalue). It makes sense in a way, because any modifications made to the temporary through that reference are sure to be lost. 2 By contrast, a non-const rvalue reference will bind to a temporary, but not an lvalue:

X f();
X a;
int g(X&&);
 
int b = g( f() ); // OK
int c = g( a );   // ERROR: can't bind rvalue reference to an lvalue

Stealing Resources

Let’s say our function g() has to store a copy of its argument for later use.

static X cache;
 
int g(X&& a)
{
    cache = a;    // keep it for later
}
 
int b = g( X() ); // call g with a temporary
Depending on the type X, this copy might be a very expensive operation involving memory allocation and deep copies of many subobjects.

Since g()‘s parameter is an rvalue reference, we know that it will automatically bind only to unnamed temporaries, and nothing else. 3 Therefore,

  1. Shortly after we copy this temporary into cache, the source of the copy will be destroyed.
  2. Any modifications we might make to the temporary will be invisible to the rest of the program.

That gives us the ability to perform new kinds of optimizations that avoid work by mutating the value of the temporary. The most common optimization of this kind is resource stealing.

Resource stealing means taking the resources (e.g., memory, large subobjects) from one object and transferring them to another. For example, a String class might have ownership over a buffer of characters allocated on the heap. Copying a String involves allocating a new buffer and copying all of the characters into that new buffer, which is likely to be slow. Stealing from the String, however, requires only that another object take the String‘s buffer and notify the original (source) string that it no longer has a valid buffer—a far more efficient operation. Using rvalue references, we can optimize our code by changing copying from a temporary into stealing from the temporary. And, since only the temporary is affected, this optimization is logically non-mutating.

Insight:

Stealing from (or otherwise modifying) an rvalue reference should be considered a logically non-mutating operation

The Rvalue Overloading Idiom

This insight makes possible a new semantics-preserving program transformation: we can overload any function that takes a (const) reference parameter with another that takes an rvalue reference parameter in the same position:

void g(X const& a) {}    // doesn't mutate argument
void g(X&& a) { modify(a); } // new overload; logically non-mutating

The second overload of g can modify its argument without changing the meaning of the rest of the program as long as it otherwise has the same semantics as the first overload.

Binding And Overloading

The complete C++0x rules for reference binding and overload resolution are summarized in the table below:

Expression→
Reference Type↓
T
rvalue
const T
rvalue
T
lvalue
const T
lvalue
Priority
T&& X 4
const T&& X X 3
T& X 2
const T& X X X X 1

The “Priority” column describes how these references behave with respect to overload resolution. For example, given the following overload set:

void f(int&&);        // #1
void f(const int&&);  // #2
void f(const int&);   // #3
passing an rvalue of type const int to f will invoke overload #2, because overload #1 represents a disallowed binding and #3 appears in a row with lower priority.

Declaring a Movable Type

We can use this idiom to make rvalues of any type implicitly movable using two new operations, move construction and move assignment, that take rvalue reference arguments. For example, a movable std::vector might look like this in C++0x:

template <class T, class A>
struct vector
{
    vector(vector const& lvalue);            // copy constructor
    vector& operator=(vector const& lvalue); // copy assignment operator
    vector(vector&& rvalue);                 // move constructor
    vector& operator=(vector&& rvalue);      // move assignment operator};

It’s the job of a move constructor or assignment operator to “steal” resources from its argument, leaving that argument in a destructible and assignable state.4

In the case of std::vector, that probably means returning the argument to its empty state. A typical implementation of std::vector contains three pointers: one to the beginning of allocated storage, another that points to the the end of the elements in the vector, and a third that points to the end of allocated storage. So, assuming these pointers are all null when the vector is empty, the move constructor might look something like this:

vector(vector&& rhs) 
  : start(rhs.start)                // adopt rhs's storage
  , elements_end(rhs.elements_end)
  , storage_end(rhs.storage_end)
{    // mark rhs as empty.
     rhs.start = rhs.elements_end = rhs.storage_end = 0;
}

And the move assignment operator might go this way:

vector& operator=(vector&& rhs)
{ 
    std::swap(*this, rhs);
    return *this;
}

Since the rvalue argument is going to be destroyed anyway, swapping not only acquires its resources but essentially “schedules” our current resources for later destruction.

Note:

Don’t get too comfortable yet; this move assignment operator is still not quite right.

Rvalue References and Copy Elision

Move construction of std::vector is extremely cheap (approximately 3 loads and 6 stores to memory), but it’s not free. Fortunately, the standard is written so that copy elision (which is truly free) can take priority over move operations. When you pass an rvalue by value, or return anything by value from a function, the compiler first gets the option to elide the copy. If the copy isn’t elided, but the type in question has a move constructor, the compiler is required to use the move constructor. Lastly, if there’s no move constructor, the compiler falls back to using the copy constructor.

For example:

A compute()
{
    A v;return v;
}
  1. If A has an accessible copy or move constructor, the compiler may choose to elide the copy
  2. otherwise, if A has a move constructor, v is moved
  3. otherwise, if A has a copy constructor, v is copied
  4. otherwise, a compile time error is emitted.

Therefore, the guideline from the previous article still applies:

Guideline

Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.

In light of the above guideline, you may now be asking yourself, “aside from move constructors and assignment operators, where would I use the rvalue overloading idiom? Once all my types are movable, what is left to be done?” Examples follow.

Moving From Lvalues

All these move optimizations have one thing in common: they occur when we’re through using the source object. Sometimes, though, we need to give the compiler a hint. For example:

1
2
3
4
5
6
7
8
9
void g(X);
 
void f()
{
    X b;
    g(b);
    …
    g(b);
}

In line 8, we call g with an lvalue, which is ineligible for resource-stealing—even though we’re never going to use b again. To tell the compiler that it can move from b, we can pass it through std::move:

1
2
3
4
5
6
7
8
9
void g(X);
 
void f()
{
    X b;
    g(b);              // still need the value of b
    …
    g( std::move(b) ); // all done with b now; grant permission to move
}

Note that std::move doesn’t itself do any moving. It merely converts its argument into an rvalue reference so that move optimizations can kick in if it is used in a “move-optimized” context. When you see std::move, you should think: “grant permission to move. You can also think of std::move(a) as a descriptive way to write static_cast<X&&>(a).

Boogie Shuffle

Now that we have a way to move from lvalues, we can optimize the insertion_sort algorithm from a few segments back:

template  
void insertion_sort(Iter first, Iter last) 
{ 
    if (first == last) return; 

    Iter i = first; 
    while (++i != last)     // Invariant: [first, i) is sorted 
    { 
        Iter next = i, prev = i; 
        if (*--prev > *i)
        {
            typename std::iterator_traits::value_type 
              x( std::move(*next) );
            do *next = std::move(*prev);
            while(--next != first && *--prev > x);
            *next = std::move(x);
        }
    }
}
Line 12: move first unsorted element to temp location

Line 12: move first unsorted element to temp location

Line 13: move last sorted element forward

Line 13: move last sorted element forward

Line 13: keep moving forward

Line 13: keep moving forward

Line 15: move-assign temp into position

Line 15: move-assign temp into position

Formatting aside, the only difference between this version of the algorithm and the previous one is the addition of calls to std::move. It's worth pointing out that we only need this one implementation of insertion_sort, regardless of whether the element type has a move constructor. This is typical of move-enabled code: the design of rvalue references allows a "move if you can; copy if you must" approach.

Movin' On

That's all for now, but this series will be back (soon, I promise---the material is already written!) to cover rvalue ressurection, exception-safety, perfect forwarding, and more. Oh, yes: and we'll tell you how to write vector's move assignment operator. Ta-ta till then!


Please follow this link to the next installment.


  1. If it seems strange to you that insertion sort is mentioned in the same breath as efficiency, take a look at standard library's implementation of the sort algorithm. You'll probably see that it uses an insertion sort when the sequence to be sorted is short enough. 

  2. Section 3.7 of Bjarne Stroustrup's excellent book, The Design and Evolution of C++ explains the motivation for this rule. 

  3. In early versions of the rvalue reference specification, rvalue references were allowed to bind to lvalues, so if you are experimenting with an early implementation of rvalue references such as the one in GCC 4.3 or 4.4, watch out for this gotcha. 

  4. In general, a moved-from object should be valid and consistent under all normal operations, but the abilities to destroy and assign to the object are a bare minimum for use with the STL. 

Posted Friday, September 11th, 2009 under Value Semantics.

34 Responses to “Move It With Rvalue References”

  1. Kaisha says:

    I see this is an older thread, but I have a question pertaining to this so we’ll try it here ;)

    I played around with rvalue references a while ago, but only recently tried to integrate them into existing code. I have a message passing library I’ve used for some time now and thought that move semantics were perfect for it (as it basically just forwards data around). Unfortunetly I’ve come across a problem. The library dispatches based on type, so you can do something like:

    int a = 5; 
    float b = 10.0f; 
    obj << Message(a,b);

    It will send the message to the object, which will forward it to obj’s appropriate member function for processing. And the Message function is similar to what you would expect, something like:

    template <typename T1, typename T1, ... , typename TN>
    Data<T0,T1, … , TN> 
    Message (const T0& p0, const T1& p1, … , const TN& pN) {}

    I used the boost pre-processor library to create the N instances (sorry, not trying to bore you guys, just trying to be clear). Here-in lies the problem, when I’m defining messages, I need to define the type to ensure dispatch is handled on the proper type (ie. like when passing pointers to polymorphic classes). Now that’s not always the case, but it was the case enough of the time that I defined all the messages like:

    auto MessageSomethingOrAnother = Message<int,float,Obj*>; 
    obj << MessageSomethingOrAnother(5,6,this);

    Makes it clear and easy to catch errors. But now the magical r-value template rules don’t apply, and I can’t make use of move semantics. In fact (as far as I can tell) I’d require 2^n versions of Message just to handle the situation, which kinda defeats one of the major reasons why move semantics were put in, in the 1st place. Now of course I could use:

    obj << Message(static_cast<int>(5),static_cast<float>(6),static_cast<Obj*>(this));

    But that’s certainly harder to work with. Am I missing something here (and I apoligize if I am, I’m certainly no guru)? Is there a way to have perfect forwarding without requiring all the template parameters of a function to be automatically deduced? This seems like a bit of an oversight to me.

      Quote
    • Kaisha says:

      Sorry, messed up the formatting, the above code sections should read:

       
      int a = 5; 
      float b = 10.0f; 
      obj << Message(a,b);
       
      ...
       
      template <typename T1, typename T1, ... , typename TN> Data<T0,T1, … , TN> Message (const T0& p0, const T1& p1, … , const TN& pN) { … }
       
      ...
       
      auto MessageSomethingOrAnother = Message<int,float,Obj*>; 
      obj << MessageSomethingOrAnother(5,6,this);
       
      ...
       
      obj << Message(static_cast<int>(5),static_cast<float>(6),static_cast<Obj*>(this));
       
        Quote
  2. Vincent says:

    There is a subtlety that confuses me: in N2812, and in gcc implementation, the function move is defined as:

    template<typename _Tp>
        inline typename std::remove_reference<_Tp>::type&&
        move(_Tp&& __t)
        { return static_cast<typename std::remove_reference<_Tp>::type&&>(__t); }

    However, as far as I understand, the whole point of N2812 is that only rvalues should bind to T&&. Then only rvalues will bind to std::move(T&&) as declared above. But if std::move can only be passed rvalues, what’s the point of calling it in the first place ?!

      Quote
    • You’re missing the subtlety of the template argument deduction and reference-collapsing rules we describe in a later article. Only rvalues can bind to rvalue references, but when the argument is an lvalue, T is deduced to be U& and T&& is therefore also equivalent to U&, i.e. it is an lvalue reference.

        Quote
  3. Christoph Heindl says:

    The link to the third installment is dead. It should refer to

    http://cpp-next.com/archive/2009/09/making-your-next-move/

    I guess.

      Quote
  4. Doug Gregor says:
    Niels Dekker: You’re welcome, Doug. I read your proposed change of [basic.lval], which would make sure that that particular sentence does not apply when binding an rvalue reference. Cool! Just for my understanding: In what other context would an lvalue still be converted implicitly to an rvalue?

    It happens a lot with built-in types. For example, given:

    int doubleMe(int x) { return 2 * x; }

    The reference to “x” is an lvalue, but it is converted to an rvalue to perform the multiplication.

      Quote
    • Niels Dekker says:
      Doug Gregor wrote: The reference to “x” is an lvalue, but it is converted to an rvalue to perform the multiplication.

      Thanks, Doug. But do you mean that if [basic.lval]/7 would not have been there, one would have to convert an lvalue argument of an int multiplication explicitly? As follows?

      int doubleMe(int x) { return 2 * std::move( x ) ; }
        Quote
  5. Pavel Shevaev says:

    Dave, thanks a lot for the article(and this site in particular)! Would you mention any “move” emulation libraries in your series of articles for those unfortunate souls using non c++0x compatible compiler ? :)

      Quote
  6. Doug Gregor says:
    Niels Dekker: Thanks, Bo. So do you think that that particular quote is still to be removed from the next Standard? (”Whenever an lvalue appears in a context where an rvalue is expected, the lvalue is converted to an rvalue”, [basic.lval].) I wonder why N2844 (the proposed wording by Douglas and David) didn’t propose to remove the quote. Note that N2844 has already been applied to the Working Draft, N2914.

    When writing the wording in N2844, I missed that paragraph in [basic.lval]. It’s clearly wrong (now); I’ll try to get it corrected in an upcoming draft. Thanks!

      Quote
  7. nasos_i says:

    And a sort question before I dip my mind into that: Has anybody dealt with overload combinatorial demands when more than one arguments need to pass in a function? i.e.: f(X const &a, X const &b); f(X const &a, X&& b); f(X&&a, X const &b); f(X&&a, X&& b); …. for more parameters. or you think it will be easy through a traits mechanism? With copy elision do the compilers produce code picking an optimum combination? Thanks in advance!

      Quote
    • Hi Nasos,

      The next article in this series will have a chart of binding/overloading rules that can help answer your question, and we’ll also cover “perfect forwarding,” which addresses the combinatorics issue. As for copy elision, it doesn’t really apply unless you’re passing things by value, and I only see references in your examples there. Would you like to clarify a bit?

        Quote
      • Nasos Iliopoulos says:

        Sorry, I was not as specific as I could.

        Let’s for example say that we have those functions:

        f(someType a, someType b);
        someType g();

        And let’s say that we call f that way:

        someType c;
        f(c, g());

        I am asking if the compiler will copy c into f and pass the output of g with copy elision, or it will do something else.

          Quote
      • Dave Abrahams: Hi Nasos, The next article in this series will have a chart of binding/overloading rules that can help answer your question

        Actually I realized that section belonged in this article, so I added it as an update. Search for “Binding and Overloading”. Cheers!

          Quote
  8. Nasos Iliopoulos says:

    Thanks for article! Was waiting impatiently for such an insight!

      Quote
  9. Rodrigo says:

    BTW, is what #3 says true?. I was under the impression that lvalues were the ones that could bind rvalues (not anymore)

     
    f(int&&);
     
    int main( )
    {
        int a;
        f(a);  // allowed in the past
    }
     

    but not the other way around.

      Quote
    • Rodrigo: BTW, is what #3 says true?

      Yes. N2812 describes the rationale for this change, and N2844, which provides the new standard wording, was accepted.

      I was under the impression that lvalues were the ones that could bind rvalues (not anymore)

      I can’t quite parse the above, but…

      f(int&&);
       
      int main( )
      { int a; f(a);// allowed in the past
      }
       
      but not the other way around.

      …yes, before N2844 was accepted, the above was allowed. It is not allowed any longer.

        Quote
      • Niels Dekker says:

        The C++ Standard says (Lvalues and rvalues, [basic.lval]):

        Whenever an lvalue appears in a context where an rvalue is expected, the lvalue is converted to an rvalue

        It’s still in the latest Working Draft, N2914. Can you please explain why this quote does not apply to f(a), in the code example by Rodrigo? Is this quote outdated by now, given the acceptance of your N2844?

        Thanks in advance, Niels

          Quote
        • Bo Persson says:

          N2914 represents the state of C++ before the latest meeting.

          There are A Lot of changes to be made to that document, including the removal of concepts, and adding what was accepted at the meeting. The result of this work has not been published yet.

            Quote
          • Niels Dekker says:

            Thanks, Bo. So do you think that that particular quote is still to be removed from the next Standard? (“Whenever an lvalue appears in a context where an rvalue is expected, the lvalue is converted to an rvalue”, [basic.lval].) I wonder why N2844 (the proposed wording by Douglas and David) didn’t propose to remove the quote. Note that N2844 has already been applied to the Working Draft, N2914.

              Quote
  10. Rodrigo says:

    in the Stealing Resources example, cache = a must be cache = std::move(a). Once a variable is named, it is an lvalue.

      Quote
  11. Eelis says:

    That operator= should return *this.

      Quote

Leave a Comment (post replies using links below individual comments)

Spam Protection by WP-SpamFree

Subscribe without commenting