Be honest: how does the following code make you feel?
std::vector<std::string> get_names(); … std::vector<std::string> const names = get_names();
Frankly, even though I should know better, it makes me nervous. In
principle, when get_names() returns, we have to copy a vector of
strings. Then, we need to copy it again when we initialize
names, and we need to destroy the first copy. If there are N
strings in the vector, each copy could require as many as N+1 memory
allocations and a whole slew of cache-unfriendly data accesses as the
string contents are copied.
Rather than confront that sort of anxiety, I’ve often fallen back on pass-by-reference to avoid needless copies:
get_names(std::vector<std::string>& out_param ); … std::vector<std::string> names; get_names( names );
Unfortunately, this approach is far from ideal.
- The code grew by 150%
- We’ve had to drop
const-ness because we’re mutatingnames. - As functional programmers like to remind us, mutation makes code more complex to reason about by undermining referential transparency and equational reasoning.
- We no longer have strict value semantics1 for
names.
But is it really necessary to mess up our code in this way to gain efficiency? Fortunately, the answer turns out to be no (and especially not if you are using C++0x). This article is the first in a series that explores rvalues and their impliciations for efficient value semantics in C++.
RValues
Rvalues are expressions that create anonymous temporary objects. The name rvalue refers to the fact that an rvalue expression of builtin type can only appear on the right-hand side of an assignment. Unlike lvalues, which can always be used on the left-hand-side of an assignment, rvalue expressions yield objects without any persistent identity to assign into.2
The important thing about anonymous temporaries for our purposes, though, is that they can only be used once in an expression. How could you possibly refer to such an object a second time? It doesn’t have a name (thus, “anonymous”); and after the full expression is evaluated, the object is destroyed (thus, “temporary”)!
Once you know you are copying from an rvalue, then, it should be
possible to “steal” the expensive-to-copy resources from the source
object and use them in the target object without anyone noticing. In
this case that would mean transferring ownership of the source vector’s
dynamically-allocated array of strings to the target vector. If we
could somehow get the compiler to execute that “move” operation for
us, it would be cheap–almost free–to initialize names from a
vector returned by-value.
That would take care of the second expensive copy, but what about the
first? When get_names returns, in principle, it has to copy the
function’s return value from the inside of the function to the
outside. Well, it turns out that return values have the same property
as anonymous temporaries: they are about to be destroyed, and won’t be
used again. So, we could eliminate the first expensive copy in the
same way, transferring the resources from the return value on the
inside of the function to the anonymous temporary seen by the caller.
Copy Elision and the RVO
The reason I kept writing above that copies were made “in principle” is that the compiler is actually allowed to perform some optimizations based on the same principles we’ve just discussed. This class of optimizations is known formally as copy elision. For example, in the Return Value Optimization (RVO), the calling function allocates space for the return value on its stack, and passes the address of that memory to the callee. The callee can then construct a return value directly into that space, which eliminates the need to copy from inside to outside. The copy is simply elided, or “edited out,” by the compiler. So in code like the following, no copies are required:
std::vector<std::string> names = get_names();
Also, although the compiler is normally required to make a copy when a function parameter is passed by value (so modifications to the parameter inside the function can’t affect the caller), it is allowed to elide the copy, and simply use the source object itself, when the source is an rvalue.
1 2 3 4 5 6 7 8 9 10 11 12 | std::vector<std::string> sorted(std::vector<std::string> names) { std::sort(names); return names; } // names is an lvalue; a copy is required so we don't modify names std::vector<std::string> sorted_names1 = sorted( names ); // get_names() is an rvalue expression; we can omit the copy! std::vector<std::string> sorted_names2 = sorted( get_names() ); |
This is pretty remarkable. In principle, in line 12 above, the
compiler can eliminate all the worrisome copies, making
sorted_names2 the same object as the one created in get_names().
In practice, though, the principle won’t take us quite that far, as
I’ll explain later.
Implications
Although copy elision is never required by the standard, recent versions of every compiler I’ve tested do perform these optimizations today. But even if you don’t feel comfortable returning heavyweight objects by value, copy elision should still change the way you write code.
Consider this cousin of our original sorted(…) function, which takes
names by const reference and makes an explicit copy:
std::vector<std::string> sorted2(std::vector<std::string> const& names) // names passed by reference { std::vector<std::string> r(names); // and explicitly copied std::sort(r); return r; }
Although sorted and sorted2 seem at first to be identical, there
could be a huge performance difference if a compiler does copy
elision. Even if the actual argument to sorted2 is an rvalue, the
source of the copy, names, is an lvalue,3 so the copy can’t
be optimized away. In a sense, copy elision is a victim of the
separate compilation model: inside the body of sorted2, there’s no
information about whether the actual argument to the function is an
rvalue; outside, at the call site, there’s no indication that a copy
of the argument will eventually be made.
That realization leads us directly to this guideline:
Guideline: Don’t copy your function arguments. Instead, pass them by value and let the compiler do the copying.
At worst, if your compiler doesn’t elide copies, performance will be no worse. At best, you’ll see an enormous performance boost.
One place you can apply this guideline immediately is in assignment operators. The canonical, easy-to-write, always-correct, strong-guarantee, copy-and-swap assignment operator is often seen written this way:
T& T::operator=(T const& x) // x is a reference to the source { T tmp(x); // copy construction of tmp does the hard work swap(*this, tmp); // trade our resources for tmp's return *this; // our (old) resources get destroyed with tmp }
but in light of copy elision, that formulation is glaringly inefficient! It’s now “obvious” that the correct way to write a copy-and-swap assignment is:
T& operator=(T x) // x is a copy of the source; hard work already done { swap(*this, x); // trade our resources for x's return *this; // our (old) resources get destroyed with x }
Reality Bites
Of course, lunch is never really free, so I have a couple of caveats.
First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality are serious considerations for your application, it could have a real effect.
On the other hand, it’s easy to build a wrapper function that localizes the copy:
std::vector<std::string> sorted3(std::vector<std::string> const& names) { // copy is generated once, at the site of this call return sorted(names); }
Since the converse doesn’t hold—you can’t get back a lost opportunity for copy elision by wrapping—I recommend you start by following the guideline, and make changes only as you find them to be necessary.
Second, I’ve yet to find a compiler that will elide the copy when a
function parameter is returned, as in our implementation of sorted.
When you think about how these elisions are done, it makes sense:
without some form of inter-procedural optimization, the caller of
sorted can’t know that the argument (and not some other object) will
eventually be returned, so the compiler must allocate separate space on
the stack for the argument and the return value.
If you need to return a function parameter, you can still get near-optimal performance by swapping into a default-constructed return value (provided default construction and swap are cheap, as they should be):
std::vector<std::string> sorted(std::vector<std::string> names) { std::sort(names); std::vector<std::string> ret; swap(ret, names); return ret; }
More To Come
Hopefully you now have the ammunition you need to stave off anxiety about passing and returning nontrivial objects by value. But we’re not done yet: now that we’ve covered rvalues, copy elision, and the RVO, we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!
Follow this link to the next installment.
Acknowledgements
Howard Hinnant is responsible for key insights that make this article series possible. Andrei Alexandrescu was posting on comp.lang.c++.moderated about how to leverage copy elision years before I took it seriously. Most of all, though, thanks in general to all readers and reviewers!
-
Googling for a good definition of value semantics turned up nothing for me. Unless someone else can point to one (and maybe even if they can), we’ll be running an article on that topic—in which I promise you a definition—soon. ↩
-
For a detailed treatment of rvalues and lvalues, please see this excellent article by Dan Saks ↩
-
Except for
enums, every value with a name is an lvalue. ↩

So what’s wrong with the following (note the &)?
(Quote)I think you meant
Assuming get_names returns by value (and not a reference) and you’re dealing with a “good” compiler there should not be a difference between the two. In the first case (#1) the compiler can make the function construct its return value directly in that space that will be referred to by the name “names”. In #2 the return value that lives on the stack gets an extension of its life-time and a reference to it is created. A good compiler doesn’t even need to allocate space for the reference but I’m not sure if most compilers are that smart (even though they could be).
If get_names returns a reference it makes a big difference, of course. Is the following code safe?
string const& blah = string("123") + "456";Currently, it is safe. It’s not safe C++0x (according to the current draft) because operator+(string&&,char const*) returns an rvalue reference –> You get a dangling reference.
In my opinion, people should not write #2 instead of #1 just because they think they can safe a copy. Many current compilers successfully elide the copy in #1. Also, they should not declare functions that return rvalue references (I really hope operator+(string&&,char const*) and others will be fixed) for “temporary recycling” because it opens up the possibility of dangling references.
Cheers! SG
(Quote)Trick question? It’s a syntax error: you can’t cv-qualify the reference itself.
(Quote)Thanks for the article, Dave!
A very minor remark: your recommended copy-and-swap assignment implementation cannot do a fast assignment to itself. Fast self-assignment could be achieved by adding an extra check to the “canonical” version:
T& T::operator=(T const& x) {
}
Of course, the speed of self-assignment is rarely relevant. But I find it slightly counter-intuitive, having a self-assignment that might fail! But if a user doesn’t even have enough memory to assign something to itself, she’s probably in deep trouble anyway!
(Quote)…which would penalize the usual case and complicate the code in order to optimize a rare case, which is almost always a bad idea.
These self-assignments never have the form
(Quote)x = xanyway (nobody knowingly does a self-assignment except in test suites). They’re almost alwaysx = ycases, where x and y may or may not refer to the same object. That means we’re in code that has to cope with an exception anyhow. There’s really zero advantage in making self-assignment a no-throw operation.Great article, but could you also include a conclusion with an example of what we should do and not do (i.e. best practices?). Your article seems more like a discussion than a “good rule to follow”, which makes it difficult to pick out the important parts. But definitely thanks for the article. Keep posting more.
(Quote)Thanks for the feedback! Speaking generally, I think the popular C++ literature is long on rules and short on insight, and I’m not naturally inclined to boil things down to a prescription, so it’s great to know when the important stuff doesn’t stand out.
But it’s just a guideline; as I explain in this comment, you’ll end up generating bigger code if you often pass lvalues that way, and bigger code, in some circumstances can be slower code—or it can just be unacceptable because of its size. So maybe you can see why I don’t dispense a lot of rules.
(Quote)That guideline appears a bit wrong. For example, with that kind of code
void vector::push_back(const T& t) { ... new(buffer) T(t); ... }It would be silly to pass t by value.
So I believe it should be something like:
(Quote)See my reply here.
(Quote)Or rather, pass by value any arguments that you would otherwise copy explicitly and which you don’t want to control the storage of the copy of.
Indeed, for something like vector::push_back, passing by value is useless, even though you want to explicitly copy the argument.
(Quote)It’s not entirely useless if you know something about the type being passed, like how to move-construct it or that it can be emulated cheaply with default construct and swap
(Quote)It seems that your new style copy assignment operator don’t mix well with rvalue reference in MSVC10 and in gcc 4.4 :
Hold your horses, people! All (well, much) will be revealed when we cover rvalue refs in the next installment.
(Quote)Hi, This is a couple of unconnected thoughts that hauted me after reading your article.
(1) The copy-and-swap idiom. It implements a copy in terms of swap. Then, the default swap function is implemented in terms of three copies. It is all fine if you always implement customized swap for your classes, but if you don’t you are risking infinite recursion. Wouldn’t it be safer if the idiom was always accompained by the note saying that you need to implement swap too?
(2) In C++0x we will have lval references, rval references, and values (no-references). The Cartesian product with const/non-const qualifier gives 6 possible ways in which we can define function arguments: 1. fun( YourClass v ); // by value 2. fun( const YourClass v ); // const copy. useless?? 3. fun( YourClass & v ); // output parameter 4. fun( YourClass const& v ); // sort of by value, bot no copying 5. fun( YourClass && v ); // temporary that I can change 6. fun( YourClass const&& v ); // useless – #4 would do
Numbers 2 and 4 are probably useless, but it still leaves us with four. Even if we do not want to talk about rval refs right now it leaves us with three, which is one too much. Having spent a number of years programming in C++ I do not find in strange any more, but if you look at it from a new commer perspective there should be only two: either I will change your object, or not. I think Andrei Alexandrescu pointed that out somewhere in the discussion groups. It could be only #1 (interested in value) and #3 (interested in an object – memory location). The other two (4 and 5) are just for performance tweeks, aren’t they? Well, #5 is also about unique ownership, but half of it’s job is still performance, isn’t it? It is troublesome that you have four choices, and after your article, it is clear that it is not clear which one to choose. We may think taht we optimize, but in fact we inadvertently pessimize. If we have only two options and teh support of copy elision, move semantics, and perhaps something newer and even more powerful, we could just write:
fun( set<vector> data );
and be sure that we never add any slowdown. I am not surewhat point I am trying to make here, but Im pretty sure I want to make some point. The two functions below have different argument type. Trying to pick one when matching overload candidates would be ambiguous, but they are still two types. 1. fun( YourClass v ); 2. fun( YourClass const& v ); Why do we need #2? Because it is sometimes faster. Why do we need #1? Not sure. Because it is sometimes faster? Perhaps we do not need #1 at all? If we discard it we can change the syntax of #2 to
fun( YourClass v );
This is the same as #1 used to be, but since we discarded it there is no ambiguity. If there is some programmer’s knowlege required to perform optimization, shouldn’t it rather be provided via attributes:
fun( YourClass [[copy]] v );
But do we need even that? Is the compiler simply not smarter that us?
(4) Value semantics. It has a great suppot in C++, e.g. in form of a implicitly defined copy constructor/assignment. A couple of things were suggested to the Standards Committe to make it even better. I just wanted to list them here: 1. Implicitly defined comparison operator (that is a logical conjunction of member-wise comparisons). It was mentioned in N2326, but never really proposed 2. The definition of “the same”. In N2479. It has a status “Outstanding issues” – not sure what it means. 3. Not generating copy operations implicitly for classes with non-trivial destructors. Proposed in N2904. No idea what its status is.
Regards.
(Quote)Heh, just a couple?
This is a whole article unto itself! Thanks for your contribution; I may have to respond in pieces.
An assignment, I think.
A copy and two assignments, but…
…point taken.
I’ll try to get to the rest of your material soon, but let me say now that a lot of what you’ve written sounds a lot like thoughts I‘ve been having lately. I ask the question this way: what would a language that was designed to support mutable value semantics look like?
(Quote)(1) That’s why it’s customary to use the member swap in the copy-and-swap idiom:
T& operator =(T t) { t.swap(*this); return *this; }
No chance of infinite recursion, unless you implemented member swap in the canonical way, and that would be stupid.
(2) Const rvalue references are useless to the programmer and should never appear written in a program. (They can appear through template deduction.) Const by-value arguments are also useless, so as you say, we’re down to 4 variants. Let’s leave rvalue references out for the moment. We deal with by-ref, by-val and by-const-ref. Combining the usual C++ wisdom with this article pretty much leads to these guidelines: - Unconditionally use by-ref is for out or in-out parameters. However, reconsider whether you need out parameters, because returning might be just as efficient. - Unconditionally use by-val for arguments that are cheap to copy (primitives) and by-const-ref for arguments that are not copyable. - This leaves arguments that are copyable, but it is expensive to do so. This article essentially says that you should pass these by-val iff you plan to modify them inside the function, but don’t want the modifications to be visible outside. The downside of this approach is that you leak an implementation detail into the interface: if you have a traditional assignment operator, you should pass the argument by-const-ref, but if you convert it to a copy-and-swap assignment operator, you should change it to by-val.
(Quote)Your guidelines are clear and fine. But what I was writng about was more fantasizing how a more perfect language could look like. In fact, one aspect could be achieved only by even more advanced compiler optimization technique. As I have little (i.e. none at all) familiarity with compilers, I may still be really fantasizing, but just consider:
FatCopy fc = prepare(); read1( fc ); read2( fc );
Two read functions do not alter the parameter; we are used to declaring:
void read1( FatCopy const & fc );
This is in order to avoid copying. This, in turn, is because we are used to think that:
void read1( FatCopy fc );
means copying. The way I have been taught C++, one thinks that the above line means passing data by copying, unless copy elision is employed. How about changing our thinking to “passing data by value”: there will be no copying, unless the function really changes value fc and copying is really unavoidable. The compiler will decide to use your copy constructor only if necessary, and in situations where you would need a copy anyways. This could be achieved as follows: The compiler always compiles
void read1( FatCopy fc );
as passing by reference, and if it finds that fc is modified by read1, it marks it “somehow” this fact in that function’s meta data, and later, it adds a copy at the call site while linking, so the calling function would be compiled to:
FatCopy fc = prepare(); FatCopy __copy = fc; read1( __copy ); read2( fc );
The copy is only if read1 modifies fc. Otherwise there is no copy whatsoever.
Regards.
(Quote)Maybe, but it’s probably not as scary as you think. Outside
(Quote)namespace std, an unqualified call toswapwon’t findstd::swapunless you’ve explicitly brought it into scope with a using-declaration.Dave, et al. Thanks much for this site and the material so far.
(Quote)Nice to hear from you, Brad! And, you’re most welcome. Thanks for being a part of what has been a very gratifying response so far.
(Quote)One consideration about the following:
In principle, the fact that the function makes a copy is an implementation detail, which is invisible if you use
const vector&. When you switch to pass-by-value, you are effectively exposing your implementation to the interface of the function, so in case you later come up with another solution that doesn’t involve copying, you’ll switch to reference with the obvious signature change, that at least will require your clients to recompile (fortunately, their code won’t change unless they are using the signature explicitly, say as a function pointer). Without this, they would just replace .so/.dll with the new version.It’s not a major problem, just a point to consider. After all, almost no explicit optimization comes at zero price. (Btw, if elimination of reference-bound temporaries was allowed, it would work in this case as well!)
(Quote)First, yeah this is just another manifestation of the code size issue explained in this comment.
However #1, I have a hard time seeing a switch to pass-by-value as exposing an implementation detail. I’m not sure why; it may just be a gut reaction, but my orientation is toward pass-by-value as a default, and pass-by-reference as an optimization. Nothing obliges the function to modify or steal resources from the copied value, and the copy ought to be side-effect-free. Okay, I’ll admit I’m flailing about in the dark hoping to hit the right explanation for why it’s not an implementation detail. Let me give that some thought.
However #2, I totally disagree that legalizing your optimization—which I won’t even call “elimination of reference-bound temporaries” because you surely would not want that to be allowed in all cases—would help with the recompilation problem. It’s like I said earlier: your optimization depends on being able to see inside the called function, which is in conflict with the separate compilation model.
(Quote)Hi, Your “However #1″ is somehow very inspiring. When I type the function:
int double1( int const& i ) { return 2*i; }
and then change it to:
int double2( int i ) { return 2*i; }
No-one will say taht I exposed any imlementation detail. I just want a value. In fact I would probably never write double1, because double2 is so natural: “just give me this integral value”. double1, on the other hand says: “I will take it by reference”. Imagine a function call:
return double2(5);
Who is interested in knowing that you will have yet another name to refer to the 5? I just want to double it. Now, ‘const’ means “I will not try to change it, so pas me literals and temporaries as well”. This is “weird” too. I wasn’t asking whether you would be changing it or not, I just wanted to double the number, but now I have to puzzle about whether someone will mutate my value or not. And in fact “int const&” somehow isn’t as much mutation-proof as “int” alone. You can cast away const, but you cannot cast the copy back onto the original.
Also the change from double1 to double2 only exposes a copy operation (and the destructor). Not any other function. Copy operation is special to that extent that compilers implement it for you for your own classes. Function:
void fun( YourClass cc );
Doesn’t expose any function of YourClass. It simply requires a value. Well, I know it is just some loose thoughts.
Regards.
(Quote)It seems you’re going to “blogify” the “RValue Reference 101″ article which is nice because it is currently inaccessible to users who don’t have a trac account at boostpro.
Cheers! Sebastian
(Quote)It started with that article, but it’s being extensively revised and expanded.
(Quote)Hi Dave,
Could you please comment on this note in the Standard (12.8/15) “when a temporary class object that has not been bound to a reference (12.2) …” Why there is this “has not been bound to a reference” constraint? Consider this (no idea how to get syntax-highlighted code here, a “how to” link would be great):
struct string { const char* str; string(const char* str) : str(str) { puts("constructor"); } string(const string& other) : str(other.str) { puts("copy constructor"); } }; struct holder_1 { string s; holder_1(const string& s) : s(s) {} }; int main() { holder_1("some string"); }Due to the constraint,
holder_1::scan’t be initialized directly from “some string”, and a temporary string will always be created.And this constraint appears to be in the latest draft as well.
Thanks
(Quote)Please see the new “posting” tab.
I don’t think I agree with your analysis of this example. The type of
(Quote)"some string"ischar[12], and it must be converted to astringin order to match the signature ofholder_1’s ctor in line 7 long beforeholder_1::sis initialized. There’s no opportunity to use the ctor in line 4 as far as I can tell. Am I missing something?Well, to me, copyctor elision is (conceptually) something like “get rid of creating a temporary if it’s being used only to initialize an object of the same type” (please correct me right here if I’m wrong)? and the wording in the standard is, well, just wording, which is subject to change (e.g. there are two more allowed cases in the latest C++0x draft comparing to 2003). To do the elision, compiler should look ahead anyway? to check the usage of the temporary. E.g. here:
string(const char*)->string(const string&), but the compiler “looks a bit further and sees” that the copyctor initializes from a temporary, and gets rid of it (of course, making all necessary checks about copyctor availability).I don’t see why it can’t be done here as well, as the compiler has all code, everything is inline etc. It’s an optimization, and the compiler should be clever enough to look ahead (and we know they are in many cases, like link-time optimization). But here it’s simply forbidden by the standard because the temporary has been bound to a reference. That’s why I’m asking why do we have this constraint.
(Quote)OK, I understand what you’re asking, though I’m not at all sure that eliminating the text in question would be enough to allow that particular optimization. For what it’s worth, your idea is in a completely different class from today’s copy elision, because the current optimizations only “look a bit further” at the call site, and don’t require looking inside the callee as yours does, which in principle is possible though it may be too late by link time, practically speaking (too much high-level information missing by then).
I don’t really know why we have that constraint (“see D&E” is my stock answer), but if I had to guess I’d say it was there to prevent lvalues from being mutated in scenarios like
If line 5 mutated
(Quote)athat would be pretty surprising.I read somewhere that it is because the committee agreed (for C++98) that some use-cases exist where monitoring the number of copies is useful. I’m completely opposed to this; the programmer that needs to monitor the number of copies created also needs to know the strange cases where the copy is allowed to be avoided.
(Quote)I would be very surprised if the particular optimization Maxim was asking for was ever considered, and even more surprised if it were ruled out on those grounds. That seems completely inconsistent with the intent of copy elision.
(Quote)Yes sorry Dave, I misread it.
However, Alex Stepanov in his notes and his latest book smartly points that if we could force construction, copying and equality to retain their expected semantics, the compiler could apply that optimization.
The committee, prefering freedom, didn’t enforce any semantics. It means that
struct s { s(const char*); s(const s&); };
bool operator==(const s&, const s&);
s a1 = “123″; s a2 = a1;
(a1 == a2); // could be false, the programmer could rely on it being false!
then, we simply aren’t allowed to rewrite s a2 = a1; to s a2 = “123″;
(Quote)I think you mean “the programmer couldn’t rely on it being true!” In any case, yes, Elements of Programming and regular types will definitely be topics in upcoming articles.
(Quote)a1 is not a temporary here, so it doesn’t apply. Here is my conceptual understanding for the copy elision:
In your example a1 is also used later in comparison (and it’s not a temporary object at all), so it can’t be eliminated.
Rodrigo, Dave, do you agree with this conceptual definition for the copy elision?
(Quote)In any case yes.
One thing I hate from C++ is the copy-ellision rules. It would be easier if the problem you point were surely caused by a weak compiler instead of a C++ rule.
While I’m not completely sure if your optimization is currently allowed in C++, for me it makes sense.
Btw my point was that a copy from an object could be different from an object created using the same initializer from the source.
(Quote)Well, I mentioned link time just to emphasize the power of present day optimizers. For the code in question, all analysis can be done in compile time. It’s optimization, which is never mandatory, so we should be ready for the cases when it doesn’t work (e.g. it falls to link time).
It looks like you have meant
const X& a = produce();, right? Something strange with the markup.Well (all the latter is not strictly according to the standard definitions),
ais not a “real temporary” here: the object returned byproduceis, but when you bind it to a reference that extends its lifetime, it’s not a temporary anymore in terms of elision, as you explicitly say “I want this object to live beyond the full expression it was created in”. OTOH, it applies to all other references as well (e.g. references in parameters), so probably they should be also subject to elision.As long as
ais used (and only used) to initialize a temporary (I believe this is the necessary precondition for the elision) in theconsume’s parameter, I see no problem with theaoptimized out.But this should probably also mean that we don’t rely on the destructor of
X(obvious usage is the Guard idiom).I need to think more about it, I still don’t have clear picture.
(Quote)I think there is another version of that. Consider:
void f() { try { prolly_failing(); } catch(Exception &e) { Exception e1(e); e1.modify(); // oops, e is modified potentially } }Since exception objects are temporaries too, the restriction about reference binding makes the irritating case of above not possible.
(Quote)Hi there. A similar example where we would mutate a temporary that has got a name:
void f() { try { dangerous(); } catch(Exception &e) { // (seemingly) copy it then (seemingly) modify // only the copy Exception copy(e); copy.enjoy(); throw; } }We would accidentally modify the exception object. I think the restriction of the reference binding will generally make it so we can’t refer to the temporary object a second time.
(Quote)Well I was trying to have a function like the sort in the article and see what happens when I call it. The memory allocation is here so i have costly things going on when copy ellision is not done.
Basically, is there a 2-3 classes examples that when compile tells : look here ellision, there no ellision.
(Quote)Try this example, whipped up at home with loving hands.
(Quote)Congrats to the cook then. This is utterly perfect. Tested with gcc 4.3 I only miss 1 ellision in the “Return rvalue passed by value” case. On 4.1, all ellisions fail. Gotta text MSVC.
Thansk again
(Quote)Ok. So I wanted to see if I can “checj” this things work with my current g++. So, I wrote that : http://codepad.org/craBkxXL
The output is, for gcc 4.3 : non-const call A new : 4 A delete : 6 A copy : 3
non-const call B new : 4 B delete : 6 B copy : 3
The non-const call is there to “emulate” the sort function from the article.
So, does this means the copy-stuff is done or do I don’t call the proper thing and, hence, not trigger the mecanism or is gcc 4.3 not copy-ellision aware (which I doubt) ?
I think I’m just not doing the correct thing to check this. So how should I butcher this so I can validate the use of copy ellision and show this to unbeliever co-worker ?
(Quote)Joel, it’s a little hard to tell what you’re trying to demonstrate with this example. It looks vaguely as though you’re trying to show something about const vs. non-const member functions, but that distinction doesn’t make any difference to copy elision, so maybe I’m misunderstanding. Try cutting it down to the absolute minimum (e.g. remove all that memory allocation stuff) and if you’re testing several different things, separate those tests as well.
(Quote)Very interesting article !
Sure, copy elision really mess with our C++ programmer’s “common sense”. It’s quite surprising to find a better way of writing assignment operator after all these years. There is certainly a lot of textbook to correct.
However, exploiting copy elision further than that, especially for return value, seems a bit fragile to me : 1) It’s hard to check if RVOs really take place, unless adding I/O in constructors. 2) Turn on debug mode and all these nice copy elisions disappear. (at least with MSVC)
“we have all the background we need to attack move semantics, rvalue references, perfect forwarding, and more as we continue this article series. See you soon!”
Great ! I’m really looking forward to seeing that. There is a lot of resource on move semantic out there, but it’s still very confusing to me. For example, I’ve yet to see a straightforward explanation of what is the correct and efficient way of writing functions like “sorted” in presence of move semantic.
std::vector//&& ? sorted(std::vector/&& ?/ names) { std::sort(names); return /std::move ?/names; }(Quote)Any side-effect will do; you can increment a counter.
I just did a quick test with the MSVC 2010 beta, and for that version you’re half right (elisions of arguments passed by value still happen in debug mode) update: see below . That’s a bit surprising, actually: a copied return value doesn’t help make debugging much easier (especially when argument copies are still getting elided), is likely to be misleading for those people looking for release-mode performance, could actually make debug-mode performance unacceptable even for testing, and it doesn’t take significantly more compile-time resources to do the elision. Once you implement RVO it seems like more work to leave a branch in the compiler where it’s disabled.
It is true that copy elision isn’t guaranteed by the standard, so vendors are free not to implement it, or to turn it off depending on compiler options. But is “fragile” really the right word? It’s not as though any vendor who implements copy elision can afford to break it in their next release. Also, given the lack of standard guarantees, I don’t see why you’d be more worried about exploiting return value elisions than argument copy elisions.
(Quote)Actually, I take it back; you’re a quarter right (sorry)! MSVC implements the URVO in debug mode (it will return an unnamed temporary without copying) but not the NRVO. Silly compiler; I’ve submitted a bug report, FWIW.
(Quote)And… it turns out not to be worth much at all: http://tinyurl.com/silly-compiler
(Quote)Hi Dave,
Thanks for the great article. I however have a slightly deeper question regarding rvalues, move semantics, and copy elision:
How do you ensure that the object passed by value is definitely copied and the copy is not elided by the compiler? Also, how then do you make sure that you really do have a copied object instead of a move-constructed object as a function parameter?
void foo(T t); // how to make sure t is copy constructed, instead of move-constructed?
(Quote)If you want to be sure to avoid copy elision (why would you want to do that?) then you need to pass an lvalue. Copies of lvalue function arguments are never elided.
Well, we haven’t even touched move construction yet, but as long as you’re bringing it up here, again I wonder why you’d want to do that? The answer is the same: pass an lvalue.
(Quote)Very interesting article. I tend to “const &” by default and this article reminds me that this is not as optimal as I thought.
(Quote)I’m intrigued on how we can check, for a given compiler at had, if it’s actually doing copy ellision and generate the proper code when we’re using such idioms. I guess adding I/O in the constructor don’t help as it’ll force the ctor to be called. Do you just check the assmebly output or what ?
Moreover, does this means that we can completely and forever dump the old pass-by-const-reference & copy operator= ?
(Quote)Actually, no! Copy elision is explicitly allowed even if the copy constructor and destructor have side-effects. I/O is a great way to see these effects in action.
Actually, yes! Dump it yesterday.
(Quote)Wow. I’ll give it a shot.Be sure i’ll update my C++ lessons to incorporate this
(Quote)Whoa, seriously?! Huge regression, if so (4.0 elides as expected). Are you sure you don’t have -fno-elide-constructors in the spec file or something?
(Quote)well i just did g++-4.1 ellide.cpp -o ellide I’ll check with 4.2 and 4.4
(Quote)So did you check the spec file? BTW, elision works with 4.3
(Quote)Could you elaborate further on this paragraph?
“First, when you pass parameters by reference and copy in the function body, the copy constructor is called from one central location. However, when you pass parameters by value, the compiler generates calls to the copy constructor at the site of each call where lvalue arguments are passed. If the function will be called from many places and code size or locality are serious considerations for your application, it could have a real effect.”
I don’t quite follow it. Good article though, thanks!
(Quote)@dvi: first, you’re most welcome, and thanks for asking for clarification; it helps to know when I fail to connect.
So here,
ftakes its argument by value, and does whatever it does… but it doesn’t copyabecause (copy elision aside) it already has a copy of whatever was actually passed.In this case, the compiler has to generate calls to
X’s copy constructor in the body ofgandhat lines 5 and 10. That’s a total of two calls. Probably not a big deal, but in some embedded applications, for example, there’s limited space available for code.Now compare with what happens when
ftakes its argument by reference and copies it:Now there’s just one call to
X’s copy constructor, in the body off. The exact same definitions ofgandhstill work, but callingfno longer involves copying at the call site.Does that help?
(Quote)Ah yes that makes it clear. Thanks again
(Quote)