Hi everybody. The bulk of this article is going to be a C++ Standards Committee document, but I’m posting it here because of its relevance to the article series on rvalue references we’ve been running here at C++Next. I’ve been promising a followup to that series, and having this article on the site is actually going to be useful when I write that followup.
But first, behold! The latest issue with rvalue references and move semantics…
The final version of this paper, as submitted to the committee, is N3153.
What is Implicit Move?
In an earlier
article, we
described a surprising effect that occurs when “legacy” C++03 types are
combined in the same object with move-enabled C++0x types: the combined
type could acquire a throwing move constructor. At that time, we didn’t
have a way to implement vector::push_back and a few other important
strong-guarantee operations in the presence of a throwing move.
Several independent efforts were made to deal with this problem. One approach, while not a complete solution, shrank the problem space considerably and had many other benefits: implicitly generate move constructors and move assignment operators when not supplied by the user, in much the same way that we do for copy constructors and copy assignment operators. That is the implicit move feature, and it was voted into the “working paper” (draft standard) this past spring.
Further Background
Another effort, spurred on by Rani Sharoni’s comments at C++Next, finally yielded a complete, standalone solution to the problem, a form of which was eventually also adopted by the committee. Nobody attempted to “repeal” the implicit generation of move operations from the standard document, not least because of those “many other benefits” I alluded to earlier. But now we have a problem that we hadn’t anticipated.
The Problem
Back in August, Scott Meyers
posted
to comp.lang.c++ about a problem where implicit generation of move
constructors could break C++03 class invariants. For example, the
following valid C++03 program would be broken under the current C++0x
rules (and indeed, is broken in g++4.6 with --std=c++0x) by the
implicit generation of a move constructor:

    #define _GLIBCXX_DEBUG
    #include <iostream>
    #include <vector>

    struct X
    {
        // invariant: v.size() == 5
        X() : v(5) {}

        ~X()
        {
            std::cout << v[0] << std::endl;
        }

     private:
        std::vector<int> v;
    };

    int main()
    {
        std::vector<X> y;
        y.push_back(X()); // X() rvalue: copied in C++03, moved in C++0x
    }

The key problem here is that in C++03, X had an invariant that its
v member always had 5 elements. X::~X() counted on that
invariant, but the newly-introduced move constructor moved from v,
thereby setting its length to zero.
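
To see the breakage in isolation, here is a minimal sketch that is not part of the committee paper. It assumes the implicitly generated move constructor is a plain memberwise move, and spells that constructor out by hand so the effect does not depend on which generation rules a particular compiler applies. The `size()` accessor and the `moved_from_size` helper are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct X
{
    // invariant (as written for C++03): v.size() == 5
    X() : v(5) {}

    // Hand-written stand-in for the implicitly generated move
    // constructor: a memberwise move and nothing else.
    X(X&& other) : v(std::move(other.v)) {}

    // Hypothetical accessor, added only so the state can be observed.
    std::size_t size() const { return v.size(); }

 private:
    std::vector<int> v;
};

// Returns the size of the member vector in a moved-from X.
std::size_t moved_from_size()
{
    X a;
    assert(a.size() == 5);   // invariant holds after construction

    X b(std::move(a));       // memberwise move steals a's vector
    assert(b.size() == 5);   // the new object looks fine

    return a.size();         // the source no longer has 5 elements
}
```

Calling `moved_from_size()` from `main` yields 0, because the move constructor of `std::vector` leaves its source empty. Any member function of `X` that indexes into `v`, including the destructor in the example above, is then reading past the end.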
Tweak #1: Destructors Suppress Implicit Move
Because rvalues are generally “about to be destroyed,” and the broken
invariant was only detected in X’s destructor, it’s tempting to think
that we can “tweak” the current rules by preventing the generation of
implicit move constructors when a user-defined destructor is present.
However, the following example would still break (and breaks under
g++4.6 with --std=c++0x):

    #define _GLIBCXX_DEBUG
    #include <algorithm>
    #include <iostream>
    #include <vector>

    struct Y
    {
        // invariant: values.size() > 0
        Y() : values(1, i++) {}
        Y(int n) : values(1,n) {}

        bool operator==(Y const& rhs) const
        {
            return this->values == rhs.values;
        }

        int operator[](unsigned i) const
        { return values[i]; }

     private:
        static int i;
        std::vector<int> values;
    };

    int Y::i = 0;

    int main()
    {
        Y ys[10];
        std::remove(&ys[0], &ys[0]+10, Y(5));
        std::cout << ys[9][0];
    }

In C++03, there’s no way to create a Y with a zero-length values
member, but because std::remove is allowed to use move operations in
C++0x, it can leave a moved-from Y at the end of the array, and that
could be empty, causing undefined behavior in the last line.
std::remove in these examples
Tweak #2: Constructors Suppress Implicit Move
It’s also tempting to think that at least in classes without a user-defined constructor, we could safely conclude that there’s no intention to maintain an invariant, but that reasoning, too, is flawed:

    #define _GLIBCXX_DEBUG
    #include <algorithm>
    #include <iostream>
    #include <vector>

    // An always-initialized wrapper for unsigned int
    struct Number
    {
        Number(unsigned x = 0) : value(x) {}
        operator unsigned() const { return value; }
     private:
        unsigned value;
    };

    struct Y
    {
        // Invariant: length == values.size().  Default ctor is fine.

        // Maintains the invariant
        void resize(unsigned n)
        {
            std::vector<int> s(n);
            swap(s,values);
            length = Number(n);
        }

        bool operator==(Y const& rhs) const
        {
            return this->values == rhs.values;
        }

        friend std::ostream& operator<<(std::ostream& s, Y const& a)
        {
            // Write to the stream we were given, not to std::cout
            for (unsigned i = 0; i < a.length; ++i)
                s << a.values[i] << " ";
            return s;
        }

     private:
        std::vector<int> values;
        Number length;
    };

    int main()
    {
        std::vector<Y> z(1, Y());

        Y a;
        a.resize(2);
        z.push_back(a);

        std::remove(z.begin(), z.end(), Y());
        std::cout << z[1] << std::endl;
    }

In this case, the invariant that length == values.size() was
established by the well-understood default-construction behavior of
subobjects, but implicit move generation has violated it.
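
The following sketch isolates the violation without going through std::remove. It is an illustration rather than part of the paper, and it assumes a compiler following the rules as finally published in C++11, under which a class like this Y (no user-declared copy operations, move operations, or destructor) does get an implicit move constructor. The accessors `recorded_length` and `actual_size` and the helper `invariant_survives_move` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// An always-initialized wrapper for unsigned int
struct Number
{
    Number(unsigned x = 0) : value(x) {}
    operator unsigned() const { return value; }
 private:
    unsigned value;
};

struct Y
{
    // Invariant: length == values.size()
    void resize(unsigned n)
    {
        std::vector<int> s(n);
        values.swap(s);
        length = Number(n);
    }

    // Hypothetical accessors, added only to observe both halves
    // of the invariant separately.
    unsigned recorded_length() const { return length; }
    std::size_t actual_size() const { return values.size(); }

 private:
    std::vector<int> values;
    Number length;
};

// Reports whether a moved-from Y still satisfies its invariant.
bool invariant_survives_move()
{
    Y a;
    a.resize(2);

    Y b(std::move(a));   // implicit, memberwise move constructor
    assert(b.recorded_length() == 2 && b.actual_size() == 2);

    // Moving the vector empties it; "moving" the Number just copies
    // the unsigned, so a.length is still 2.
    return a.recorded_length() == a.actual_size();
}
```

Here `invariant_survives_move()` returns false: the moved-from object reports a length of 2 while holding an empty vector, which is exactly the state the stream operator in the example above would walk off the end of.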
Tweak #3: Private Members Suppress Implicit Move
It’s also tempting to think that we can use private members to
indicate that an invariant needs to be preserved; that a “C-style
struct” is not encapsulated and has no need of protection. But the
members of a privately-inherited struct are effectively encapsulated
and private with respect to the derived class. It is not uncommon to
see members moved into an implementation detail struct, which is then
used as a base class:

    // Modified fragment of previous example.  Replace the definition
    // of Y with this code.
    namespace detail
    {
        struct Y_impl
        {
            std::vector<int> values;
            Number length;
        };
    }

    struct Y : private detail::Y_impl
    {
        void resize(unsigned n)
        {
            std::vector<int> s(n);
            swap(s,values);
            length = Number(n);
        }

        bool operator==(Y const& rhs) const
        {
            return this->values == rhs.values;
        }

        friend std::ostream& operator<<(std::ostream& s, Y const& a)
        {
            // Write to the stream we were given, not to std::cout
            for (unsigned i = 0; i < a.length; ++i)
                s << a.values[i] << " ";
            return s;
        }
    };

Real-World Examples
Of course these examples are all somewhat contrived, but they are not
unrealistic. We’ve already found classes in the (new) standard
library—std::piecewise_linear_distribution::param_type and
std::piecewise_linear_distribution—that have been implemented in
exactly such a way as to expose the same problems. In particular,
they were shipped with g++4.5, which had no implicit move, and not
updated for g++4.6, which did. Thus they were broken by the
introduction of implicitly-generated move constructors.
Summary
I actually like implicit move. It would be a very good idea in a new language, where we didn’t have legacy code to consider. Unfortunately, it breaks fairly pedestrian-looking C++03 examples. We could continue to explore tweaks to the rules for implicit move generation, but each tweak we need to make eliminates implicit move for another category of types where it could have been useful, and weakens confidence that we have analyzed the situation correctly. And it’s very late in the standardization process to tolerate such uncertainty.
Conclusions
It is time to remove implicitly-generated move operations from the
draft. That suggestion may seem radical, but implicit move was
proposed
very late in the process on the premise that it “treated…the root
cause” of the exception-safety issues revealed in
N2855.
However, it did not treat those causes: we still needed
noexcept. Therefore, implicitly-generated move operations can be
removed without fundamentally undermining the usefulness or safety of
rvalue references.
The default semantics of the proposed implicit move operations are
still quite useful and commonly-needed. Therefore, while removing
implicit generation, we should retain the ability to produce those
semantics with “= default.” It would also be nice if the rules
allowed a more concise way to say “give me all the defaults for move
and copy assignment,” but this paper offers no such proposal.
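
As a rough illustration of what asking for the defaults explicitly looks like, here is a sketch with a hypothetical `Record` type (the type and the two helper functions are made up for this example and do not come from the paper). All four copy and move operations are requested with “= default,” so the memberwise semantics are opted into by the class author rather than generated silently.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type used only for illustration.
struct Record
{
    Record() = default;
    explicit Record(std::vector<int> v) : items(std::move(v)) {}

    // Explicitly requested memberwise copy and move.
    Record(Record const&) = default;
    Record& operator=(Record const&) = default;
    Record(Record&&) = default;
    Record& operator=(Record&&) = default;

    std::vector<int> items;
};

// Move construction hands the vector over and leaves the source empty.
bool move_transfers_contents()
{
    Record a(std::vector<int>(3, 7));
    Record b(std::move(a));
    return b.items.size() == 3 && a.items.empty();
}

// Copy construction leaves the source untouched.
bool copy_preserves_source()
{
    Record a(std::vector<int>(3, 7));
    Record b(a);
    return b.items.size() == 3 && a.items.size() == 3;
}
```

Both helpers return true. The point of the design is only that the author of `Record` has looked at the class and decided that an emptied `items` is an acceptable state; that is a decision no rule for implicit generation can make on the author's behalf.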
Be Back Soon
Well, that concludes the committee paper. For all you C++Next’ers out there, I promise (again!) to put this proposal in context of our rvalue references series, Real Soon Now.
In C++03, std::remove eliminates values from a range by assigning
over them. Since it can’t actually change sequence structure, it
assigns over the unwanted elements with values from later in the
sequence, pushing everything toward the front until there’s a subrange
containing only what’s desired, and returns the new end
iterator of that subrange. For example, after removing 0 from the
sequence 0 1 2 0 5, we’d end up with 1 2 5, and then 0 5—the last
two elements of the sequence would be unchanged.
In C++0x, we have move semantics, and std::remove has permission to
use move assignment. So in C++0x, we’d end up with 1 2 5 0 x at the
end of the sequence, where x is the value left over after moving from
the last element—if the elements are ints, that would be 5, but if
they are BigNums, it could be anything.
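
For readers who want to poke at this, the behavior described above can be sketched as follows. The helper name `keep_nonzero` is hypothetical. Note that only the front of the sequence and the returned iterator are pinned down; the sketch deliberately makes no claim about the values left in the tail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Removes zeros "in place" and returns how many elements were kept.
// The container itself is not shortened: std::remove cannot change
// the sequence structure, it only rearranges values.
std::size_t keep_nonzero(std::vector<int>& seq)
{
    std::vector<int>::iterator new_end =
        std::remove(seq.begin(), seq.end(), 0);

    // [seq.begin(), new_end) holds the kept values in their original
    // order. Whatever sits in [new_end, seq.end()) is left over.
    return static_cast<std::size_t>(new_end - seq.begin());
}
```

Starting from 0 1 2 0 5, `keep_nonzero` returns 3, the first three elements are 1 2 5, and the vector still has five elements. Code that goes on to read the last two is where the C++03 and C++0x behaviors can diverge for user-defined element types.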
There’s another way moved-from values can be exposed to C++03 code
running under C++0x: an algorithm such as sort can throw an exception
while shuffling elements, and you can then observe a state where not
everything has been moved back into place. Showing that just makes for
more complicated examples, however.

@Dave – Since you linked this example on April 3rd 2012 … is this still something you think might be changed in a future standard? (I mean, we do have implicit move in the current C++11 now, do we?)
Has anyone come up with a sane compiler-warning regarding this issue?
cheers, Martin
We do have it in C++11. I don’t think it’ll be changed—the damage has been done; can’t un-ring that bell. I don’t know about the compiler warning; I haven’t seen one
Note: real-world manifestation of this issue can be observed at http://j.mp/implicit-move-bites-man
(I know it is a very late comment, but I’ll leave it here anyway) All this happens due to one blindingly obvious problem — C++11 move semantic is not move semantic. It is a move-and-init semantic! I was really disappointed when I realized that there is no move semantic in upcoming C++ standard (and likely will never be, thanks to rvalue refs). All these move constructors not only move data, they also initialize source. It is quite trivial (and desirable) to implicitly generate ‘move’ part, but it is impossible to generate ‘init’ part without any knowledge about object’s invariant.
Ideal solution should be to implement proper move semantic, but it is likely not possible — C++ is essentially a language for building stack machine that requires each object on stack to be ‘alive’ until it is destroyed due to execution leaving the scope (or stack unwinding). Therefore correct solution would be to introduce new type of constructor — init constructor, which will create an instance of object (without violating invariant and without throwing). Implicitly generated move-init constructor will use init constructor for ‘init’ part. If init ctor is unavailable — this will suppress generation of move ctor, we could declare that init ctor is present if default ctor is present and declared nothrow, and etc. All problems mentioned in this article go away once you add concept of init ctors into the language.
I’m personally not convinced it’s that simple, nor that your idea of move independent from initialization makes any sense…but it still might. I think you’d need more room than comments on this blog offer to explain it fully, though. If you write an article somewhere, I’ll certainly read it
nice try, but I am really not interested in becoming a famous author at this moment
Idea is pretty simple: move ctor takes src as const& and constructs a new object; before move ctor completes object is considered ‘partially moved’, after it is complete — target become constructed and src becomes invalid and needs to be init-ed to some valid state via init ctor. Hopefully compiler can recognize cases when init call can be skipped and programmer has ways to forbid or enforce init ctor call. Exceptions make this bit more complicated, but it is still manageable.
You know my email, ask me specific questions if smth does not make any sense.
Btw, I do not like implicitly generated ctors/etc. I’d rather ask for them explicitly.
Sounds like what you’re describing basically amounts to the “destructive move semantics” idea, which has been proposed many times (even in this thread). Among those of us actually designing the feature, nobody agreed with your assessment that it is “manageable.” I would want to see a rigorous explanation of how things like this work without introducing unreasonable inefficiencies:
I saw this paper before, but it does not look convincing to me. I see nothing wrong with partially moved state (equivalent of ‘lame-duck’ state mentioned in paper) — it is the same thing as ‘partially constructed’ state used in cctor (copy constructor). But you are missing main point of my argument — I do not argue for ‘destructive move’, it is probably not possible to implement efficiently (as I noted in ‘C++ is a language for building stack machines’ remark). Idea is to recognize that our ‘move’ is a ‘move-and-init’ operation and separate those on language level, thus giving compiler chance to avoid ‘init’ portion where possible (and giving developer a way to avoid or enforce init ctor call).
About your example with f(vector&) — yes, X’s dtor should be called, but compiler should be allowed to elide it, if it deems it necessary (either by introducing hidden flag on stack or if it could clearly separate execution paths for two cases — one where X is moved, another when it is not). From developer’s perspective, once value is moved out of variable, variable still holds valid value (according to init ctor, unless developer explicitly requested destructive move), but compiler has an option to detect that after move variable is unused and drop init ctor call (if circumstances allow). I.e. it is smth like NRVO — if you properly structure your code, compiler will make it faster by eliding a thing or two.
Dave, if you want detailed discussion — send me a email, commenting blog is quite an inconvenient format.
There is no “partially constructed” state. Either the object’s ctor has completed and it exists, or it hasn’t completed and it doesn’t exist (the fact that you can call member functions on a nonexistent object from its constructor is bizarre, but it doesn’t make the model fall apart).
I have no idea what you mean by “move” and “init” as separate ideas, since you haven’t defined them.
First, whether you are proposing destructive move or not, hidden flags and execution path separation are usually the first things people reach for when sketching out how to handle destructive move. That’s why I think this amounts to the same proposal.
Second, there are plenty of cases for which neither a hidden flag nor execution path separation are feasible without unreasonable inefficiencies. Consider, for example that a program may move at random from a
vector<X>.
Sorry for the inconvenience, but I want to keep discussion where the community can benefit. Thanks for posting!
Then why it is mentioned in 15.2p2 of C++ 2003?
I apologize if my explanations were not clear, I’ll try again: - move ctor takes “X const& src” and creates a new value of type X (similar to cctor) - source variable becomes destroyed once target variable becomes fully constructed - since (in general case) we can’t leave it like this, compiler needs a way to construct a new valid value at ‘src’ address — init ctor (that does not throw) - standard grants compiler freedom in eliding ‘init’ step if it could prove that it is not required
What is not clear? You don’t expect me to describe it in a language that could be copy-pasted to standard, do you?
Maybe, call it ‘yet another destructive move idea +’, I do not really care… Original point was that adding init ctor concept solves all problems mentioned in your blog post.
Can’t see any problems… If compiler can’t find easy way to get rid of init+destroy calls — he’ll just leave them in place.
This blog entry is more than 1 year old — there is no community. Maybe move this discussion to comp.std.c++ or comp.lang.c++.moderated?
It may not be huge or hugely active, but there’s a community. I don’t think I’m alone in having learnt a lot about C++11 from both the articles and discussions in the comments on this site. Thanks for contributing.
Can’t believe someone actually reads or tracks comments to 1 year old blog post…
Yeah, who would do that?
Well, they set up a nice global RSS for the whole site…
Dave, you do not want to respond? but what about community? I am sure they are looking forward for us to continue…
Is perhaps the problem not with implicit move, but std::remove? Maybe std::remove should not call std::move but instead call “std::explicit_move” (which does a copy if the move is implicit, though I’m unsure how to implement this). Any existing code (e.g. std::remove) should call std::explicit_move if the “moved from” object has any chance of being accessed again. Things like std::sort should be able to use “std::move”, as there is no chance of the user accessing a “moved from” object after the call.
Making this change, along with destructors suppressing implicit move, are there any other issues with implicit move (particularly, any issues that don’t rely on modified behavior of the STL)?
Considering that the construction of an object is the memory allocation followed by the invocation of its constructor, and then that the destruction is the sequence of the destructor and the memory deallocation, could we imagine that the destructor of an object left after a move operation is not invoked (the memory deallocation remaining to be performed)? After all, the object being moved in memory, why should we consider the leftover object to be a “destructible” thing?
Answer is here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1377.htm#Alternative%20move%20designs So you can forget my comments
Very nicely researched, Bourez! Thanks for answering your own question
What was the argument for allowing move-ops to throw, again?
I have a hard time remembering things that are so utterly meaningless from a technical POV. Or perhaps I’m just getting too old. But anyway, can’t remember that particular mumbo-jumbo.
But sometimes it pops up in discussions, and then all I can say to explain it is that it apparently was some kind of obscure committee politics, so, what was purported technical argument?
Not sure why you call it “meaningless from a technical POV”, and actually it was as close to politics-free as anything we do in the committee ever gets. I’ve been promising to write a follow-up article about this for some time, but until then, the rationale is here.
Referring to the technical reasons as “purported,” when you don’t actually understand what they were, unfairly discredits the good work done by several people in getting us to the right decision.
Swapping could also be used instead of moving, and it would solve the problem of invariants. I wrote a more detailed analysis, if anyone is interested:
http://thegreatlambda.blogspot.com/2010/10/c-move-semantics-alternative-that.html
I don’t understand why you would bother with such an analysis if moving is of no real benefit
Using default-construct plus swap is a well-known idea—it even makes some of the exception-safety issues go away for, e.g., resizing vectors, but
It’s of no real benefit since managing complex data structures by value is not something done often. It is known that the excessive copying will slow down a program, so people took care not to return complex data structures by value or store them by value in containers.
But now everyone seems to want to do that (I don’t), so we are searching for a solution.
I didn’t know that.
Perhaps it’s better to require objects to be default-constructible instead of the opposite: not providing a default constructor will lead to a compile-time error from the compiler, whereas invariants violation will go in silent until the problem manifests itself at run-time.
Furthermore, I don’t think that requiring objects to be default-constructible is such a bad idea. I think that this will create the minimum of problems, i.e. some trivial redesign of classes, in most cases: a properly-designed class that acquires a resource should check if the acquisition is successful, and so it will already do this check in its destructor.
Indeed, but what about assignment? In the case of assignment, a class has to clean up itself before moving the data of the source object to self.
Yeah, if we were designing a language from scratch, that would be an option. The whole point of this article is that implicit move breaks legacy code. If we make it illegal to have no default constructor, we’ll break a whole lot more of it.
Except that delete already checks, so why waste a check?
I guess you never read “Your Next Assignment…”?
Achilleas Margaritis: “It’s of no real benefit since managing complex data structures by value is not something done often. It is known that the excessive copying will slow down a program, so people took care not to return complex data structures by value or store them by value in containers.”
This is a circular argument. When managing complex structures by value is cheap, the barrier to that technique will be removed and people can go ahead and use it. There are countless examples of how this makes programs more succinct and readable.
Explicit is better than implicit.
Defaulting doesn’t seem that onerous, relatively speaking. A set of standard macros would quell many objections (but raise some macro-related concerns):

    struct Name {
        STD_MOVE_CTOR(Name)      // expands to defaulted move ctor
        STD_MOVE_ASSIGN(Name)    // same for op=
        //STD_MOVEABLE(Name)     // combine above two
        // s/MOVE/COPY/ for 3 more
        //STD_COPYMOVEABLE(Name) // for all 6
    };

Except when it results in an onerous amount of boilerplate. I don’t know how often I’ll want the semantics of = default for move construction and assignment, but if it turns out to be common, I know I’ll be using a macro.
Very valid points raised, but this is still just a plain old depressing proposal. Another case of “err-on-the-side-of-caution” inhibiting actually interesting development.
I selfishly still hope to see defaults move constructors/assignment, since the issues raised would not affect me personally, and the loss of the feature already would. Though I also selfishly think that the next-decade worth of code is more important than the decade-prior, so there’s that.
Welcome to standardization and the real world
However, I do hope to start an article series here soon that will be… how shall I say it… more “freeing.”
Is moving that significant? I’ve never seen a case where a complex class instance is returned as a result via the return statement. In most cases, the result complex class instance is passed as an argument to be filled by the function.
The academic community is trying to get away from destructive updates as much as possible, due to being difficult to reason about.
Personally, I’d make STL container classes (since most of the move concept is about returning STL containers) value-type classes that shared their internals via reference counting. Granted, it’s not that good as move from a performance point of view, but it’s way simpler and much easier to reason about (and it wouldn’t require the language to have rvalue references).
Heck, yeah, it does! Oh, did I neglect to post my graphs of Howard’s test results? Hmm… Oh, and his link to the code is down! I’ll try to remedy that and get something posted here ASAP
Exactly. Have you read this article? It’s time to stop being afraid of pass-by-value.
That’s not new, but total immutability can be a very costly paradigm. The underlying machine model does mutation, and once you remove that from the programming language, optimizers are, in general, not smart enough to get the efficiency back by rewriting non-mutating code as mutating. Mutable value semantics, as supported by C++, occupies a very promising middle ground between sharing everything as in Java, and pure functional programming as in Haskell.
It isn’t! They just make convenient demonstrations.
Then you’re talking about total immutability (or they won’t act like value types). And regardless of that, the reference counts mutate, so you need synchronization, which makes that model very bad for multithreaded programs.
Done. Please see the latest posting
That is only if you return complex classes by value.
Nice article, but misleading.
First of all, there is no referential transparency in the move example, because the return value is modified.
Secondly, if you want to create a const vector of strings, it means this code is invoked once in a program, so it’s not a bottleneck that would have to be optimized.
Thirdly, the code didn’t grow 150%, the example code did. In real life situations, declaring local variables to pass as out parameters to functions does not increase the code by …150% (!!!).
Fourthly, you say “we no longer have value semantics”, as if that mattered, but you don’t say why it matters.
Promising for who? the developers? the compiler writers? you are not clear. Does it help the compiler perform better optimizations? does it help the developers reason more easily about the program? I really doubt about the latter.
I am not sure the degree of immutability introduced by c++ move concept can lead to the same optimizations as total immutability. Generally speaking, when something is mutable, the compiler can’t do the best optimizations it can do. I think total immutability leads to better optimizations.
It’s not ‘very bad’, unless you have a thread in your program that continuously modifies reference counts. The atomic increment/decrement overhead in real situations is negligible, and you are going to have it anyway, even in the presence of moving, if you want to use shared pointers and threads.
Personally, I don’t see why moving is of any real benefit. It’s not going to solve an existing problem, is it? nobody in their right minds ever returned big complex data structures by value. So there wasn’t a problem, so what does moving solve for us? nothing, in reality.
The next years are going to be quite interesting. I understand the excitement from the move concept, and I sincerely hope there is no trouble from moving things around. I certainly will hate it when I come back to code after a month and, having forgotten a move somewhere, my code crashes without explanation. I will also hate it if another developer in the project silently introduces a move, thinking that the moved data are not used anywhere else, and then the code crashes again.
The code in the example might be reading the contents of a file on some periodic basis. It might be doing a database lookup to get a dynamic set of names. You cannot know whether such a vector of strings need be created just once. That the vector is const merely means that the code in the function won’t modify it.
Noting that the variation requires an additional line of code would be a legitimate way to express the same thing without risk of hyperbole.
Value semantics make writing efficient code clearer. Naive C++ is inefficient because of the glut of temporaries created. Without value semantics, one must introduce other techniques to eliminate the temporaries such as non-const references (which are unclear from the caller’s perspective), expression templates, etc.
Unsubstantiated.
There are many “real situations” in which manipulations of containers or objects occur frequently and would lead to reference count manipulations. A library writer cannot know client usage patterns a priori and must, therefore, make things as efficient as possible.
That is the likely outcome of implicit move, but shouldn’t be the case for explicit move operations.
It is true that when the compiler can be absolutely sure that no data changes, it generally has an easier time optimizing code that’s written within those constraints. So if you want the best optimization results for a general pure-functional program I’ve no doubt that a compiler for a purely-functional language could produce better results than another. However:
const it doesn’t mean the thing can’t actually be changed, because the const is attached to a pointer or reference and the same object could be referred to as non-const elsewhere.
Three nits about your std::remove pop-up:
The moved from sequence isn’t 1 2 5 0 x, it is 1 2 5 x x
C++03 doesn’t specify that assignment will be used, though that is the common implementation technique. swap could also be used to assign the new elements, and would even be beneficial in cases where swap is cheaper than assignment. Therefore the resultant C++03 sequence is also 1 2 5 x x, but ‘x’ means unspecified value in C++03.
remove isn’t poorly specified (as asserted by Marc), nor is it the only std::algorithm with this characteristic (as implied by Marc). There is also remove_if and unique, all of which “shorten” sequences leaving zero or more unspecified values at the end.
That’s two nits about the pop-up and one beef with Marc
But why would remove ever swap with or move from that 2nd zero? I mean, I get that it’s allowed to, but I’d also like to know what’s realistic.
Sorry, that’s not exactly what I meant. It looked from the article that the problem came more from std::remove giving people too high expectations about the state of the result than from implicit move (“poorly specified” is really about the impression the article gave, not about the standard, sorry if I gave the impression I was attacking anyone’s work (and I am not attacking the article either, I am asking for more)). Thank you for pointing out unique as another example (remove_if and remove count as one to me), I was precisely asking for the number of functions affected with my scope comment. And if it is only these two, I am wondering whether it wouldn’t be better to change their C++0x specification (somehow force them to use copy or swap or non-implicit move) and possibly provide fast alternatives documented with a big warning sign.
Now the throwing case is more worrying. Although I find it strange that elements involved in an algorithm that threw should be allowed to be used for anything besides assignment or destruction.
I understand the goal to avoid for objects to ever be in a state they couldn’t have been in in C++03. It is just that to me, moved-from objects shouldn’t be used anyway, so unless they have some special destructor or assignment, it shouldn’t matter.
Weighing the pros and cons is something you’ve both already done, I am just trying to catch up and expect I’ll end up agreeing with you. Defending the opposite position seems like a good way to get all the arguments in favor, although it’s hard to do without sounding aggressive or disparaging.
Could you be more specific? We aim to please!
There’s no umbrage here, bro; please, keep it up! BTW, exception-safety guarantees for C++03 and move semantics for C++0x were both specifically designed not to create some special zombie state—it turns out you basically never need to do that, and if you do, in the end, you’re only punishing yourself.
Is that really the best solution? I can understand not generating a move constructor when there is a destructor. But the examples with std::remove are not convincing at all. Are there other circumstances where it may fail? After reading this post, the impression I get is that there is one badly specified algorithm in the library that can be abused to cause “bad things”. My reading of the definition of std::remove is that it doesn’t guarantee anything about the extra elements and using them for anything other than destruction is UB. But even if I am wrong, we could just specify that std::remove is only allowed to copy (and create a well documented _move variant).
I am perfectly willing to believe that there are real issues, but this paper fails to show their scope.
There’s a recently-added expandable note just before tweak #2 that explains some of this; have you seen it?
Yes, I had seen it. As far as I understand, the expectation that the sequence ends in 0 5 is not guaranteed by C++03. And if I am wrong it would still be simpler to specify remove as not using moves but only copies.
Now I had somehow missed the last paragraph about throwing during std::sort. I need to think about that, but I am not sure in what kind of state you expect to find your sequence if something (a move constructor? the predicate?) managed to throw during the sort. I can’t find any guarantee in the standard (the only occurrence of “throw” in section 25 is for qsort/bsearch).
Correct. However, once the elements are user-defined types, the library can’t use any operations on them other than those defined in the algorithm requirements, so the values would be, at worst, something within the invariant maintained by the existing copy assignment operator (let’s not discuss abominations like a mutating operator==). So afterwards, those are the only states you can observe, in C++03. In C++0x, you’ll also be able to observe the moved-from state, which might be different.
…and way, way slower.
Don’t worry, the guarantees are in there; I saw to it personally. Again, in C++03, the only states available to elements of user-defined type are those that can be reached through the requirements of the algorithm.
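To make the point about reachable states concrete, here is a minimal sketch. The type Digit and the helper function are hypothetical names invented for illustration. Because Digit declares its own copy operations, it has no implicit move operations, so std::remove can only copy-assign elements. The values left past the new end are unspecified, but each must be a copy of some value that was already in the sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical C++03-style value type. Declaring the copy operations
// suppresses the implicit move operations, so rvalues fall back to copying.
struct Digit {
    int value;
    explicit Digit(int v) : value(v) {}
    Digit(const Digit& other) : value(other.value) {}
    Digit& operator=(const Digit& other) { value = other.value; return *this; }
};

inline bool operator==(const Digit& a, const Digit& b) { return a.value == b.value; }

// Removes the zeros from {1, 0, 2, 0, 3} and checks that every element left
// past the new end still holds one of the original values (0 through 3).
inline bool tail_values_stay_in_original_set() {
    std::vector<Digit> v;
    const int init[] = {1, 0, 2, 0, 3};
    for (int x : init) v.push_back(Digit(x));

    std::vector<Digit>::iterator new_end = std::remove(v.begin(), v.end(), Digit(0));
    if (new_end - v.begin() != 3) return false;   // 1, 2, 3 are kept

    for (std::vector<Digit>::iterator it = new_end; it != v.end(); ++it)
        if (it->value < 0 || it->value > 3) return false;
    return true;
}
```

With an element type that does have move operations, the same tail elements could instead be in a moved-from state, which is exactly the extra observable state being discussed.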
Hi, I believe someone in comp.std.c++ suggested that the default move constructor should be implemented as a swap. This would keep the invariants. Is that not feasible? Or would it create a loop where default move calls swap, and default swap calls move?
Regards, &rzej
It would keep the invariants, provided you had another object (in a good state) to swap into. Remember, construction makes new objects. So then, presumably, you need a default constructor… which may, or may not, exist and set up the required state.
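As a rough sketch of that dependency, consider a move constructor written as "default-construct, then swap". The class Playlist and its invariant are hypothetical, made up for this illustration; the point is only that the technique needs a default constructor that sets up a valid state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class, for illustration only: the invariant is that
// `items` is never empty.
class Playlist {
public:
    Playlist() : items(1, "default") {}   // establishes the invariant
    explicit Playlist(std::vector<std::string> v) : items(std::move(v)) {
        if (items.empty()) items.push_back("default");
    }

    // Move constructor written as "default-construct, then swap".
    // This works only because a default constructor exists.
    Playlist(Playlist&& other) : Playlist() {
        items.swap(other.items);
    }

    bool invariant_holds() const { return !items.empty(); }
    const std::string& first() const { return items.front(); }

private:
    std::vector<std::string> items;
};

// Returns true if the invariant holds for both objects after the move.
inline bool swap_move_keeps_invariant() {
    Playlist a(std::vector<std::string>{"x", "y"});
    Playlist b(std::move(a));
    return a.invariant_holds() && b.invariant_holds() && b.first() == "x";
}
```

If Playlist had no default constructor, there would be no good state to swap into, and the compiler could not generate this kind of move constructor on the class's behalf.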
In my opinion, this isn’t fundamentally a problem with implicit move generation, it’s a weakening of the results from functions that move. We no longer guarantee a valid object will be left, but only that a moved-from object will be left. I personally believe it is fine to say that this may break an invariant in a class that is not expecting it. The only caveat is that if there is a user-declared destructor, this function will be invoked and may rely on an invariant, and thus we should not make any assumptions and so not generate a move constructor.
In other words, I think we don’t need to maintain backwards-compatibility in pathological cases in favor of adding a useful language feature. We can easily enough add specification to algorithms that move that if they can’t move, they copy, and then say “Oh, they changed std::remove so that it doesn’t guarantee valid state past the end of the new array by default. Here are the two lines you need to make it work,” where the two lines delete the move assignment operator and the move constructor.
The fundamental problem is that if you let the compiler write the move constructor, a moved-from object may not be a valid object.
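A small sketch of what that can look like in practice. The class Window and its invariant are hypothetical. The check relies on the moved-from std::vector being empty, which is what mainstream standard library implementations do for move construction.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical C++03-style class: its only constructor establishes the
// invariant "samples always holds exactly 5 elements".
class Window {
public:
    Window() : samples(5, 0) {}
    bool invariant_holds() const { return samples.size() == 5; }

private:
    std::vector<int> samples;   // no user-declared copy, move, or destructor
};

// Moves from a Window using the compiler-generated move constructor and
// reports whether the moved-from object still satisfies the invariant.
inline bool moved_from_keeps_invariant() {
    Window a;
    Window b(std::move(a));     // memberwise move of `samples`
    return b.invariant_holds() && a.invariant_holds();
}
```

No member function of Window ever produces a vector of any other size, yet after the generated move the source object is observably outside the invariant its author wrote the class to maintain.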
That doesn’t account for algorithms (and container member functions that use move). Often the next operation on a moved-from object is assignment, not destruction.
Also, I don’t understand people’s interest in user-declared destructors. Ultimately, the surest (though still imperfect) sign that the author intended an invariant that could be broken by a generated move operation is a user-declared constructor.
Even if you hold that point-of-view (and not everyone does), the problem is that we have very little to go on in making a judgement about whether the cases that would be broken merit the label “pathological.” This close to finalization of the standard, we’ve only just now noticed this issue. Even if you think all the examples in the paper are “pathological” (I don’t), the chances are pretty good that there are non-pathological cases too.
That’s already in the specification.
Some may be comfortable saying that upgrading your working C++03 code to C++0x can introduce “invalid states” (essentially, undefined behavior), but I am not.
If we were to do that, then how would we define the default move-assignment operator? Not as a simple swap of the source and target objects, because the target object might own subobjects that need to be disposed of. Maybe default-construct a temporary object, swap it with the target, then swap the target with the source. Then after the move-assignment the source contains a default-constructed object, the target contains the state that was originally in the source, and the temporary object, which is about to be destructed, contains the state that was originally in the target.
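The scheme described above can be sketched as follows. Label is a hypothetical class, and the move assignment is hand-written here to show the three steps; it is not something the compiler generates.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class used only to illustrate the scheme.
class Label {
public:
    Label() : text("none") {}
    explicit Label(std::string t) : text(std::move(t)) {}

    void swap(Label& other) { text.swap(other.text); }

    Label& operator=(Label&& source) {
        Label temp;          // 1. default-construct a temporary
        temp.swap(*this);    // 2. temp now holds the target's old state
        this->swap(source);  // 3. target takes the source's state;
                             //    source is left default-constructed
        return *this;        // temp's destructor disposes of the old state
    }

    const std::string& get() const { return text; }

private:
    std::string text;
};

// Checks the end state described in the text: the target has the source's
// original state and the source looks default-constructed.
inline bool two_swap_assignment_works() {
    Label target("old"), source("new");
    target = std::move(source);
    return target.get() == "new" && source.get() == "none";
}
```

Like the swap-based move constructor, this depends on the class having a default constructor, and it costs a default construction plus two swaps per assignment.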
Hi Joe,
Have you seen the “canonical assignment” section of this article?
I was the one who “proposed” a tweak that removes implicit moves in case any special member function is user-declared. But even then, your std::remove example would break. Good catch!
In a related matter (still rvalue references) can you point out what bullet point in the draft handles the situation you have if you push_back a string literal on a vector of std::string? Which overload will the compiler pick? On one hand the types are not reference-related and there will be a conversion yielding a temporary string object. On the other hand the initializer (string literal) is an lvalue expression. It seems that situations like these should lead to the push_back(string&&) overload being picked. But that’s not what the current draft seems to dictate.
Sorry Sebastian, I don’t have any special insight into this question. I suggest writing up an issue and posting it at comp.std.c++
Dave: “…I suggest writing up an issue and posting it at comp.std.c++”.
Done! I just thought you might have something interesting to say to it as one of the authors of N2831.
There is another reason that these tweaks are unacceptable, besides the fact that they don’t solve the problem in edge cases: they are extremely unintuitive! Having move constructors be auto-generated except in certain cases is not teachable, not usable, and would just look ridiculous. Make =default work for move constructors and move assignment and never auto-generate them, thus having a simple rule in the language that can be learned.
C++ has far too many edge cases already; it’s one of the biggest problems of the language.
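For readers unfamiliar with the syntax, this is roughly what opting in with =default looks like. Record is a hypothetical class; note that once the move operations are declared, the copy operations have to be requested explicitly as well.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical class that asks for memberwise move explicitly.
struct Record {
    Record() = default;
    Record(const Record&) = default;
    Record& operator=(const Record&) = default;
    Record(Record&&) = default;              // explicit request for memberwise move
    Record& operator=(Record&&) = default;

    std::string name;
    int id = 0;
};

static_assert(std::is_move_constructible<Record>::value,
              "Record should be move-constructible");

// Moves a Record and checks that the destination received the state.
inline bool defaulted_move_transfers_state() {
    const std::string long_name = "a name long enough to need heap storage";
    Record a;
    a.name = long_name;
    a.id = 7;
    Record b(std::move(a));
    return b.name == long_name && b.id == 7;
}
```

Under a rule like the one proposed in this comment, a class that omits the two move declarations would simply copy when given an rvalue.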
I see two problems in the samples you provided:
New implicit function can break invariants that class is trying to maintain in all other functions. This can be worked around in the way you said: suppress implicit move for classes with default dtors.
POD types aren’t zeroed-out on move. All the other samples fall under this category. If they were zeroed after a move, no problems would appear (except for really weird examples).
So IMHO there are still ways to keep implicit move in C++0x. Without it, move would be used only in standard library classes, because adding move ctors to every class in a large application is simply too much work.
Suppress implicit move for classes with default destructors? Now that’s an approach I didn’t think to explore! But it wouldn’t work even for the first example in this article. As for the opposite approach, it doesn’t work for any of the other examples. I thought the article demonstrated all of that pretty clearly, but if you can’t see it, I must have some explaining left to do… but I just don’t know what else I need to say.
I don’t know what you mean about POD types being zeroed, but I can promise you, that doesn’t solve anything fundamentally. There’s no a priori reason to think that a zero value lies within any given class’s invariant.
Maybe it would make sense to roll this together with the base_check functionality? That already changes the semantics of classes in a backwards incompatible way, and I think they are looking for a keyword, so it might make sense to activate both semantic changes with the same keyword.
def_class {}; def_struct {};
Could create “new style” classes and structs with both a default move constructor, and support for new override semantics that base_check provides.
Another change that would be good to throw in would be to not make single-argument constructors implicit by default. Instead, make them explicit, and have an implicit keyword for cases where you want that behavior.
Probably this is too big a change for this point in the standardization process though…
Sorry, Brendan; I haven’t meant to ignore you. These are really interesting ideas, but you’re right—it’s way too late for anything that dramatic!