Implicit Move Must Go

Hi everybody. The bulk of this article is going to be a C++ Standards Committee document, but I’m posting it here because of its relevance to the article series on rvalue references we’ve been running here at C++Next. I’ve been promising a followup to that series, and having this article on the site is actually going to be useful when I write that followup.

But first, behold! The latest issue with rvalue references and move semantics…


The final version of this paper, as submitted to the committee, is N3153.

What is Implicit Move?

In an earlier article, we described a surprising effect that occurs when “legacy” C++03 types are combined in the same object with move-enabled C++0x types: the combined type could acquire a throwing move constructor. At that time, we didn’t have a way to implement vector::push_back and a few other important strong-guarantee operations in the presence of a throwing move.
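To make the "throwing move" effect concrete, here is a minimal sketch. Legacy, Movable, and Combined are hypothetical names invented for illustration, not types from the earlier article. The traits and std::move_if_noexcept are the facilities that ended up in the C++11 standard library.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical "legacy" C++03-style type: copyable, no move constructor,
// and a copy constructor that is allowed to throw.
struct Legacy
{
    Legacy() {}
    Legacy(Legacy const&) {}            // not noexcept
};

// Hypothetical move-enabled type with a non-throwing move constructor.
struct Movable
{
    Movable() {}
    Movable(Movable const&) {}
    Movable(Movable&&) noexcept {}
};

// Combining the two: the implicitly generated move constructor has to
// copy the Legacy member, so it is itself potentially throwing.
struct Combined
{
    Legacy old_part;
    Movable new_part;
};

static_assert(std::is_nothrow_move_constructible<Movable>::value,
              "Movable moves without throwing");
static_assert(!std::is_nothrow_move_constructible<Combined>::value,
              "Combined's move constructor may throw");

// std::move_if_noexcept yields a const lvalue reference (selecting a copy)
// when a move might throw and a copy constructor is available.
static_assert(std::is_same<
                  decltype(std::move_if_noexcept(std::declval<Combined&>())),
                  Combined const&>::value,
              "a copy is selected for Combined");
```

Everything here is checked at compile time, so the sketch compiles only if the combined type really did acquire a potentially throwing move constructor.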

Several independent efforts were made to deal with this problem. One approach, while not a complete solution, shrank the problem space considerably and had many other benefits: implicitly generate move constructors and move assignment operators when not supplied by the user, in much the same way that we do for copy constructors and copy assignment operators. That is the implicit move feature, and it was voted into the “working paper” (draft standard) this past spring.

Further Background

Another effort, spurred on by Rani Sharoni’s comments at C++Next, finally yielded a complete, standalone solution to the problem, a form of which was eventually also adopted by the committee. Nobody attempted to “repeal” the implicit generation of move operations from the standard document, not least because of those “many other benefits” I alluded to earlier. But now we have a problem that we hadn’t anticipated.

The Problem

Back in August, Scott Meyers posted to comp.lang.c++ about a problem where implicit generation of move constructors could break C++03 class invariants. For example, the following valid C++03 program would be broken under the current C++0x rules (and indeed, is broken in g++4.6 with --std=c++0x) by the implicit generation of a move constructor:

#define _GLIBCXX_DEBUG
#include <iostream>
#include <vector>
struct X
{
    // invariant: v.size() == 5
    X() : v(5) {}
 
    ~X()
    {
        std::cout << v[0] << std::endl;
    }
 
 private:    
    std::vector<int> v;
};
 
int main()
{
    std::vector<X> y;
    y.push_back(X()); // X() rvalue: copied in C++03, moved in C++0x
}

The key problem here is that in C++03, X had an invariant that its v member always had 5 elements. X::~X() counted on that invariant, but the newly-introduced move constructor moved from v, thereby setting its length to zero.
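The effect can be observed directly with a reduced sketch. Five and size_after_move are hypothetical names, not part of the program above. The reduction deliberately omits the destructor, on the assumption that the compiler follows the rules that eventually shipped in C++11, where a user-declared destructor suppresses the implicit move constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reduction of X: no user-declared destructor or copy
// operations, so a move constructor is generated implicitly.
struct Five
{
    Five() : v(5) {}                         // intended invariant: v.size() == 5
    std::size_t size() const { return v.size(); }
 private:
    std::vector<int> v;
};

// Returns the size seen in the source object after it has been moved from.
std::size_t size_after_move()
{
    Five a;
    Five b(std::move(a));                    // memberwise move of the vector
    assert(b.size() == 5);                   // the destination got the elements
    return a.size();                         // the source no longer has them
}
```

With the mainstream standard library implementations, size_after_move() returns 0, so any member function of Five that relies on the five-element invariant is no longer safe to call on the moved-from object.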

Tweak #1: Destructors Suppress Implicit Move

Because rvalues are generally “about to be destroyed,” and the broken invariant was only detected in X’s destructor, it’s tempting to think that we can “tweak” the current rules by preventing the generation of implicit move constructors when a user-defined destructor is present. However, the following example would still break (and breaks under g++4.6 with --std=c++0x):

#define _GLIBCXX_DEBUG
#include <algorithm>
#include <iostream>
#include <vector>
 
struct Y
{
    // invariant: values.size() > 0
    Y() : values(1, i++) {}
    Y(int n) : values(1,n) {}
 
    bool operator==(Y const& rhs) const
    { 
        return this->values == rhs.values;
    }
 
    int operator[](unsigned i) const
    { return values[i]; }
 
 private:
    static int i;
    std::vector<int> values;
};
 
int Y::i = 0;
 
int main()
{   
    Y ys[10];
    std::remove(&ys[0], &ys[0]+10, Y(5));
    std::cout << ys[9][0];
}

In C++03, there’s no way to create a Y with a zero-length values member, but because std::remove is allowed to use move operations in C++0x, it can leave a moved-from Y at the end of the array, and that could be empty, causing undefined behavior in the last line.
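A small sketch can make the leftover element visible. Tracked and tail_is_moved_from are hypothetical names introduced only for this illustration. The moved_from flag is bookkeeping added to the element type so that the effect of std::remove can be inspected afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical element type that records whether it has been moved from.
struct Tracked
{
    int value;
    bool moved_from;

    explicit Tracked(int v = 0) : value(v), moved_from(false) {}
    Tracked(Tracked const&) = default;
    Tracked& operator=(Tracked const&) = default;

    Tracked(Tracked&& rhs) noexcept : value(rhs.value), moved_from(false)
    { rhs.moved_from = true; }

    Tracked& operator=(Tracked&& rhs) noexcept
    {
        value = rhs.value;
        moved_from = false;
        rhs.moved_from = true;
        return *this;
    }

    bool operator==(Tracked const& rhs) const { return value == rhs.value; }
};

// Removes 5 from {1, 5, 2} and reports whether the element left past the
// new logical end was moved from.
bool tail_is_moved_from()
{
    Tracked xs[3] = { Tracked(1), Tracked(5), Tracked(2) };
    Tracked* new_end = std::remove(xs, xs + 3, Tracked(5));
    assert(new_end == xs + 2);                       // two elements survive
    assert(xs[0].value == 1 && xs[1].value == 2);    // in their original order
    return xs[2].moved_from;                         // state of the leftover slot
}
```

The standard only says the tail elements have unspecified values. With libstdc++ and libc++, which shift elements forward by move assignment, tail_is_moved_from() returns true.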

(A note about the use of std::remove in these examples appears at the end of this article.)

Tweak #2: Constructors Suppress Implicit Move

It’s also tempting to think that at least in classes without a user-defined constructor, we could safely conclude that there’s no intention to maintain an invariant, but that reasoning, too, is flawed:

#define _GLIBCXX_DEBUG
#include <iostream>
#include <vector>
 
// An always-initialized wrapper for unsigned int
struct Number
{
    Number(unsigned x = 0) : value(x) {}
    operator unsigned() const { return value; }
 private:
    unsigned value;
};
 
struct Y
{
    // Invariant: length == values.size().  Default ctor is fine.
 
    // Maintains the invariant
    void resize(unsigned n)
    {
        std::vector<int> s(n);
        swap(s,values);
        length = Number(n);
    }
 
    bool operator==(Y const& rhs) const
    { 
        return this->values == rhs.values;
    }
 
    friend std::ostream& operator<<(std::ostream& s, Y const& a)
    {
        for (unsigned i = 0; i < a.length; ++i)
            s << a.values[i] << " ";
        return s;
    };
 
 private:
    std::vector<int> values;
    Number length;
};
 
int main()
{   
    std::vector<Y> z(1, Y());
 
    Y a;
    a.resize(2);
    z.push_back(a);
 
    std::remove(z.begin(), z.end(), Y());
    std::cout << z[1] << std::endl;
}

In this case, the invariant that length == values.size() was established by the well-understood default-construction behavior of subobjects, but implicit move generation has violated it.
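For contrast, here is a sketch of what a class author would have to write by hand to keep such an invariant intact across a move. This is not something the paper proposes. Sized and source_keeps_invariant are hypothetical names, and a plain unsigned stands in for the Number wrapper to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical variant of Y that spells out its own move constructor so
// that a moved-from object still satisfies length == values.size().
class Sized
{
 public:
    Sized() : length(0) {}
    Sized(Sized const&) = default;
    Sized& operator=(Sized const&) = default;

    Sized(Sized&& rhs)
      : values(std::move(rhs.values)), length(rhs.length)
    {
        // Re-establish the invariant in the source from whatever state
        // the vector move left behind.
        rhs.length = static_cast<unsigned>(rhs.values.size());
    }

    // Maintains the invariant
    void resize(unsigned n)
    {
        std::vector<int> s(n);
        values.swap(s);
        length = n;
    }

    bool invariant_holds() const { return length == values.size(); }

 private:
    std::vector<int> values;
    unsigned length;
};

// Moves from a resized object and checks both sides of the move.
bool source_keeps_invariant()
{
    Sized a;
    a.resize(2);
    Sized b(std::move(a));
    assert(b.invariant_holds());
    return a.invariant_holds();
}
```

The point of the sketch is that the fix-up line in the move constructor encodes knowledge of the invariant, which is exactly what an implicitly generated memberwise move cannot have.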

Tweak #3: Private Members Suppress Implicit Move

It’s also tempting to think that we can use private members to indicate that an invariant needs to be preserved; that a “C-style struct” is not encapsulated and has no need of protection. But the members of a privately-inherited struct are effectively encapsulated and private with respect to the derived class. It is not uncommon to see members moved into an implementation detail struct, which is then used as a base class:

// Modified fragment of previous example.  Replace the definition
// of Y with this code.
 
namespace detail
{
  struct Y_impl
  {
      std::vector<int> values;
      Number length;
  };
}
 
struct Y : private detail::Y_impl
{
    void resize(unsigned n)
    {
        std::vector<int> s(n);
        swap(s,values);
        length = Number(n);
    }
 
    bool operator==(Y const& rhs) const
    { 
        return this->values == rhs.values;
    }
 
    friend std::ostream& operator<<(std::ostream& s, Y const& a)
    {
        for (unsigned i = 0; i < a.length; ++i)
            s << a.values[i] << " ";
        return s;
    };
};

Real-World Examples

Of course these examples are all somewhat contrived, but they are not unrealistic. We’ve already found classes in the (new) standard library, namely std::piecewise_linear_distribution::param_type and std::piecewise_linear_distribution, that have been implemented in exactly such a way as to expose the same problems. In particular, they were shipped with g++4.5, which had no implicit move, and not updated for g++4.6, which did. Thus they were broken by the introduction of implicitly-generated move constructors.

Summary

I actually like implicit move. It would be a very good idea in a new language, where we didn’t have legacy code to consider. Unfortunately, it breaks fairly pedestrian-looking C++03 examples. We could continue to explore tweaks to the rules for implicit move generation, but each tweak we need to make eliminates implicit move for another category of types where it could have been useful, and weakens confidence that we have analyzed the situation correctly. And it’s very late in the standardization process to tolerate such uncertainty.

Conclusions

It is time to remove implicitly-generated move operations from the draft. That suggestion may seem radical, but implicit move was proposed very late in the process on the premise that it “treated…the root cause” of the exception-safety issues revealed in N2855. However, it did not treat those causes: we still needed noexcept. Therefore, implicitly-generated move operations can be removed without fundamentally undermining the usefulness or safety of rvalue references.

The default semantics of the proposed implicit move operations are still quite useful and commonly-needed. Therefore, while removing implicit generation, we should retain the ability to produce those semantics with “= default.” It would also be nice if the rules allowed a more concise way to say “give me all the defaults for move and copy assignment,” but this paper offers no such proposal.
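As a sketch of what that explicit opt-in looks like, the following uses the "= default" syntax for all four copy and move operations. Record and moved_sample_count are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that asks for memberwise copy and move explicitly.
struct Record
{
    Record() = default;
    Record(Record const&) = default;
    Record& operator=(Record const&) = default;
    Record(Record&&) = default;               // memberwise move construction
    Record& operator=(Record&&) = default;    // memberwise move assignment

    std::string name;
    std::vector<int> samples;
};

// Moves a Record and returns how many samples arrived in the destination.
std::size_t moved_sample_count()
{
    Record r;
    r.samples.assign(3, 42);
    Record s(std::move(r));                   // uses the defaulted move constructor
    assert(s.samples[0] == 42);
    return s.samples.size();
}
```

Here moved_sample_count() returns 3. The five defaulted declarations are the boilerplate that a more concise "all the defaults" spelling would replace.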


Be Back Soon

Well, that concludes the committee paper. For all you C++Next’ers out there, I promise (again!) to put this proposal in context of our rvalue references series, Real Soon Now.

Note: about the use of std::remove in these examples

In C++03, std::remove eliminates values from a range by assigning over them. Since it can’t actually change sequence structure, it assigns over the unwanted elements with values from later in the sequence, pushing everything toward the front until there’s a subrange containing only what’s desired, and returns the new end iterator of that subrange. For example, after removing 0 from the sequence 0 1 2 0 5, we’d end up with 1 2 5, and then 0 5: the last two elements of the sequence would be unchanged.

In C++0x, we have move semantics, and std::remove has permission to use move assignment. So in C++0x, we’d end up with 1 2 5 0 x at the end of the sequence, where x is the value left over after moving from the last element—if the elements are ints, that would be 5, but if they are BigNums, it could be anything.
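The int case described above can be sketched as follows. remove_zeros is a hypothetical helper name. It returns only the kept prefix, because the values left past the returned iterator are not something portable code should rely on.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes every 0 from the sequence 0 1 2 0 5 and returns the kept prefix.
std::vector<int> remove_zeros()
{
    int xs[] = { 0, 1, 2, 0 , 5 };
    int* new_end = std::remove(xs, xs + 5, 0);
    assert(new_end - xs == 3);                // three elements survive
    // Only [xs, new_end) is meaningful; the tail holds leftover values.
    return std::vector<int>(xs, new_end);
}
```

The returned vector holds 1 2 5. The usual erase-remove idiom exists precisely to discard the leftover tail when the sequence is a real container.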

There’s another way moved-from values can be exposed to C++03 code running under C++0x: an algorithm such as sort can throw an exception while shuffling elements, and you can then observe a state where not everything has been moved back into place. Showing that just makes for more complicated examples, however.

Posted Monday, October 11th, 2010 under Value Semantics.

60 Responses to “Implicit Move Must Go”

  1. Martin Ba says:

    @Dave – Since you linked this example on April 3rd 2012 … is this still something you think might be changed in a future standard? (I mean, we do have implicit move in the current C++11 now, do we?)

    Has anyone come up with a sane compiler-warning regarding this issue?

    cheers, Martin

    • We do have it in C++11. I don’t think it’ll be changed—the damage has been done; can’t un-ring that bell. I don’t know about the compiler warning; I haven’t seen one

  2. Note: real-world manifestation of this issue can be observed at http://j.mp/implicit-move-bites-man

  3. CM says:

    (I know it is a very late comment, but I’ll leave it here anyway) All this happens due to one blindingly obvious problem — C++11 move semantic is not move semantic. It is a move-and-init semantic! I was really disappointed when I realized that there is no move semantic in upcoming C++ standard (and likely will never be, thanks to rvalue refs). All these move constructors not only move data, they also initialize source. It is quite trivial (and desirable) to implicitly generate ‘move’ part, but it is impossible to generate ‘init’ part without any knowledge about object’s invariant.

    Ideal solution should be to implement proper move semantic, but it is likely not possible — C++ is essentially a language for building stack machine that requires each object on stack to be ‘alive’ until it is destroyed due to execution leaving the scope (or stack unwinding). Therefore correct solution would be to introduce new type of constructor — init constructor, which will create an instance of object (without violating invariant and without throwing). Implicitly generated move-init constructor will use init constructor for ‘init’ part. If init ctor is unavailable — this will suppress generation of move ctor, we could declare that init ctor is present if default ctor is present and declared nothrow, and etc. All problems mentioned in this article go away once you add concept of init ctors into the language.

    • I’m personally not convinced it’s that simple, nor that your idea of move independent from initialization makes any sense…but it still might. I think you’d need more room than comments on this blog offer to explain it fully, though. If you write an article somewhere, I’ll certainly read it :-)

      • CM says:

        nice try, but I am really not interested in becoming a famous author at this moment :-)

        Idea is pretty simple: move ctor takes src as const& and constructs a new object; before move ctor completes object is considered ‘partially moved’, after it is complete — target become constructed and src becomes invalid and needs to be init-ed to some valid state via init ctor. Hopefully compiler can recognize cases when init call can be skipped and programmer has ways to forbid or enforce init ctor call. Exceptions make this bit more complicated, but it is still manageable.

        You know my email, ask me specific questions if smth does not make any sense.

        Btw, I do not like implicitly generated ctors/etc. I’d rather ask for them explicitly. :-)

        • Sounds like what you’re describing basically amounts to the “destructive move semantics” idea, which has been proposed many times (even in this thread). Among those of us actually designing the feature, nobody agreed with your assessment that it is “manageable.” I would want to see a rigorous explanation of how things like this work without introducing unreasonable inefficiencies:

          void f(std::vector<X>& v)
          {
              X a;
              if (random()) v.push_back(std::move(a));
              // lots of other code here...
           
              // Does a's destructor run?
          }
          • CM says:

            I saw this paper before, but it does not look convincing to me. I see nothing wrong with partially moved state (equivalent of ‘lame-duck’ state mentioned in paper) — it is the same thing as ‘partially constructed’ state used in cctor (copy constructor). But you are missing main point of my argument — I do not argue for ‘destructive move’, it is probably not possible to implement efficiently (as I noted in ‘C++ is a language for building stack machines’ remark). Idea is to recognize that our ‘move’ is a ‘move-and-init’ operation and separate those on language level, thus giving compiler chance to avoid ‘init’ portion where possible (and giving developer a way to avoid or enforce init ctor call).

            About your example with f(vector&) — yes, X’s dtor should be called, but compiler should be allowed to elide it, if it deems it necessary (either by introducing hidden flag on stack or if it could clearly separate execution paths for two cases — one where X is moved, another when it is not). From developer’s perspective, once value is moved out of variable, variable still holds valid value (according to init ctor, unless developer explicitly requested destructive move), but compiler has an option to detect that after move variable is unused and drop init ctor call (if circumstances allow). I.e. it is smth like NRVO — if you properly structure your code, compiler will make it faster by eliding a thing or two.

            Dave, if you want detailed discussion — send me a email, commenting blog is quite an inconvenient format. :)

          • CM:I see nothing wrong with partially moved state (equivalent of ‘lame-duck’ state mentioned in paper) — it is the same thing as ‘partially constructed’ state used in cctor (copy constructor).

            There is no “partially constructed” state. Either the object’s ctor has completed and it exists, or it hasn’t completed and it doesn’t exist (the fact that you can call member functions on a nonexistent object from its constructor is bizarre, but it doesn’t make the model fall apart).

            But you are missing main point of my argument — I do not argue for ‘destructive move’, it is probably not possible to implement efficiently (as I noted in ‘C++ is a language for building stack machines’ remark). Idea is to recognize that our ‘move’ is a ‘move-and-init’ operation and separate those on language level, thus giving compiler chance to avoid ‘init’ portion where possible (and giving developer a way to avoid or enforce init ctor call).

            I have no idea what you mean by “move” and “init” as separate ideas, since you haven’t defined them.

            About your example with f(vector&) — yes, X’s dtor should be called, but compiler should be allowed to elide it, if it deems it necessary (either by introducing hidden flag on stack or if it could clearly separate execution paths for two cases — one where X is moved, another when it is not).

            First, whether you are proposing destructive move or not, hidden flags and execution path separation are usually the first things people reach for when sketching out how to handle destructive move. That’s why I think this amounts to the same proposal.
            Second, there are plenty of cases for which neither a hidden flag nor execution path separation are feasible without unreasonable inefficiencies. Consider, for example that a program may move at random from a vector<X>.

            Dave, if you want detailed discussion — send me a email, commenting blog is quite an inconvenient format.

            Sorry for the inconvenience, but I want to keep discussion where the community can benefit. Thanks for posting!

          • CM says:

            There is no “partially constructed” state

            Then why it is mentioned in 15.2p2 of C++ 2003?

            I have no idea what you mean by “move” and “init” as separate ideas, since you haven’t defined them.

            I apologize if my explanations were not clear, I’ll try again: - move ctor takes “X const& src” and creates a new value of type X (similar to cctor) - source variable becomes destroyed once target variable becomes fully constructed - since (in general case) we can’t leave it like this, compiler needs a way to construct a new valid value at ‘src’ address — init ctor (that does not throw) - standard grants compiler freedom in eliding ‘init’ step if it could prove that it is not required

            What is not clear? You don’t expect me to describe it in a language that could be copy-pasted to standard, do you? :-)

            That’s why I think this amounts to the same proposal.

            Maybe, call it ‘yet another destructive move idea +’, I do not really care… Original point was that adding init ctor concept solves all problems mentioned in your blog post.

            Consider, for example that a program may move at random from a vector

            Can’t see any problems… If compiler can’t find easy way to get rid of init+destroy calls — he’ll just leave them in place.

            Sorry for the inconvenience, but I want to keep discussion where the community can benefit.

            This blog entry is more than 1 year old — there is no community. Maybe move this discussion to comp.std.c++ or comp.lang.c++.moderated?

          • James Hopkin says:
            This blog entry is more than 1 year old — there is no community. Maybe move this discussion to comp.std.c++ or comp.lang.c++.moderated?

            It may not be huge or hugely active, but there’s a community. I don’t think I’m alone in having learnt a lot about C++11 from both the articles and discussions in the comments on this site. Thanks for contributing.

          • CM says:

            Can’t believe someone actually reads or tracks comments to 1 year old blog post… :-)

          • ech says:
            CM: Can’t believe someone actually reads or tracks comments to 1 year old blog post…

            Yeah, who would do that?

          • Marc says:
            CM: Can’t believe someone actually reads or tracks comments to 1 year old blog post…

            Well, they set up a nice global RSS for the whole site…

          • CM says:

            Dave, you do not want to respond? but what about community? I am sure they are looking forward for us to continue… ;-)

  4. Clinton says:

    Is perhaps the problem not with implicit move, but std::remove? Maybe std::remove should not call std::move but instead call “std::explicit_move” (which does a copy if the move is implicit, though I’m unsure how to implement this). Any existing code (e.g. std::remove) should call std::explicit_move if the “moved from” object has any chance of being accessed again. Things like std::sort should be able to use “std::move”, as there is no chance of the user accessing a “moved from” object after the call.

    Making this change, along with destructors suppressing implicit move, are there any other issues with implicit move (particularly, any issues that don’t rely on modified behavior of the STL)?

  5. Bourez says:

    Considering that the construction of an object is the memory allocation followed by the invocation of its constructor, and then that the destruction is the sequence of the destructor and the memory deallocation, could we imagine that the destructor of an object left after a move operation is not invoked (the memory deallocation remaining to be performed)? After all, the object being moved in memory, why should we consider the leftover object to be a “destructible” thing?

  6. What was the argument for allowing move-ops to throw, again?

    I have a hard time remembering things that are so utterly meaningless from a technical POV. Or perhaps I’m just getting too old. But anyway, can’t remember that particular mumbo-jumbo.

    But sometimes it pops up in discussions, and then all I can say to explain it is that it apparently was some kind of obscure committee politics, so, what was purported technical argument?

    • Not sure why you call it “meaningless from a technical POV”, and actually it was as close to politics-free as anything we do in the committee ever gets. I’ve been promising to write a follow-up article about this for some time, but until then, the rationale is here.

      Referring to the technical reasons as “purported,” when you don’t actually understand what they were, unfairly discredits the good work done by several people in getting us to the right decision.

  7. Achilleas Margaritis says:

    Swapping could also be used instead of moving, and it would solve the problem of invariants. I wrote a more detailed analysis, if anyone is interested:

    http://thegreatlambda.blogspot.com/2010/10/c-move-semantics-alternative-that.html

    • I don’t understand why you would bother with such an analysis if moving is of no real benefit ;-)

      Using default-construct plus swap is a well-known idea—it even makes some of the exception-safety issues go away for, e.g., resizing vectors, but

      1. it’s not a universal solution because the type must have a default constructor, and
      2. your nice article is misleading in that it speculates that swap could be more efficient, when in fact, it’s significantly less efficient to default construct and swap than to perform a real move in most cases. Swap can’t “equal move” in that sense.
      • Achilleas Margaritis says:
        I don’t understand why you would bother with such an analysis if moving is of no real benefit

        It’s of no real benefit since managing complex data structures by value is not something done often. It is known that the excessive copying will slow down a program, so people took care not to return complex data structures by value or store them by value in containers.

        But now everyone seems to want to do that (I don’t), so we are searching for a solution.

        Using default-construct plus swap is a well-known idea—it even makes some of the exception-safety issues go away for, e.g., resizing vectors

        I didn’t know that.

        it’s not a universal solution because the type must have a default constructor

        Perhaps it’s better to require objects to be default-constructible instead of the opposite: not providing a default constructor will lead to a compile-time error from the compiler, whereas invariants violation will go in silent until the problem manifests itself at run-time.

        Furthermore, I don’t think that requiring objects to be default-constructible is such a bad idea. I think that this will create the minimum of problems, i.e. some trivial redesign of classes, in most cases: a properly-designed class that acquires a resource should check if the acquisition is successful, and so it will already do this check in its destructor.

        your nice article is misleading in that it speculates that swap could be more efficient, when in fact, it’s significantly less efficient to default construct and swap than to perform a real move in most cases. Swap can’t “equal move” in that sense.

        Indeed, but what about assignment? In the case of assignment, a class has to clean itself up before moving the data of the source object to itself.

        • Perhaps it’s better to require objects to be default-constructible instead of the opposite: not providing a default constructor will lead to a compile-time error from the compiler, whereas invariants violation will go in silent until the problem manifests itself at run-time.

          Yeah, if we were designing a language from scratch, that would be an option. The whole point of this article is that implicit move breaks legacy code. If we make it illegal to have no default constructor, we’ll break a whole lot more of it.

          Furthermore, I don’t think that requiring objects to be default-constructible is such a bad idea. I think that this will create the minimum of problems, i.e. some trivial redesign of classes, in most cases: a properly-designed class that acquires a resource should check if the acquisition is successful, and so it will already do this check in its destructor.

          Except that delete already checks so why waste a check? :-)

          your nice article is misleading in that it speculates that swap could be more efficient, when in fact, it’s significantly less efficient to default construct and swap than to perform a real move in most cases. Swap can’t “equal move” in that sense.
          Indeed, but what about assignment? In the case of assignment, a class has to clean itself up before moving the data of the source object to itself.

          I guess you never read “Your Next Assignment…”?

        • Achilleas Margaritis: “It’s of no real benefit since managing complex data structures by value is not something done often. It is known that the excessive copying will slow down a program, so people took care not to return complex data structures by value or store them by value in containers.”

          This is a circular argument. When managing complex structures by value is cheap, the barrier to that technique will be removed and people can go ahead and use it. There are countless examples of how this makes programs more succinct and readable.

  8. Roger Pate says:

    Explicit is better than implicit.

    Defaulting doesn’t seem that onerous, relatively speaking. A set of standard macros would quell many objections (but raise some macro-related concerns):

    struct Name {
      STD_MOVE_CTOR(Name)   // expands to defaulted move ctor
      STD_MOVE_ASSIGN(Name) // same for op=
      //STD_MOVEABLE(Name)  // combine above two
     
      // s/MOVE/COPY/ for 3 more
     
      //STD_COPYMOVEABLE(Name)  // for all 6
    };
    • *Explicit is better than implicit.*

      Except when it results in an onerous amount of boilerplate. I don’t know how often I’ll want the semantics of = default for move construction and assignment, but if it turns out to be common, I know I’ll be using a macro.

  9. Michal Mocny says:

    Very valid points raised, but this is still just a plain old depressing proposal. Another case of “err-on-the-side-of-caution” inhibiting actually interesting development.

    I selfishly still hope to see default move constructors/assignment, since the issues raised would not affect me personally, and the loss of the feature already would. Though I also selfishly think that the next-decade worth of code is more important than the decade-prior, so there’s that.

    • Very valid points raised, but this is still just a plain old depressing proposal. Another case of “err-on-the-side-of-caution” inhibiting actually interesting development.

      Welcome to standardization and the real world ;-)

      However, I do hope to start an article series here soon that will be… how shall I say it… more “freeing.”

  10. Achilleas Margaritis says:

    Is moving that significant? I’ve never seen a case where a complex class instance is returned as a result via the return statement. In most cases, the result complex class instance is passed as an argument to be filled by the function.

    The academic community is trying to get away from destructive updates as much as possible, due to being difficult to reason about.

    Personally, I’d make STL container classes (since most of the move concept is about returning STL containers) value-type classes that shared their internals via reference counting. Granted, it’s not that good as move from a performance point of view, but it’s way simpler and much easier to reason about (and it wouldn’t require the language to have rvalue references).

    • Is moving that significant?

      Heck, yeah, it does! Oh, did I neglect to post my graphs of Howard’s test results? Hmm… Oh, and his link to the code is down! I’ll try to remedy that and get something posted here ASAP

      I’ve never seen a case where a complex class instance is returned as a result via the return statement. In most cases, the result complex class instance is passed as an argument to be filled by the function.

      Exactly. Have you read this article? It’s time to stop being afraid of pass-by-value.

      The academic community is trying to get away from destructive updates as much as possible, due to being difficult to reason about.

      That’s not new, but total immutability can be a very costly paradigm. The underlying machine model does mutation, and once you remove that from the programming language, optimizers are, in general, not smart enough to get the efficiency back by rewriting non-mutating code as mutating. Mutable value semantics, as supported by C++, occupies a very promising middle ground between sharing everything as in Java, and pure functional programming as in Haskell.

      Personally, I’d make STL container classes (since most of the move concept is about returning STL containers)

      It isn’t! They just make convenient demonstrations.

      value-type classes that shared their internals via reference counting.

      Then you’re talking about total immutability (or they won’t act like value types). And regardless of that, the reference counts mutate, so you need synchronization, which makes that model very bad for multithreaded programs.

      • Oh, did I neglect to post my graphs of Howard’s test results? Hmm… Oh, and his link to the code is down! I’ll try to remedy that and get something posted here ASAP

        Done. Please see the latest posting

      • Achilleas Margaritis says:
        Heck, yeah, it does!

        That is only if you return complex classes by value.

        Exactly. Have you read this article? It’s time to stop being afraid of pass-by-value.

        Nice article, but misleading.

        First of all, there is no referential transparency in the move example, because the return value is modified.

        Secondly, if you want to create a const vector of strings, it means this code is invoked once in a program, so it’s not a bottleneck that would have to be optimized.

        Thirdly, the code didn’t grow 150%, the example code did. In real-life situations, declaring local variables to pass as out parameters to functions does not increase the code by …150% (!!!).

        Fourthly, you say “we no longer have value semantics”, as if that mattered, but you don’t say why it matters.

        Mutable value semantics, as supported by C++, occupies a very promising middle ground between sharing everything as in Java, and pure functional programming as in Haskell.

        Promising for whom? The developers? The compiler writers? You are not clear. Does it help the compiler perform better optimizations? Does it help the developers reason more easily about the program? I really doubt the latter.

        Then you’re talking about total immutability (or they won’t act like value types).

        I am not sure the degree of immutability introduced by c++ move concept can lead to the same optimizations as total immutability. Generally speaking, when something is mutable, the compiler can’t do the best optimizations it can do. I think total immutability leads to better optimizations.

        And regardless of that, the reference counts mutate, so you need synchronization, which makes that model very bad for multithreaded programs.

        It’s not ‘very bad’, unless you have a thread in your program that continuously modifies reference counts. The atomic increment/decrement overhead in real situations is negligible, and you are going to have it anyway, even in the presence of moving, if you want to use shared pointers and threads.

        Personally, I don’t see why moving is of any real benefit. It’s not going to solve an existing problem, is it? Nobody in their right mind ever returned big complex data structures by value. So there wasn’t a problem, so what does moving solve for us? Nothing, in reality.

        The next years are going to be quite interesting. I understand the excitement from the move concept, and I sincerely hope there is no trouble from moving things around. I certainly will hate it when I come back to code after a month and, having forgotten a move somewhere, my code crashes without explanation. I will also hate it if another developer in the project silently introduces a move, thinking that the moved data are not used anywhere else, and then the code crashes again.

        • Rob says:
          Secondly, if you want to create a const vector of strings, it means this code is invoked once in a program, so it’s not a bottleneck that would have to be optimized.

          The code in the example might be reading the contents of a file on some periodic basis. It might be doing a database lookup to get a dynamic set of names. You cannot know whether such a vector of strings need be created just once. That the vector is const merely means that the code in the function won’t modify it.

          Thirdly, the code didn’t grow 150%, the example code did. In real-life situations, declaring local variables to pass as out parameters to functions does not increase the code by …150% (!!!).

          Noting that the variation requires an additional line of code would be a legitimate way to express the same thing without risk of hyperbole.

          Fourthly, you say “we no longer have value semantics”, as if that mattered, but you don’t say why it matters.

          Value semantics make writing efficient code clearer. Naive C++ is inefficient because of the glut of temporaries created. Without value semantics, one must introduce other techniques to eliminate the temporaries, such as non-const references (which are unclear from the caller’s perspective), expression templates, etc.

          I am not sure the degree of immutability introduced by c++ move concept can lead to the same optimizations as total immutability. Generally speaking, when something is mutable, the compiler can’t do the best optimizations it can do. I think total immutability leads to better optimizations.

          Unsubstantiated.

          It’s not ‘very bad’, unless you have a thread in your program that continuously modifies reference counts. The atomic increment/decrement overhead in real situations is negligible, and you are going to have it anyway, even in the presence of moving, if you want to use shared pointers and threads.

          There are many “real situations” in which manipulations of containers or objects occur frequently and would lead to reference count manipulations. A library writer cannot know client usage patterns a priori and must, therefore, make things as efficient as possible.

          I certainly will hate it when I come back to code after a month and, having forgotten a move somewhere, my code crashes without explanation. I will also hate it if another developer in the project silently introduces a move, thinking that the moved data are not used anywhere else, and then the code crashes again.

          That is the likely outcome of implicit move, but shouldn’t be the case for explicit move operations.

        • I am not sure the degree of immutability introduced by c++ move concept can lead to the same optimizations as total immutability. Generally speaking, when something is mutable, the compiler can’t do the best optimizations it can do. I think total immutability leads to better optimizations.

          It is true that when the compiler can be absolutely sure that no data changes, it generally has an easier time optimizing code that’s written within those constraints. So if you want the best optimization results for a general pure-functional program I’ve no doubt that a compiler for a purely-functional language could produce better results than another. However:

          1. C++ is never going to be purely-functional
          2. Constraining oneself to total immutability makes some problems much harder to solve and/or solve efficiently
          3. The underlying machine model is mutable (we don’t generally use write-once memory) and the best solutions often must use that fact to avoid exploding in space or time. Sometimes a compiler can rewrite total immutability to take advantage of the underlying machine’s mutability, but not always: constraining the high-level program to immutability makes the compiler’s optimization job much harder in those cases.
          4. The way C++ is defined, most of the time you see const it doesn’t mean the thing can’t actually be changed, because the const is attached to a pointer or reference and the same object could be referred to as non-const elsewhere.
          5. Passing by value is in fact the mechanism in C++ that allows the compiler to assume there are no other references and thus apply many of the same kinds of optimizations one can make when one knows values are truly immutable
          6. Moving is the mechanism that makes passing by-value efficient.
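
Points 5 and 6 can be illustrated with a minimal sketch. The type `NameList` and the helper `demo_pass_by_value` are hypothetical names invented for this example; they do not come from the article. The idea is only that a by-value parameter belongs to the callee alone, so moving out of it cannot be observed by anyone else.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink type: takes its argument by value, then moves it into place.
class NameList {
public:
    // By value: the parameter is a fresh object that nobody else can refer to.
    explicit NameList(std::vector<std::string> names)
        : names_(std::move(names)) {}  // so moving out of it is unobservable to callers

    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};

// Hypothetical helper, used only to exercise the sketch.
inline std::size_t demo_pass_by_value() {
    std::vector<std::string> source{"alice", "bob", "carol"};

    NameList copied(source);            // lvalue argument: copied, source is untouched
    assert(source.size() == 3);

    NameList moved(std::move(source));  // rvalue argument: the buffer is transferred
    return copied.size() + moved.size();
}
```

The caller decides whether a copy happens: passing an lvalue copies once, passing an rvalue copies nothing.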
  11. Howard Hinnant says:

    Three nits about your std::remove pop-up:

    1. The moved from sequence isn’t 1 2 5 0 x, it is 1 2 5 x x

    2. C++03 doesn’t specify that assignment will be used, though that is the common implementation technique. swap could also be used to assign the new elements, and would even be beneficial in cases where swap is cheaper than assignment. Therefore the resultant C++03 sequence is also 1 2 5 x x, but ‘x’ means unspecified value in C++03.

    3. remove isn’t poorly specified (as asserted by Marc), nor is it the only std::algorithm with this characteristic (as implied by Marc). There is also remove_if and unique, all of which “shorten” sequences leaving zero or more unspecified values at the end.
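
As a hedged illustration of the third nit, here is the usual way to consume the result of std::remove without ever reading the unspecified tail. The function name `remove_zeros` and the input values in the usage note are made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: drops every 0 from a copy of the input.
inline std::vector<int> remove_zeros(std::vector<int> v) {
    // std::remove only shifts the kept elements to the front and returns the new logical end.
    auto new_end = std::remove(v.begin(), v.end(), 0);

    // The kept range [begin, new_end) is well defined and contains no zeros.
    assert(std::find(v.begin(), new_end, 0) == new_end);

    // Elements in [new_end, end) hold unspecified values, so we erase them without reading them.
    v.erase(new_end, v.end());
    return v;
}
```

Called with 1 0 2 0 5, this returns 1 2 5. Whatever values sat in the two trailing slots before the erase is exactly the part the specification leaves open.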

    • That’s two nits about the pop-up and one beef with Marc ;-)

      But why would remove ever swap with or move from that 2nd zero? I mean, I get that it’s allowed to, but I’d also like to know what’s realistic.

    • Marc says:
      remove isn’t poorly specified (as asserted by Marc), nor is it the only std::algorithm with this characteristic (as implied by Marc). There is also remove_if and unique, all of which “shorten” sequences leaving zero or more unspecified values at the end.

      Sorry, that’s not exactly what I meant. It looked from the article that the problem came more from std::remove giving people too high expectations about the state of the result than from implicit move (“poorly specified” is really about the impression the article gave, not about the standard, sorry if I gave the impression I was attacking anyone’s work (and I am not attacking the article either, I am asking for more)). Thank you for pointing out unique as another example (remove_if and remove count as one to me), I was precisely asking for the number of functions affected with my scope comment. And if it is only these two, I am wondering whether it wouldn’t be better to change their C++0X specification (somehow force them to use copy or swap or non-implicit move) and possibly provide fast alternatives documented with a big warning sign.

      Now the throwing case is more worrying. Although I find it strange that elements involved in an algorithm that threw should be allowed to be used for anything besides assignment or destruction.

      I understand the goal of keeping objects from ever being in a state they couldn’t have been in in C++03. It is just that to me, moved-from objects shouldn’t be used anyway, so unless they have some special destructor or assignment operator, it shouldn’t matter.

      Weighing the pros and cons is something you’ve both already done, I am just trying to catch up and expect I’ll end up agreeing with you. Defending the opposite position seems like a good way to get all the arguments in favor, although it’s hard to do without sounding aggressive or disparaging.

      • “poorly specified” is really about the impression the article gave, not about the standard, sorry if I gave the impression I was attacking anyone’s work (and I am not attacking the article either, I am asking for more)…

        Could you be more specific? We aim to please!

        There’s no umbrage here, bro; please, keep it up! BTW, exception-safety guarantees for C++03 and move semantics for C++0x were both specifically designed not to create some special zombie state—it turns out you basically never need to do that, and if you do, in the end, you’re only punishing yourself.

  12. Marc says:

    Is that really the best solution? I can understand not generating a move constructor when there is a destructor. But the examples with std::remove are not convincing at all. Are there other circumstances where it may fail? After reading this post, the impression I get is that there is one badly specified algorithm in the library that can be abused to cause “bad things”. My reading of the definition of std::remove is that it doesn’t guarantee anything about the extra elements and using them for anything other than destruction is UB. But even if I am wrong, we could just specify that std::remove is only allowed to copy (and create a well documented _move variant).

    I am perfectly willing to believe that there are real issues, but this paper fails to show their scope.

    • There’s a recently-added expandable note just before tweak #2 that explains some of this; have you seen it?

      • Marc says:

        Yes, I had seen it. As far as I understand, the expectation that the sequence ends in 0 5 is not guaranteed by C++03. And if I am wrong it would still be simpler to specify remove as not using moves but only copies.

        Now I had somehow missed the last paragraph about throwing during std::sort. I need to think about that, but I am not sure in what kind of state you expect to find your sequence if something (a move constructor? the predicate?) managed to throw during the sort. I can’t find any guarantee in the standard (the only occurrence of “throw” in section 25 is for qsort/bsearch).

        • Yes, I had seen it. As far as I understand, the expectation that the sequence ends in 0 5 is not guaranteed by C++03.

          Correct. However, once the elements are user-defined types, the library can’t use any operations on them other than those defined in the algorithm requirements, so the values would be, at worst, something within the invariant maintained by the existing copy assignment operator (let’s not discuss abominations like a mutating operator==). So afterwards, those are the only states you can observe, in C++03. In C++0x, you’ll also be able to observe the moved-from state, which might be different.

          And if I am wrong it would still be simpler to specify remove as not using moves but only copies.

          …and way, way slower.

          Now I had somehow missed the last paragraph about throwing during std::sort. I need to think about that, but I am not sure in what kind of state you expect to find your sequence if something (a move constructor? the predicate?) managed to throw during the sort. I can’t find any guarantee in the standard (the only occurrence of “throw” in section 25 is for qsort/bsearch).

          Don’t worry, the guarantees are in there; I saw to it personally ;-) . Again, in C++03, the only states available to elements of user-defined type are those that can be reached through the requirements of the algorithm.

  13. Andrzej Krzemienski says:

    Hi, I believe someone in comp.std.c++ suggested that the default move constructor should be implemented as a swap. This would keep the invariants. Is that not feasible? Or would it create a loop where default move calls swap, and default swap calls move?

    Regards, &rzej

    • It would keep the invariants, provided you had another object (in a good state) to swap into. Remember, construction makes new objects. So then, presumably, you need a default constructor… which may, or may not, exist and set up the required state.
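
A rough sketch of what “move as swap” would have to look like, assuming the class has a default constructor that establishes the invariant. The class `NonEmpty`, its invariant (the vector is never empty), and the helper `demo_swap_based_move` are all hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical class with an invariant: data_ is never empty.
class NonEmpty {
public:
    NonEmpty() : data_(1, 0) {}  // default state: a single 0, invariant holds

    explicit NonEmpty(std::vector<int> d) : data_(std::move(d)) {
        if (data_.empty()) data_.push_back(0);  // re-establish the invariant
    }

    // Move construction written as "default-construct, then swap".
    // This only works because a default constructor exists to build the good state.
    NonEmpty(NonEmpty&& other) : NonEmpty() { swap(other); }

    void swap(NonEmpty& other) noexcept { data_.swap(other.data_); }

    bool invariant_holds() const { return !data_.empty(); }
    int front() const { return data_.front(); }

private:
    std::vector<int> data_;
};

// Hypothetical helper, used only to exercise the sketch.
inline bool demo_swap_based_move() {
    NonEmpty a(std::vector<int>{7, 8});
    NonEmpty b(std::move(a));   // b takes a's state; a receives the default state
    assert(b.front() == 7);
    return a.invariant_holds() && a.front() == 0;
}
```

For a class without a default constructor, or one whose default constructor does not set up the required state, there is nothing valid to swap into, which is the objection raised above.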

      • Sean Hunt says:

        In my opinion, this isn’t fundamentally a problem with implicit move generation, it’s a weakening of the results from functions that move. We no longer guarantee a valid object will be left, but only that a moved-from object will be left. I personally believe it is fine to say that this may break an invariant in a class that is not expecting it. The only caveat is that if there is a user-declared destructor, this function will be invoked and may rely on an invariant, and thus we should not make any assumptions and so not generate a move constructor.

        In other words, I think we don’t need to maintain backwards-compatibility in pathological cases in favor of adding a useful language feature. We can easily-enough add specification to algorithms that move that if they can’t move, they copy, and then say “Oh, they changed std::remove so that it doesn’t guarantee valid state past the end of the new array by default. Here’s the two lines you need to make it work.” where the one line is deleting the move assignment operator and constructor.

        • In my opinion, this isn’t fundamentally a problem with implicit move generation, it’s a weakening of the results from functions that move. We no longer guarantee a valid object will be left, but only that a moved-from object will be left.

          The fundamental problem is that if you let the compiler write the move constructor, a moved-from object may not be a valid object.
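
To make that concrete, here is a small sketch of my own (not one of the paper’s examples) with a hypothetical class `Tracked` whose author intended `count == items.size()`. It assumes the final C++11 rules, under which this class receives a compiler-generated move constructor, and it relies on std::vector being empty after it has been move-constructed from.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class; the intended invariant is count == items.size().
struct Tracked {
    std::vector<int> items;
    std::size_t count;

    explicit Tracked(std::vector<int> v)
        : items(std::move(v)), count(items.size()) {}  // items is initialized before count

    bool invariant_holds() const { return count == items.size(); }
    // No user-declared copy, move, or destructor: the compiler generates the move constructor.
};

// Hypothetical helper: returns true if the moved-from object violates the invariant.
inline bool moved_from_breaks_invariant() {
    Tracked a(std::vector<int>{1, 2});
    assert(a.invariant_holds());

    Tracked b(std::move(a));      // memberwise move: items is emptied, count is merely copied
    assert(b.invariant_holds());

    return !a.invariant_holds();  // a.count is still 2, a.items.size() is 0
}
```

Nothing in `Tracked` mentions moving, yet `a` ends up in a state that none of its own member functions could have produced.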

          I personally believe it is fine to say that this may break an invariant in a class that is not expecting it. The only caveat is that if there is a user-declared destructor, this function will be invoked and may rely on an invariant, and thus we should not make any assumptions and so not generate a move constructor.

          That doesn’t account for algorithms (and container member functions that use move). Often the next operation on a moved-from object is assignment, not destruction.

          Also, I don’t understand people’s interest in user-declared destructors. Ultimately, the surest (though still imperfect) sign that the author intended an invariant that could be broken by a generated move operation is a user-declared constructor.

          In other words, I think we don’t need to maintain backwards-compatibility in pathological cases in favor of adding a useful language feature.

          Even if you hold that point-of-view (and not everyone does), the problem is that we have very little to go on in making a judgement about whether the cases that would be broken merit the label “pathological.” This close to finalization of the standard, we’ve only just now noticed this issue. Even if you think all the examples in the paper are “pathological” (I don’t), the chances are pretty good that there are non-pathological cases too.

          We can easily-enough add specification to algorithms that move that if they can’t move, they copy,

          That’s already in the specification.

          and then say “Oh, they changed std::remove so that it doesn’t guarantee valid state past the end of the new array by default. Here’s the two lines you need to make it work.” where the one line is deleting the move assignment operator and constructor.

          Some may be comfortable saying that upgrading your working C++03 code to C++0x can introduce “invalid states” (essentially, undefined behavior), but I am not.

    • Joe Gottman says:

      If we were to do that, then how would we define the default move-assignment operator? Not as a simple swap of the source and target objects, because the target object might own subobjects that need to be disposed of. Maybe default-construct a temporary object, swap it with the target, then swap the target with the source. Then after the move-assignment the source contains a default-constructed object, the target contains the state that was originally in the source, and the temporary object, which is about to be destructed, contains the state that was originally in the target.
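
The three-step scheme described here might be sketched as follows. The type `Holder` and the helper `demo_two_swap_assign` are hypothetical; the sketch assumes default construction produces a valid state and that `swap` cannot throw.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type; default construction yields a valid "empty" state.
struct Holder {
    std::string value;

    Holder() = default;
    explicit Holder(std::string v) : value(std::move(v)) {}

    void swap(Holder& other) noexcept { value.swap(other.value); }

    // Move assignment built from a default-constructed temporary and two swaps.
    Holder& operator=(Holder&& source) {
        Holder temp;         // 1. default-construct a temporary
        temp.swap(*this);    // 2. temp now owns the target's old state
        this->swap(source);  // 3. target owns the source's state; source holds the default state
        return *this;        // temp's destructor disposes of the target's old state
    }
};

// Hypothetical helper, used only to exercise the sketch.
inline bool demo_two_swap_assign() {
    Holder target("old");
    Holder source("new");
    target = std::move(source);
    assert(target.value == "new");
    return source.value.empty();  // source is left in the default-constructed state
}
```

Note that in this sketch self-move-assignment would leave the object in the default state, and the scheme still depends on a usable default constructor.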

  14. Sebastian says:

    I was the one who “proposed” a tweak that removes implicit moves in case any special member function is user-declared. But even then, your std::remove example would break. Good catch!

    In a related matter (still rvalue references) can you point out what bullet point in the draft handles the situation you have if you push_back a string literal on a vector of std::string? Which overload will the compiler pick? On one hand the types are not reference-related and there will be a conversion yielding a temporary string object. On the other hand the initializer (string literal) is an lvalue expression. It seems that situations like these should lead to the push_back(string&&) overload being picked. But that’s not what the current draft seems to dictate.

  15. Sebastian Redl says:

    There is another reason that these tweaks are unacceptable, besides that they don’t solve the problem in edge cases: they are extremely unintuitive! Having move constructors be auto-generated except when is not teachable, not usable, and would just look ridiculous. Make =default work for move constructors and move assignment and never auto-generate it, thus having a simple rule in the language that can be learned.

    C++ has far too many edge cases already; it’s one of the biggest problems of the language.
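
For readers unfamiliar with the syntax, an explicit opt-in along these lines might look like the sketch below. The type `Record` and the helper `demo_defaulted_move` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type whose author explicitly asks for memberwise move operations.
struct Record {
    std::string name;

    Record() = default;
    Record(const Record&) = default;
    Record& operator=(const Record&) = default;
    Record(Record&&) = default;             // explicit opt-in: memberwise move construction
    Record& operator=(Record&&) = default;  // explicit opt-in: memberwise move assignment
};

static_assert(std::is_move_constructible<Record>::value,
              "Record should be move constructible");

// Hypothetical helper, used only to exercise the sketch.
inline bool demo_defaulted_move() {
    Record a;
    a.name = "example";
    Record b(std::move(a));  // uses the defaulted move constructor
    return b.name == "example";
}
```

The point of the proposal is that the two `= default` lines are visible in the class definition, so the author has clearly accepted memberwise move.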

  16. MT-Wizard says:

    I see two problems in the samples you provided:

    1. A new implicit function can break invariants that a class is trying to maintain in all its other functions. This can be worked around in the way you said: suppress implicit move for classes with default dtors.

    2. POD types aren’t zeroed-out on move. All other samples fall under this category. If they were zeroed after move, no problems would appear (except in really weird examples).

    So IMHO there are still ways to keep implicit move in C++0x. Without it, move would be used only in standard library classes, as adding move ctors to all classes in a large application is far too much work.

    • Suppress implicit move for classes with default destructors? Now that’s an approach I didn’t think to explore! But it wouldn’t work even for the first example in this article. As for the opposite approach, it doesn’t work for any of the other examples. I thought the article demonstrated all of that pretty clearly, but if you can’t see it, I must have some explaining left to do… but I just don’t know what else I need to say.

      I don’t know what you mean about POD types being zeroed, but I can promise you, that doesn’t solve anything fundamentally. There’s no a priori reason to think that a zero value lies within any given class’s invariant.

  17. Maybe it would make sense to roll this together with the base_check functionality? That already changes the semantics of classes in a backwards incompatible way, and I think they are looking for a keyword, so it might make sense to activate both semantic changes with the same keyword.

    def_class {}; def_struct {};

    Could create “new style” classes and structs with both a default move constructor, and support for new override semantics that base_check provides.

    Another change that would be good to throw in would be to not make single-argument constructors implicit by default. Instead, make them explicit, and have an implicit keyword for cases where you want that behavior.

    Probably this is too big a change for this point in the standardization process though…
