C++ is hard, and the newer versions are even harder. This article deals with some of the hard parts of C++: rvalues, rvalue references (**&&**) and move semantics. I am going to reverse engineer (not a metaphor) these complex and correlated topics, so you can understand them completely in one shot.

First, let's examine the basics.

What is an rvalue?

Roughly speaking, an rvalue is a value that belongs on the right side of an equals sign.

Example:

    int var; // too much JavaScript recently :)
    var = 8; // OK! lvalue (yes, there is an lvalue) on the left
    8 = var; // ERROR! rvalue on the left
    (var + 1) = 8; // ERROR! rvalue on the left

Simple enough. Then let's look at a more subtle case, rvalues returned by functions:

    #include <string>
    #include <stdio.h>

    int g_var = 8;

    int& returnALvalue() {
      return g_var; // here we return an lvalue
    }

    int returnARvalue() {
      return g_var; // here we return an rvalue
    }

    int main() {
      printf("%d", returnALvalue()++); // g_var += 1;
      printf("%d", returnARvalue());
    }

Result:

    89

It is worth noting that returning an lvalue the way the example does (a reference to a global variable that the caller then modifies) is considered bad practice. So do not do that in real world programming.

Beyond theoretical level

Whether a value is an rvalue made a difference in real programming even before **&&** was invented.

For example, this line

    const int& var = 8;

compiles fine, while this:

    int& var = 8; // use an lvalue reference for an rvalue

generates the following error:

    rvalue.cc:24:6: error: non-const lvalue reference to type 'int' cannot bind to a
    temporary of type 'int'

The error message means that the compiler enforces a const reference for an rvalue.
A more interesting example:

    #include <stdio.h>
    #include <string>

    void print(const std::string& name) {
      printf("rvalue detected:%s\n", name.c_str());
    }

    void print(std::string& name) {
      printf("lvalue detected:%s\n", name.c_str());
    }

    int main() {
      std::string name = "lvalue";
      std::string rvalu = "rvalu";
      print(name); // compiler can detect the right function for lvalue
      print(rvalu + "e"); // likewise for rvalue
    }

Result:

    lvalue detected:lvalue
    rvalue detected:rvalue

The difference is significant enough for the compiler to choose between overloaded functions.

So is an rvalue a constant value?

Not exactly. And this is where **&&** (rvalue reference) comes in.

Example:

    #include <stdio.h>
    #include <string>

    void print(const std::string& name) {
      printf("const value detected:%s\n", name.c_str());
    }

    void print(std::string& name) {
      printf("lvalue detected:%s\n", name.c_str());
    }

    void print(std::string&& name) {
      printf("rvalue detected:%s\n", name.c_str());
    }

    int main() {
      std::string name = "lvalue";
      const std::string cname = "cvalue";
      std::string rvalu = "rvalu";

      print(name);
      print(cname);
      print(rvalu + "e");
    }

Result:

    lvalue detected:lvalue
    const value detected:cvalue
    rvalue detected:rvalue

If a function is overloaded for rvalues, an rvalue argument chooses that more specific version over the version that takes a const reference parameter, which is compatible with both. Thus, **&&** can further distinguish an rvalue from a const value.

Below I summarize the compatibility of the overloaded function versions with the different kinds of argument in the default setting. In short: a const reference parameter (const std::string&) accepts lvalues, const values and rvalues alike; a non-const lvalue reference (std::string&) accepts only non-const lvalues; and an rvalue reference (std::string&&) accepts only rvalues. You can verify the result by selectively commenting out lines in the example above.

It sounds cool to further differentiate an rvalue from a constant value, as they are indeed not exactly the same. But what is the practical value?

What problem does && solve exactly?

The problem is the unnecessary deep copy when the argument is an rvalue.

To be more specific.
The **&&** notation is provided to specify an rvalue, which can be used to avoid the deep copy when the rvalue 1) is passed as an argument of either a constructor or an assignment operator, and 2) belongs to a class that contains a pointer (or pointers) referring to a dynamically allocated resource (memory).

It can be more specific with examples:

    #include <stdio.h>
    #include <string>
    #include <algorithm>

    using namespace std;

    class ResourceOwner {
    public:
      ResourceOwner(const char res[]) {
        theResource = new string(res);
      }

      ResourceOwner(const ResourceOwner& other) {
        printf("copy %s\n", other.theResource->c_str());
        theResource = new string(other.theResource->c_str());
      }

      ResourceOwner& operator=(const ResourceOwner& other) {
        ResourceOwner tmp(other);
        swap(theResource, tmp.theResource);
        printf("assign %s\n", other.theResource->c_str());
        return *this;
      }

      ~ResourceOwner() {
        if (theResource) {
          printf("destructor %s\n", theResource->c_str());
          delete theResource;
        }
      }

    private:
      string* theResource;
    };

    void testCopy() { // case 1
      printf("=====start testCopy()=====\n");
      ResourceOwner res1("res1");
      ResourceOwner res2 = res1; // copy res1
      printf("=====destructors for stack vars, ignore=====\n");
    }

    void testAssign() { // case 2
      printf("=====start testAssign()=====\n");
      ResourceOwner res1("res1");
      ResourceOwner res2("res2");
      res2 = res1; // copy res1, assign res1, destructor res2
      printf("=====destructors for stack vars, ignore=====\n");
    }

    void testRValue() { // case 3
      printf("=====start testRValue()=====\n");
      ResourceOwner res2("res2");
      res2 = ResourceOwner("res1"); // copy res1, assign res1, destructor res2, destructor res1
      printf("=====destructors for stack vars, ignore=====\n");
    }

    int main() {
      testCopy();
      testAssign();
      testRValue();
    }

Result:

    =====start testCopy()=====
    copy res1
    =====destructors for stack vars, ignore=====
    destructor res1
    destructor res1
    =====start testAssign()=====
    copy res1
    assign res1
    destructor res2
    =====destructors for stack vars, ignore=====
    destructor res1
    destructor res1
    =====start testRValue()=====
    copy res1
    assign res1
    destructor res2
    destructor res1
    =====destructors for stack vars, ignore=====
    destructor res1
The results are fine for the first two test cases, i.e., testCopy() and testAssign(), in which the resource in res1 is copied for res2. Copying is reasonable there because the two entities each need their own unshared resource (a string).

However, in the third case, the (deep) copy of the resource in res1 is superfluous, because the anonymous rvalue (returned by ResourceOwner("res1")) is destructed right after the assignment and so does not need the resource anymore:

    res2 = ResourceOwner("res1");

Please note that "destructor res1" is printed right after this line, before the point where the stack variables are destructed.

I think it is a good chance to repeat the problem statement: the **&&** notation is provided to specify an rvalue, which can be used to avoid the deep copy when the rvalue 1) is passed as an argument of either a constructor or an assignment operator, and 2) belongs to a class that contains a pointer (or pointers) referring to a dynamically allocated resource (memory).

If copying a resource that is about to disappear is not optimal, what is the right operation then? The answer is:

Move

The idea is pretty straightforward: if the argument is an rvalue, we do not need to copy. Rather, we can simply "move" the resource (that is, the memory the rvalue points to). Now let's overload the assignment operator using the new technique:

    ResourceOwner& operator=(ResourceOwner&& other) {
      if (this != &other) {
        delete theResource; // release the resource we currently own
        theResource = other.theResource;
        other.theResource = NULL;
      }
      return *this;
    }

This new assignment operator is called a move assignment operator. A move constructor can be programmed in a similar way.

A good way of understanding this is: when you sell your old property and move to a new house, you do not have to toss all the furniture and buy it again, as we did in case 3, right? Rather, you can simply move the furniture to the new home.

All good.

What is std::move?
Besides the move assignment operator and move constructor discussed above, there is one last missing piece in this puzzle, std::move.

Again, we look at the problem first: 1) we know a variable is in fact an rvalue, while 2) the compiler does not, so the right version of the overloaded functions cannot be called.

A common case is when we add another layer of resource owner, ResourceHolder. The relation of the three entities is given below:

    holder
    |
    |----->owner
           |
           |----->resource

(N.b., in the following example, I complete the implementation of ResourceOwner's move constructor as well.)

Example:

    #include <stdio.h>
    #include <string>
    #include <algorithm>

    using namespace std;

    class ResourceOwner {
    public:
      ResourceOwner(const char res[]) {
        theResource = new string(res);
      }

      ResourceOwner(const ResourceOwner& other) {
        printf("copy %s\n", other.theResource->c_str());
        theResource = new string(other.theResource->c_str());
      }

      // new: move constructor
      ResourceOwner(ResourceOwner&& other) {
        printf("move cons %s\n", other.theResource->c_str());
        theResource = other.theResource;
        other.theResource = NULL;
      }

      ResourceOwner& operator=(const ResourceOwner& other) {
        ResourceOwner tmp(other);
        swap(theResource, tmp.theResource);
        printf("assign %s\n", other.theResource->c_str());
        return *this;
      }

      // new: move assignment operator
      ResourceOwner& operator=(ResourceOwner&& other) {
        printf("move assign %s\n", other.theResource->c_str());
        if (this != &other) {
          delete theResource; // release the resource we currently own
          theResource = other.theResource;
          other.theResource = NULL;
        }
        return *this;
      }

      ~ResourceOwner() {
        if (theResource) {
          printf("destructor %s\n", theResource->c_str());
          delete theResource;
        }
      }

    private:
      string* theResource;
    };

    class ResourceHolder {
      ……
      ResourceHolder& operator=(ResourceHolder&& other) {
        printf("move assign (holder)\n");
        resOwner = other.resOwner;
        return *this;
      }
      ……
    private:
      ResourceOwner resOwner;
    };

In ResourceHolder's move assignment operator, we want to call ResourceOwner's move assignment operator, since "a non-pointer member of an rvalue should be an rvalue too". However, when we simply code resOwner = other.resOwner, what gets invoked is actually ResourceOwner's normal assignment operator, which, again, incurs the extra copy.
It's a good chance to repeat the problem statement again: 1) we know a variable is in fact an rvalue, while 2) the compiler does not, so the right version of the overloaded functions cannot be called.

As a solution we use std::move (declared in <utility>) to cast the variable to an rvalue, so the right version of ResourceOwner's assignment operator can be called.

    ResourceHolder& operator=(ResourceHolder&& other) {
      printf("move assign (holder)\n");
      resOwner = std::move(other.resOwner);
      return *this;
    }

What is std::move exactly?

We know that a type cast is not simply a compiler placebo telling the compiler "I know what I am doing". A cast between numeric types can actually generate instructions that mov a value to a bigger or smaller register (e.g., %eax -> %cl) to conduct the "cast".

So what does std::move do exactly behind the scenes? I did not know myself when I was writing this paragraph, so let's find out together.
First we modify main a bit (I tried to make the style consistent).

Example:

    int main() {
      ResourceOwner res("res1");
      asm("nop"); // remember me
      ResourceOwner && rvalue = std::move(res);
      asm("nop"); // remember me
    }

Compile it, and disassemble the object file using

    clang++ -g -c -std=c++11 -stdlib=libc++ -Weverything move.cc
    gobjdump -d -D move.o

Result:

    0000000000000000 <_main>:
       0: 55                      push   %rbp
       1: 48 89 e5                mov    %rsp,%rbp
       4: 48 83 ec 20             sub    $0x20,%rsp
       8: 48 8d 7d f0             lea    -0x10(%rbp),%rdi
       c: 48 8d 35 41 03 00 00    lea    0x341(%rip),%rsi   # 354 <GCC_except_table5+0x18>
      13: e8 00 00 00 00          callq  18 <_main+0x18>
      18: 90                      nop    // remember me
      19: 48 8d 75 f0             lea    -0x10(%rbp),%rsi
      1d: 48 89 75 f8             mov    %rsi,-0x8(%rbp)
      21: 48 8b 75 f8             mov    -0x8(%rbp),%rsi
      25: 48 89 75 e8             mov    %rsi,-0x18(%rbp)
      29: 90                      nop    // remember me
      2a: 48 8d 7d f0             lea    -0x10(%rbp),%rdi
      2e: e8 00 00 00 00          callq  33 <_main+0x33>
      33: 31 c0                   xor    %eax,%eax
      35: 48 83 c4 20             add    $0x20,%rsp
      39: 5d                      pop    %rbp
      3a: c3                      retq
      3b: 0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

I briefly explain what happens between the two nop instructions:

1. assign the address of one stack variable (presumably ResourceOwner res) to %rsi
2. assign the value of %rsi to another stack variable (this one is anonymous)
3. assign the value of the anonymous stack variable back to %rsi (what?)
4. assign the value of %rsi to yet another stack variable (presumably ResourceOwner && rvalue)
5. so the whole operation can be summarized as "assigning the address of ResourceOwner res to ResourceOwner && rvalue", which is the same as a normal reference assignment.

If we turn on optimization (-O1) for the compiler, all those dummy instructions will be gone.

    clang++ -g -c -O1 -std=c++11 -stdlib=libc++ -Weverything move.cc
    gobjdump -d -D move.o

Moreover, if we change the critical line to a normal reference assignment:

    ResourceOwner & rvalue = res;

then, except for some minor differences in variable offsets, the generated assembly code is mostly identical, as we assumed in point 5 above.
The test shows that std::move on its own is pure syntactic sugar: it only tells the compiler which overload to pick, and the machine does not care at all. Whatever moving actually happens is done by the move constructor or move assignment operator that the cast selects.

Thanks for coming along and hope to see you the next time.