EDIT: A lot of readers seem more interested in my implementation of the String class than the purpose of this post, which is to show a somewhat remarkable fact about how small string optimization interacts with move operations. Please don't post more "This implementation is not the most efficient". You are missing the point. The example implementation of my string class here is neither complete, efficient, nor representative of STL implementations. It is only used to illustrate the idea behind small string optimization. I don't even know if the code actually compiles.
Typically, std::string is implemented using small string optimization, a technique by which small strings are stored directly inside the object, rather than dynamically allocated on the heap.
Before I start discussing the issue with small string optimization, let me give you a quick introduction to how it works. Let's say we want to implement a very simplified string class ourselves. In each String object, we simply keep a pointer to the beginning and to the terminating null character of the char array (std::string also needs to keep a pointer to an allocator, as you are allowed to specify your own memory management).
We can implement these constructors in a straightforward manner, using low level programming for efficiency:
(The implementation will be slightly faster if you replace new/delete with malloc/free,as there is no need to call any constructors/destructors for char arrays. In theory, compilers can optimize this away; in practice, they don't.) See this post for a discussion on this issue (unrelated to small string optimization and move operations).
So far so good, but now you realize that you can do something clever: since we need to a char* pointers in every string, why not use that space to store small strings of length up to sizeof(char*)? That way, we won't need to call new/delete for small strings. In fact, since strings are in most applications quite short, why not store a little more while we're at it?
This is known as small string optimization. One way we can efficiently and elegantly implement this is by using tagged unions. Let's try it out:
The constructor for const char* is not that hard:
Typically, std::string is implemented using small string optimization, a technique by which small strings are stored directly inside the object, rather than dynamically allocated on the heap.
Before I start discussing the issue with small string optimization, let me give you a quick introduction to how it works. Let's say we want to implement a very simplified string class ourselves. In each String object, we simply keep a pointer to the beginning and to the terminating null character of the char array (std::string also needs to keep a pointer to an allocator, as you are allowed to specify your own memory management).
class String { public: String(const char* p); String(const String& s); String(String&& s); ~String(); unsigned length() const { return _size; } char* begin() { return _beg; } char* end() { return begin() + _size; } // ... private: char* _beg; // first character in array unsigned _size; // excluding terminating null char };
We can implement these constructors in a straightforward manner, using low level programming for efficiency:
inline String::String(const char* p) : _size(strlen(p)) { _beg(new char[_size+1]); memcpy(_beg,p,_size+1); } inline String::String(const String& s) : _size(s.length()) { _beg = new char[_size+1]; memcpy(_beg,s._beg,len+1); } inline String::String(String&& s) : _size(s.length()) { _beg = s._beg; s._beg = nullptr; // zero out } inline String::~String() { delete [] _beg; }
(The implementation will be slightly faster if you replace new/delete with malloc/free,
So far so good, but now you realize that you can do something clever: since we need to a char* pointers in every string, why not use that space to store small strings of length up to sizeof(char*)? That way, we won't need to call new/delete for small strings. In fact, since strings are in most applications quite short, why not store a little more while we're at it?
This is known as small string optimization. One way we can efficiently and elegantly implement this is by using tagged unions. Let's try it out:
class SString { public: SString(const char* p); SString(const String& s); SString(String&& s); ~SString(); unsigned length() const { return _size; } char* begin() { return _size < 16 ? str : _beg; } char* end() { return begin() + _size; } // ... private: unsigned _size; union { char* _beg; char str[16]; }; };
Note how SString::begin() was defined to either return _beg or str, depending on the size of the string.
The constructor for const char* is not that hard:
inline SString::SString(const char* p) : _size(strlen(p)) { if (_size < 16) { // Small string optimization: store in static str memcpy(str,p,_size+1); } else { // Dynamically allocate string (like before) _beg(new char[_size+1]); memcpy(_beg,p,_size+1); } }
The copy constructor is pretty much the same:
inline SString::SString(const SString& s) : _size(s.length()) { if (_size < 16) { // Small string optimization: store in static str memcpy(str,s.str,_size+1); } else { // Dynamically allocate string (like before) _beg(new char[_size+1]); memcpy(_beg,s._beg,_size+1); } }
Now comes the interesting part where you'll see the trade off when using small string optimization. Here is the move constructor:
inline SString::SString(SString&& s) : _size(s.length()) { if (_size < 16) { // ??? } else { _beg = s._beg; // steal resource from s s._beg = nullptr; // prevent s from deleting on destruction } }
So when we have a dynamically allocated array, we can steal the resource. When we've used the small string optimization, on the otherhand, there is no way to steal the resource, as it is statically allocated. We must hence copy it:
if (_size < 16) { memcpy(str,s.str,_size+1); } else {
What may be surprising to a string user, is that the performance degradation occurs only for small strings.
This effect can be seen on MSVC 2010, which uses a small string optimization of 16 characters for std::string. We take a command line string, then move the string between two string objects 10 million times.
This effect can be seen on MSVC 2010, which uses a small string optimization of 16 characters for std::string. We take a command line string, then move the string between two string objects 10 million times.
Here's the result:
With the a string of 16 and 17 characters, including the null terminating char, the execution time in seconds is:
Length Time
16 0.171
17 0.06
In other words, the small string optimization, which is now the small string deterioration, is about 3 times slower than copying pointers. This leads to the surprising conclusion that when moving strings, it is actually faster to move larger strings.
Note that most of the time, you don't just move around strings, but allocate them, deallocate them, copy them, etc. In all those cases, of course, small string optimization is very useful.