std::string usage in cocos2d-x 3.0

Hi!

Why is const char* replaced with std::string in all method arguments of cocos2d-x 3.0?
I understand that receiving a std::string can sometimes be useful, but I suppose that for most common usages performance is going to be a lot worse because of unnecessary extra memory allocations.

Thanks!

Wouldn’t it be better to use a class like this?
The std::string’s data is allocated only when it’s necessary:

namespace cocos2d
{

class String
{

public:
    explicit String(const char* charArray) :
        mString(),
        mCharArray(charArray)
    {
    }

    explicit String(const std::string& string) :
        mString(string),
        mCharArray(NULL)
    {
    }

    const char* getCharArray() const
    {
        if(mCharArray != NULL)
        {
            return mCharArray;
        }

        return mString.c_str();
    }

    std::string& getStdString()
    {
        if(mCharArray != NULL)
        {
            mString.append(mCharArray);
            mCharArray = NULL;
        }

        return mString;
    }

private:
    std::string mString;
    const char* mCharArray;
};

} // namespace cocos2d
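To make the intended behaviour concrete, here is a minimal usage sketch of the same idea. `LazyString`, `ownsCopy()` and `readsWithoutCopy()` are hypothetical names introduced only for this illustration; they are not part of cocos2d-x. Note the assumption that the borrowed pointer outlives the object (true for string literals, not for temporaries).

```cpp
#include <string>

// Hypothetical condensed version of the class above, for illustration only.
class LazyString
{
public:
    // Borrows the pointer; the caller must keep the text alive.
    explicit LazyString(const char* charArray) :
        mString(),
        mCharArray(charArray)
    {
    }

    // Returns the borrowed pointer while no copy has been made.
    const char* getCharArray() const
    {
        return mCharArray != nullptr ? mCharArray : mString.c_str();
    }

    // Copies the borrowed text into the std::string on first use.
    std::string& getStdString()
    {
        if(mCharArray != nullptr)
        {
            mString.assign(mCharArray);
            mCharArray = nullptr;
        }
        return mString;
    }

    // True once the text has been copied into the owned std::string.
    bool ownsCopy() const
    {
        return mCharArray == nullptr;
    }

private:
    std::string mString;
    const char* mCharArray;
};

// Usage sketch: returns true if reading the text did not force a copy.
bool readsWithoutCopy(const char* literal)
{
    LazyString text(literal);
    return text.getCharArray() == literal && !text.ownsCopy();
}
```

Read-only callers never trigger the copy; only the first call to `getStdString()` does.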

@Chano:
It’s not a big deal due to move semantics, perfect forwarding, copy elision and small string optimization.

Your class provides yet another layer of indirection and makes it harder for the optimizer to inline calls, you have not implemented move-friendly methods. And the memory usage is going to increase.
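As a rough illustration of the move-semantics point, the sketch below passes a string by value and moves it into a member. `NamedThing` and `bufferWasReused()` are hypothetical names, not cocos2d-x API. On the common standard library implementations, a string too long for the small string optimization keeps its heap buffer when moved, so no new allocation is needed for the member.

```cpp
#include <string>
#include <utility>

// Hypothetical class, not part of cocos2d-x: takes the name by value
// (a "sink" parameter) and moves it into the member.
class NamedThing
{
public:
    explicit NamedThing(std::string name) :
        mName(std::move(name))   // takes over the parameter's buffer
    {
    }

    const std::string& getName() const
    {
        return mName;
    }

private:
    std::string mName;
};

// Returns true if the character buffer of `source` ended up inside the
// object, i.e. the text was moved rather than copied. This is only
// expected for strings longer than the small-string buffer.
bool bufferWasReused(std::string source)
{
    const char* before = source.data();
    NamedThing thing(std::move(source));
    return thing.getName().data() == before;
}
```

This only helps when the caller already has a std::string it can give away; a call with a string literal still has to build the std::string first.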

@kuhar wrote:

It’s not a big deal due to move semantics, perfect forwarding, copy elision and small string optimization.

Really? For example, in this common (I think) code there are 1000 unnecessary memory allocations:

for(int index = 0; index < 1000; index++)
{
    Sprite* sprite = Sprite::createWithSpriteFrameName("whatever");
}
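Whether such a loop really allocates depends on the string length and on the standard library's small string optimization. One way to check is to count allocations with a custom allocator, as in the sketch below. `CountingAllocator`, `CountedString`, `takesString()` and `allocationsFor()` are hypothetical names for this illustration, and `takesString()` is only a stand-in for an API that takes a string by const reference.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Number of heap allocations made through CountingAllocator.
static std::size_t g_allocations = 0;

// Minimal C++11 allocator that counts every allocate() call.
template <typename T>
struct CountingAllocator
{
    using value_type = T;

    CountingAllocator() = default;

    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n)
    {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }

    void deallocate(T* p, std::size_t n)
    {
        std::allocator<T>().deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return true;
}

template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&)
{
    return false;
}

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Stand-in for an API that takes its text by const reference.
std::size_t takesString(const CountedString& text)
{
    return text.size();
}

// Counts the allocations caused by `calls` invocations with a C string.
// Each call builds a temporary CountedString from the pointer.
std::size_t allocationsFor(const char* text, int calls)
{
    g_allocations = 0;
    for(int index = 0; index < calls; index++)
    {
        takesString(text);
    }
    return g_allocations;
}
```

With a 40-character text every call has to allocate; with a short text such as "whatever" the count depends on the implementation and may be zero.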

@kuhar wrote:

Your class provides yet another layer of indirection and makes it harder for the optimizer to inline calls, you have not implemented move-friendly methods.

I think the example class is too short and simple to have problems with inlining.

What I posted is a quick example to get the idea, so some move-friendly methods would be great.
Anyway, the example class already has move semantics, because the move constructor and move assignment operator are generated automatically:
http://stackoverflow.com/questions/8283589/are-move-constructors-produced-automatically

@kuhar wrote:

And the memory usage is going to increase.

The combined size of an empty std::string and a const char* pointer is going to be lower than the size of a non-empty std::string, and the heap allocation (with the memory fragmentation it causes) is avoided until it’s necessary.

The point is: how often do cocos2d-x classes manipulate the text stored in a std::string?
I think that if the answer is “not much”, a solution like my example class is going to perform better.

@Chano wrote:

@kuhar wrote:

It’s not a big deal due to move semantics, perfect forwarding, copy elision and small string optimization.

Really? For example, in this common (I think) code there are 1000 unnecessary memory allocations:

for(int index = 0; index < 1000; index++)
{
    Sprite* sprite = Sprite::createWithSpriteFrameName("whatever");
}

I have a suggestion for you: write such a benchmark and publish the results here. Please remember to test it on both x86 and ARM. I’m really curious about the real difference between these two solutions.
Otherwise no one is going to take your concerns seriously.

edit:

Anyway, the example class already has move semantics, because the move constructor and move assignment operator are generated automatically

Which is not supported by Visual Studio (VC 12).

Well, I have done a test :wink:

I have created two classes, StdStringObject (which stores a std::string) and TestStringObject (which stores a TestString).
I have let the objects leak to measure used memory.

Currently I don’t have a working cocos2d-x environment, so I have done the test in a Qt one (sorry).
So on a Mac Mini with a 2.3 GHz Intel Core i5, I get the following results:

StdStringObject: 173ms, 74.0MB
TestStringObject: 65ms, 27.5MB

That’s a performance boost of 170%!

#include <string>
#include <utility>
#include <QDebug>
#include <QElapsedTimer>

class TestString
{

public:
    TestString(const char* charArray) :
        mString(),
        mCharArray(charArray)
    {
    }

    TestString(const std::string& string) :
        mString(string),
        mCharArray(NULL)
    {
    }

    TestString(std::string&& string) :
        mString(std::move(string)),
        mCharArray(NULL)
    {
    }

    TestString(const TestString& other) :
        mString(other.mString),
        mCharArray(other.mCharArray)
    {
    }

    TestString(TestString&& other) :
        mString(std::move(other.mString)),
        mCharArray(other.mCharArray)
    {
    }

    TestString& operator=(const TestString& other)
    {
        if(this != &other)
        {
            mString = other.mString;
            mCharArray = other.mCharArray;
        }

        return *this;
    }

    TestString& operator=(TestString&& other)
    {
        if(this != &other)
        {
            mString = std::move(other.mString);
            mCharArray = other.mCharArray;
        }

        return *this;
    }

    const char* getCharArray() const
    {
        if(mCharArray != NULL)
        {
            return mCharArray;
        }

        return mString.c_str();
    }

    std::string& getStdString()
    {
        if(mCharArray != NULL)
        {
            mString.append(mCharArray);
            mCharArray = NULL;
        }

        return mString;
    }

private:
    std::string mString;
    const char* mCharArray;
};


class StdStringObject
{

public:
    explicit StdStringObject(const std::string& stdString) :
        mStdString(stdString)
    {
    }

private:
    std::string mStdString;
};


class TestStringObject
{

public:
    explicit TestStringObject(const TestString& testString) :
        mTestString(testString)
    {
    }

private:
    TestString mTestString;
};


int main(int argc, char *argv[])
{
    QElapsedTimer timer;
    timer.start();

    for(int index = 0; index < 1000000; index++)
    {
        new StdStringObject("whatever");
        //new TestStringObject("whatever");
    }

    qDebug() << timer.elapsed();

    return 0;
}

That test is very simple: it does not randomize the input strings, and in particular the strings’ lengths are always the same. The strings are also never modified.
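A randomized input set could be prepared before the timed loop, for example along these lines. This is only a sketch; `makeRandomStrings()` and its parameters are hypothetical names, and the fixed seed is an assumption so that runs stay comparable.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Builds `count` strings of lowercase letters whose lengths are uniformly
// distributed in [minLength, maxLength]. The same seed gives the same set.
std::vector<std::string> makeRandomStrings(std::size_t count,
                                           std::size_t minLength,
                                           std::size_t maxLength,
                                           unsigned seed)
{
    std::mt19937 generator(seed);
    std::uniform_int_distribution<std::size_t> lengthDist(minLength, maxLength);
    std::uniform_int_distribution<int> charDist('a', 'z');

    std::vector<std::string> result;
    result.reserve(count);
    for(std::size_t index = 0; index < count; index++)
    {
        std::string text(lengthDist(generator), ' ');
        for(char& character : text)
        {
            character = static_cast<char>(charDist(generator));
        }
        result.push_back(std::move(text));
    }
    return result;
}
```

The benchmark loop would then pass `strings[index].c_str()` to each object, so that the generation cost stays outside the measured time.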

What compiler did you use? gcc with -O3?

And the thing that might be worth trying IMHO would be to emulate the old COW (copy-on-write) mechanism: have separate constructors for a C string and a std::string, and copy the actual string only upon the first modification. However, I’m not sure about detecting a modification of the original string from outside of this string wrapper… What do you think of this?
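One possible shape of such an emulation is sketched below, using a shared buffer that is duplicated on the first write. `CowString` and its members are hypothetical names. The sketch sidesteps the detection problem raised above by copying the text once at construction, so it still allocates there; it also assumes single-threaded use, because `use_count()` is not a reliable test when other threads may copy the object at the same time.

```cpp
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper: copies of a CowString share one
// buffer until one of them is modified.
class CowString
{
public:
    explicit CowString(const char* text) :
        mData(std::make_shared<std::string>(text))
    {
    }

    explicit CowString(const std::string& text) :
        mData(std::make_shared<std::string>(text))
    {
    }

    // Read access never copies.
    const std::string& get() const
    {
        return *mData;
    }

    // Write access: detach from the other owners before modifying.
    std::string& modify()
    {
        if(mData.use_count() > 1)
        {
            mData = std::make_shared<std::string>(*mData);
        }
        return *mData;
    }

    // True if both objects currently point at the same buffer.
    bool sharesWith(const CowString& other) const
    {
        return mData == other.mData;
    }

private:
    std::shared_ptr<std::string> mData;
};
```

Copying a `CowString` costs a reference count increment; the text itself is duplicated only in `modify()`.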

If the string argument is going to be modified often, it’s better to use std::string. But I don’t think that cocos2d-x needs to do that much.

String randomization is not going to change the fact that with each new std::string an extra memory allocation is going to occur.

In fact, the test is favorable to std::string since the word “whatever” has few characters. If you run the test with, for example, a file path or a URL, the performance degradation is much worse.

The C++11 standard doesn’t allow std::string implementations to have copy on write anymore:
http://stackoverflow.com/questions/12199710/legality-of-cow-stdstring-implementation-in-c11

I have run the test with clang. I’ll run it with Visual Studio and gcc to see what happens.

@Chano wrote:

String randomization is not going to change the fact that with each new std::string an extra memory allocation is going to occur.

But the small string optimization might not kick in with longer strings.

The C++11 standard doesn’t allow std::string implementations to have copy on write anymore:
http://stackoverflow.com/questions/12199710/legality-of-cow-stdstring-implementation-in-c11

I suggested emulating it, being aware of the Standard’s position on COW.

With gcc and Visual Studio the performance difference is much lower, but it’s still there.
I have also run the test with a 40-char string:

Visual Studio 2012 Update 4 (release mode):

            StdStringObject  TestStringObject  Difference
8 chars     74ms             60ms              23%
40 chars    217ms            60ms              261%

gcc 4.8 (-O2):

            StdStringObject  TestStringObject  Difference
8 chars     122ms            112ms             9%
40 chars    140ms            112ms             25%

Well, I have done a more complete test, with a new object which only stores the char array, to factor out the cost of allocating the test objects themselves.

The results (times in ms):

clang 4.1 (-O2)

                    10char  20char  30char  40char
CharArrayObject     59      59      59      59
StdStringObject     82      82      155     166
TestStringObject    72      72      72      72
Performance inc.    77%     77%     638%    723%


gcc 4.8 (-O2)

                    10char  20char  30char  40char
CharArrayObject     43      43      43      43
StdStringObject     199     201     204     212
TestStringObject    177     177     177     177
Performance inc.    16%     18%     20%     26%


vs2012 (release)

                    10char  20char  30char  40char
CharArrayObject     43      43      43      43
StdStringObject     66      123     124     131
TestStringObject    59      59      59      59
Performance inc.    44%     400%    400%    450%
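The “Performance inc.” rows are not explained in the post, but they appear consistent with subtracting the CharArrayObject time as a baseline from both measurements before comparing them. The sketch below encodes that assumed formula; the function names are hypothetical. It reproduces the listed figures for all entries except vs2012 at 30 chars, where it gives 406 instead of the listed 400.

```cpp
#include <cmath>

// Assumed formula: how much more the std::string version costs than the
// TestString version, after removing the baseline cost shared by both.
double overheadPercent(double baselineMs, double stdStringMs, double testStringMs)
{
    const double stdStringCost = stdStringMs - baselineMs;
    const double testStringCost = testStringMs - baselineMs;
    return (stdStringCost / testStringCost - 1.0) * 100.0;
}

// Same value rounded to the nearest integer, as shown in the tables.
long roundedOverheadPercent(double baselineMs, double stdStringMs, double testStringMs)
{
    return std::lround(overheadPercent(baselineMs, stdStringMs, testStringMs));
}
```

For example, the clang 10-char column is (82 - 59) / (72 - 59) = 1.77, listed as 77%.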

Conclusions:

  • Performance increase using TestString is almost always 50% with any text size.
  • With clang, performance degradation using std::string is just too bad (600%) when the text has more than 20 chars.
  • With vs2012, performance degradation using std::string is just too bad (400%) when the text has more than 10 chars.
  • With gcc, even storing an empty std::string is terrible (storing just the char array gives a 311% performance boost).

I think that my TestString class should be good enough for clang and vs2012, but for gcc the best option would be to avoid std::string and use a custom string that doesn’t allocate memory if it’s not needed.

Any comments from the cocos2d-x team?

class CharArrayObject
{

public:
    explicit CharArrayObject(const char* charArray) :
        mCharArray(charArray)
    {
    }

private:
    const char* mCharArray;
};


class StdStringObject
{

public:
    explicit StdStringObject(const std::string& stdString) :
        mStdString(stdString)
    {
    }

    explicit StdStringObject(std::string&& stdString) :
        mStdString(std::move(stdString))
    {
    }

private:
    std::string mStdString;
};


class TestStringObject
{

public:
    explicit TestStringObject(const TestString& testString) :
        mTestString(testString)
    {
    }

    explicit TestStringObject(TestString&& testString) :
        mTestString(std::move(testString))
    {
    }

private:
    TestString mTestString;
};


int main(int, char**)
{
    QElapsedTimer timer;
    timer.start();

    const char* charArray = "0123456789";
    for(int index = 0; index < 1000000; index++)
    {
        new CharArrayObject(charArray);
        //new StdStringObject(charArray);
        //new TestStringObject(charArray);
    }

    qDebug() << timer.elapsed();
    return 0;
}
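For readers without a Qt environment, roughly the same measurement can be taken with std::chrono. This is a sketch, not the code used for the results above: `StdStringHolder` and `timeConstructionMs()` are hypothetical names, the objects are kept in a vector instead of being leaked, and the vector bookkeeping is included in the measured time.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for StdStringObject.
class StdStringHolder
{
public:
    explicit StdStringHolder(const std::string& text) :
        mText(text)
    {
    }

private:
    std::string mText;
};

// Constructs `count` objects of type Object from `text` and returns the
// elapsed time in milliseconds. Objects stay alive until the function
// returns, so their destruction is not part of the measurement.
template <typename Object>
long long timeConstructionMs(const char* text, int count)
{
    std::vector<std::unique_ptr<Object>> objects;
    objects.reserve(static_cast<std::size_t>(count));

    const auto start = std::chrono::steady_clock::now();
    for(int index = 0; index < count; index++)
    {
        objects.push_back(std::unique_ptr<Object>(new Object(text)));
    }
    const auto end = std::chrono::steady_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

A call such as `timeConstructionMs<StdStringHolder>("0123456789", 1000000)` corresponds to one cell of the tables; any class constructible from a `const char*` can be substituted.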

Anyone?

Did you enable c++11?

Yes!

And besides, as far as I know, move semantics are not available if C++11 is disabled…

@Chano:

Your idea actually reminds me of the (rejected) C++14 proposal for std::string_view: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3609.html

If I have understood it right, string_view doesn’t own the string; it’s just a reference.
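The non-owning idea can be sketched in C++11 as a pointer plus a length. `StringRef` and `countSpaces()` are hypothetical names for this illustration and do not reproduce the proposal's interface; std::string_view itself was later standardized in C++17.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view: refers to characters stored elsewhere and
// never allocates. The referenced text must outlive the view.
struct StringRef
{
    const char* data;
    std::size_t size;

    StringRef(const char* text) :
        data(text),
        size(std::strlen(text))
    {
    }

    StringRef(const std::string& text) :
        data(text.data()),
        size(text.size())
    {
    }

    // Makes an owning copy only when the caller asks for one.
    std::string toString() const
    {
        return std::string(data, size);
    }
};

// Example of a function that accepts both C strings and std::string
// without copying either.
std::size_t countSpaces(StringRef text)
{
    std::size_t spaces = 0;
    for(std::size_t index = 0; index < text.size; index++)
    {
        if(text.data[index] == ' ')
        {
            spaces++;
        }
    }
    return spaces;
}
```

Unlike the TestString class, such a view cannot be stored safely in a long-lived object unless the caller guarantees the lifetime of the text.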

What I propose is a string implementation that doesn’t allocate memory if it’s not needed.
The TestString class is not enough for gcc.

cocos2d-x v3.x requires C++11 to be enabled, or it cannot work, because it uses many C++11 features, such as auto, nullptr, std::thread and so on.

We changed all APIs to use std::string because we want to make them consistent. Yes, it may bring some overhead, but is it the bottleneck of a game? Right now our bottleneck is overdraw, which is hard to fix because 2D games use translucent textures.

I understand that string overhead is much less important than draw overhead in a (2d) game engine, and available work resources should be spent there.

In a game that manages a lot of text, we have these options:

  • We store every text in a std::string and waste a lot of memory.
  • We create a std::string every time a cocos2d-x call contains a text argument, allocating (and deallocating) memory with each call.
  • We manage std::string instances in a way that memory allocations are at a minimum.
  • We try to avoid cocos2d-x calls which contain std::string arguments :stuck_out_tongue:

It’s common in game engines to have string implementations that don’t allocate memory if it’s not needed.
For example, the EASTL string implementation doesn’t allocate memory with the default constructor.

It would be great if at least the cocos2d-x string implementation could be customizable by the user with a define or a typedef or something :stuck_out_tongue:
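Such a customization point could look roughly like the sketch below. Everything in it is hypothetical: `MYENGINE_STRING_TYPE`, the `myengine` namespace and `textLength()` are invented for illustration and do not exist in cocos2d-x.

```cpp
#include <cstddef>
#include <string>

// The user may define MYENGINE_STRING_TYPE before including the engine
// headers to substitute another string class with a compatible interface.
#ifndef MYENGINE_STRING_TYPE
#define MYENGINE_STRING_TYPE std::string
#endif

namespace myengine
{

// Every engine API would use this alias instead of naming std::string.
typedef MYENGINE_STRING_TYPE String;

// Example of an API written against the alias.
inline std::size_t textLength(const String& text)
{
    return text.size();
}

} // namespace myengine
```

The trade-off is that every replacement type has to offer the subset of the std::string interface that the engine uses, which is part of the consistency concern mentioned above.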

Anyway, thanks a lot for listening :slight_smile:

Thanks for the feedback. It is a hard balance between performance and consistency. More options mean more confusion. And it is hard to balance between newbies and senior engineers.