Cocos2d-x Windows performance issues examined

This post is meant to examine some of the causes that may be affecting performance on Windows, specifically win32 applications. Many users report graphical stuttering, poor frame rates, jitter or various display issues that should not occur when running even a simple program on high-end hardware. Many of these issues are not intuitive and required a lot of digging into the engine code and searching for solutions. My aim is to bring these issues up so we can examine them, look at some of the proposed solutions, and hopefully integrate them into the next Cocos2d-x version. I hope users can find this post informative and hope that they will not give up on the engine after running into their first performance problem.

The first thing to look at is the main game loop found in cocos\platform\win32\CCApplication-win32.cpp

while(!glview->windowShouldClose())
{
    QueryPerformanceCounter(&nNow);
    if (nNow.QuadPart - nLast.QuadPart > _animationInterval.QuadPart)
    {
        nLast.QuadPart = nNow.QuadPart - (nNow.QuadPart % _animationInterval.QuadPart);
        director->mainLoop();
        glview->pollEvents();
    }
    else
    {
        Sleep(1);
    }
}

QueryPerformanceCounter and the _animationInterval timing are fine; most performance issues come from the Sleep(1) call. When I printed out the delta time (dt) in the update method, I saw a wide distribution of delta times, from 2ms to 33ms. I ran a trial of the empty test project, which basically just shows an image, collected the dt values of about 1000 frames, and sorted the results.

As you can see, the distribution is all over the place, with the bulk at around the 16.6 millisecond mark that should occur when running at a 60fps animation interval. Problems and graphical disturbances start to show when a frame goes over this mark, as it can miss the monitor's refresh cycle and will appear as a missed or duplicate frame.
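A trial like the one above can be summarized with a few lines of code. This is only a sketch: the names FrameTimeSummary and summarizeFrameTimes are made up for illustration, and it assumes the delta times were already collected in milliseconds (for example by pushing dt * 1000 into a vector from your update method).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary type, not part of cocos2d-x.
struct FrameTimeSummary {
    double minMs;
    double medianMs;
    double maxMs;
    std::size_t overBudget;  // number of frames longer than the budget
};

// Sorts a copy of the recorded delta times (milliseconds) and summarizes them.
FrameTimeSummary summarizeFrameTimes(std::vector<double> dtMs, double budgetMs) {
    assert(!dtMs.empty());
    std::sort(dtMs.begin(), dtMs.end());
    FrameTimeSummary s;
    s.minMs = dtMs.front();
    s.medianMs = dtMs[dtMs.size() / 2];  // upper median for even counts
    s.maxMs = dtMs.back();
    // In the sorted data, everything after the last value <= budget is a late frame.
    s.overBudget = static_cast<std::size_t>(
        dtMs.end() - std::upper_bound(dtMs.begin(), dtMs.end(), budgetMs));
    return s;
}
```

With a 60fps interval, a budget of about 17 ms is a reasonable threshold; a healthy run should have a median near 16.6 and very few frames over budget.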

After some research I discovered that Sleep(1) does not guarantee that the process will sleep for 1 millisecond. The amount of time it sleeps is determined by the Windows Timer Resolution. By default the Windows Timer Resolution is set to 15.6 msec, which is an eternity if you are trying to render at 60fps or higher. You can set the timer resolution to the value you desire in code or with a timer tool that can be found at

To set the timer resolution in your code, as described by https://msdn.microsoft.com/en-us/library/windows/desktop/dd743626(v=vs.85).aspx, I put this in the int Application::run() method in cocos\platform\win32\CCApplication-win32.cpp before the game loop

///////////////////////////////////////////////////////////////////////////
/////////////// changing timer resolution
///////////////////////////////////////////////////////////////////////////
// These calls are declared in <mmsystem.h> and need winmm.lib at link time.
UINT TARGET_RESOLUTION = 1; // 1-millisecond target resolution
TIMECAPS tc;
UINT wTimerRes;
if (timeGetDevCaps(&tc, sizeof(TIMECAPS)) != TIMERR_NOERROR)
{
    // Error; application can't continue.
}
// Clamp the target to the range the timer device actually supports.
wTimerRes = std::min(std::max(tc.wPeriodMin, TARGET_RESOLUTION), tc.wPeriodMax);
timeBeginPeriod(wTimerRes);

And to play nice and restore the timer to where it was before running your program, place this after the game loop

///////////////////////////////////////////////////////////////////////////
/////////////// restoring timer resolution
///////////////////////////////////////////////////////////////////////////
wTimerRes = std::min(std::max(tc.wPeriodMin, TARGET_RESOLUTION), tc.wPeriodMax);
timeEndPeriod(wTimerRes);

Even commercial games do not seem to get this right. As you can see in the video below, using the timer tool to lower the timer interval almost doubles the frame rate in a commercial game released about a year ago.

Windows Timer Resolution - More Fps in Evolve - YouTube

Setting the timer to 1msec makes the distribution of frame times much closer to the desired 16.6 milliseconds (if using 60fps animation interval). There are some outliers but that is the nature of handing over control to the operating system.
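One way to check what a 1 ms sleep really costs on a given machine is to time it. The sketch below is an assumption-laden stand-in: it uses std::this_thread::sleep_for from the C++ standard library instead of the Win32 Sleep(1) call so that it compiles anywhere, and measureSleepMs is a hypothetical helper name. To measure exactly what the game loop does, swap the sleep_for line for Sleep(1).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Returns how long a sleep request of `requestedMs` really took, in milliseconds.
double measureSleepMs(int requestedMs) {
    assert(requestedMs >= 0);
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    // Portable stand-in for Sleep(requestedMs).
    std::this_thread::sleep_for(std::chrono::milliseconds(requestedMs));
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

Calling measureSleepMs(1) a few hundred times before and after timeBeginPeriod should show whether the requested millisecond is being stretched towards the default timer tick.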

What if we were to avoid sleeping altogether? This technique is known as "busy-waiting" and you can try it by commenting out the Sleep(1) call in the game loop above. This spins the CPU until the desired time, producing very stable results but at the cost of 100% CPU usage. I would not release a game that used 100% CPU, as it is just not good practice to burn CPU, especially if gamers are playing on a laptop with a limited charge. It can be useful as a test case though, or to help with debugging.
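For experimenting with the trade-off, here is a rough sketch of a wait that sleeps while the deadline is far away and only busy-waits for the last stretch. This hybrid is a common compromise rather than something from the engine, waitUntil and spinMargin are hypothetical names, and it uses std::chrono instead of QueryPerformanceCounter to stay portable. Passing a margin longer than the whole wait gives pure busy-waiting, as described above.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Blocks until `deadline`. Sleeps in 1 ms steps while more than `spinMargin`
// remains, then busy-waits for the remainder.
void waitUntil(Clock::time_point deadline, std::chrono::microseconds spinMargin) {
    assert(spinMargin.count() >= 0);
    while (Clock::now() + spinMargin < deadline) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    while (Clock::now() < deadline) {
        // Busy-wait: burns CPU but returns very close to the deadline.
    }
}
```

The margin needs to be larger than the worst oversleep you measure, otherwise the sleep phase can overshoot the deadline on its own.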

So, what if there was a way to wait until the next frame should be drawn without burning 100% CPU? Well, there is a way to do that in cocos2d-x, and it is by using VSync. VSync is handled by GLFW, which is the window library cocos2d-x uses to draw its windows/images to the screen. GLFW has a function called glfwSwapBuffers which swaps the back/rendering buffer to the front/display buffer, and the engine calls this once every game loop in void Director::drawScene().

If VSync is enabled by your graphics driver, then SwapBuffers will block (without using 100% CPU) until the next vertical retrace of the monitor, and swap when it is safe to do so, avoiding tearing. By default the swap interval is set to 1 by glfwSwapInterval(), which waits until the next monitor refresh to draw your frame (so in effect VSync is on by default in Cocos2d-x). I wrote methods, which I will post further down, to change the swap interval, although in practice it behaved as a binary setting where 0 was off and anything over 1 was clamped to 1, presumably by the graphics driver.

NOTE: VSYNC CAN BE ENABLED OR DISABLED BY THE USER!!! If the user has the option of VSync set to OFF in their graphics card configuration program then swapBuffers will not block and your simulation will run at max speed, or 1000's of frames per second. You cannot rely on VSync being enabled by your users, so the game loop code that checks whether or not the time is past the animation interval should still be used.
For more info see http://www.glfw.org/docs/latest/window.html

To add the method to change the swap interval, in cocos\platform\CCGLView.h add

/** Sets the number of frames to wait before swapping buffers.
0 = no vsync, 1 = vsync on */
/* EXPERIMENTAL */
virtual void setSwapInterval(int interval) = 0;

in cocos\platform\desktop\CCGLViewImpl-desktop.h add

virtual void setSwapInterval(int interval) override;

in cocos\platform\desktop\CCGLViewImpl-desktop.cpp add

void GLViewImpl::setSwapInterval(int interval)
{
    if(_mainWindow)
        glfwSwapInterval(interval);
}

If we cannot rely on VSync being on, there is another issue that we face. Before the game loop is started, QueryPerformanceCounter essentially grabs the current time and marks it as the starting point of the game loop. All subsequent renders are based on this timestamp plus the animation interval. What could happen is that the timestamp falls close to the monitor's vertical retrace event. This would be really bad, as the frame renderings would 'dance around' the monitor's refresh cycle, and it could be as bad as missing/doubling every other frame. This image illustrates the issue.

This image is from
https://software.intel.com/en-us/articles/video-frame-display-synchronization
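The effect of an unlucky starting phase can be reproduced with a toy model. The sketch below is not engine code: retraceIndex and countCollisions are hypothetical names, the retraces are assumed to happen at exact multiples of the refresh period, and the frame jitter is assumed to alternate between +jitter and -jitter. It counts how often a frame lands on the same retrace as the frame before it, which means one frame is never shown and one refresh repeats an old image.

```cpp
#include <cassert>
#include <cmath>

// Index of the first vertical retrace at or after time t (retraces at 0, T, 2T, ...).
long retraceIndex(double t, double refreshPeriod) {
    assert(refreshPeriod > 0.0);
    return static_cast<long>(std::ceil(t / refreshPeriod));
}

// Frame k becomes ready at phase + k * refreshPeriod, plus or minus `jitter`
// (alternating sign). Returns how many frames share a retrace with the
// previous frame.
int countCollisions(double phase, double jitter, int frames, double refreshPeriod) {
    int collisions = 0;
    long previous = -1;
    for (int k = 0; k < frames; ++k) {
        const double sign = (k % 2 == 0) ? 1.0 : -1.0;
        const double ready = phase + k * refreshPeriod + sign * jitter;
        const long index = retraceIndex(ready, refreshPeriod);
        if (k > 0 && index == previous) {
            ++collisions;
        }
        previous = index;
    }
    return collisions;
}
```

With a 16 ms period and 1 ms of jitter, a starting phase of 0.5 ms (right next to the retrace) makes every other frame collide, while a phase of 8 ms (mid-cycle) gives no collisions at all.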

I could not find a method in GLFW to detect when the monitor was entering vertical retrace. There is an API which has this ability and it is… Microsoft's own DirectDraw API. In it there are some methods, WaitForVerticalBlank, GetScanLine, and GetVerticalBlankStatus, that could be of use.

In this example, I initialize DirectDraw, wait for the vertical blank to end, and then query the performance counter to set the timestamp so that the cycles are aligned. This is done right before entering the game loop in CCApplication-win32.cpp

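LPDIRECTDRAW g_pDD = NULL; // The DirectDraw object
HRESULT hr;
// Initialize DirectDraw
hr = DirectDrawCreate(NULL, &g_pDD, NULL);
// Only touch the object if creation succeeded, otherwise g_pDD is NULL
if (SUCCEEDED(hr) && g_pDD != NULL)
{
    if (DD_OK == g_pDD->WaitForVerticalBlank(DDWAITVB_BLOCKEND, NULL))
    {
        OutputDebugString(TEXT("Waited for VBlank"));
    }
}
// Take the starting timestamp right after the vertical blank has ended
QueryPerformanceCounter(&nLast);
// The DirectDraw object is no longer needed
if (g_pDD != NULL)
{
    g_pDD->Release();
    g_pDD = NULL;
}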

To do this you need to include "ddraw.h" and link to ddraw.lib. I found mine in C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x86\ddraw.lib

There are potentially some uses for GetScanLine in an adaptive synchronization scheme, which could handle the case where someone turns off their monitor and the cycles need to be realigned. As you can see, with refresh rate synchronization it is difficult to find a one-size-fits-all solution.

The most insidious cause of jittering/jerky performance is the use of "monitors" that run on television timings. If you are using a TV as your monitor over HDMI (I am), you might be running at 59.94Hz instead of 60Hz. If the game's animation interval is set to 60fps there will be a drift in the synchronization, which can produce some 'beating', out of phase display, or at the very least a dropped frame every 17 seconds or so (the two rates differ by 0.06 frames per second). This can be tough to detect as your TV shows 60Hz in the display panel, and your OS is set to 60Hz. You can try using ToastyX's Custom Resolution Utility http://www.monitortests.com/forum/Thread-Custom-Resolution-Utility-CRU?page=1 to try to set your timings to 60Hz and not 59.94Hz.
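The size of the drift follows directly from the difference between the two rates. As a small worked example (secondsPerSlippedFrame is a hypothetical helper, not an engine function):

```cpp
#include <cassert>
#include <cmath>

// Seconds until the game and the display drift apart by one whole frame.
double secondsPerSlippedFrame(double gameFps, double displayHz) {
    const double beatHz = std::fabs(gameFps - displayHz);
    assert(beatHz > 0.0);  // identical rates never drift
    return 1.0 / beatHz;
}
```

For a 60fps game on a 59.94Hz display the difference is 0.06 frames per second, so secondsPerSlippedFrame(60.0, 59.94) comes out at roughly 16.7 seconds between slipped frames.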

I wrote some methods that use GLFW to detect your monitor's refresh rate to see if this is happening.

In cocos\platform\CCGLView.h add

/** gets the current refresh rate of the monitor */
/* EXPERIMENTAL */
virtual int getRunningRefreshRate() = 0;

In cocos\platform\desktop\CCGLViewImpl-desktop.h add

virtual int getRunningRefreshRate() override;

In cocos\platform\desktop\CCGLViewImpl-desktop.cpp add

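int GLViewImpl::getRunningRefreshRate()
{
    // Both GLFW calls can return NULL, so guard before dereferencing; 0 means unknown.
    GLFWmonitor* monitor = glfwGetPrimaryMonitor();
    const GLFWvidmode* mode = monitor ? glfwGetVideoMode(monitor) : nullptr;
    return mode ? mode->refreshRate : 0;
}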

In your update method, or anywhere in your program, you can use
CCLOG("%d", Director::getInstance()->getOpenGLView()->getRunningRefreshRate()); to test your monitor's refresh rate. The GLFW structure unfortunately stores the monitor's refresh rate as an int, so it will return 59 if the monitor is set to 59.94, and 60 if it is set to 60. You may be surprised to see 59 being returned; if this is the case you can set the animation interval to 1/59.94f. It could be useful to add a check in your game's initialization: if (Director::getInstance()->getOpenGLView()->getRunningRefreshRate() == 59), or 29, or 119 for that matter, then set the animation interval accordingly.
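The check described above could look something like the following sketch. animationIntervalFor is a hypothetical helper, and it assumes that a reported 23, 29, 59 or 119 is really a television rate of (reported + 1) * 1000 / 1001 Hz that was cut down to an int, which matches the 59 seen for 59.94Hz but may not hold for every display or driver.

```cpp
#include <cassert>

// Hypothetical helper: maps the integer refresh rate reported by GLFW to an
// animation interval in seconds.
double animationIntervalFor(int reportedHz) {
    assert(reportedHz > 0);
    switch (reportedHz) {
        case 23: case 29: case 59: case 119:
            // Assumed television timing, e.g. 59 -> 60000/1001 Hz (about 59.94).
            return 1001.0 / ((reportedHz + 1) * 1000.0);
        default:
            return 1.0 / reportedHz;
    }
}
```

In a game it could be used as Director::getInstance()->setAnimationInterval(animationIntervalFor(rate)), where rate is the value returned by getRunningRefreshRate().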

These are some of the tools that I used to troubleshoot display issues. This is a complex issue, not just a simple copy and paste job, so try to understand what's going on under the hood if you use any of this code.

In my case, I added these to cocos\base\ccConfig.h to help with testing so I could just set values to 1 or 0 for on or off.

/** Use VSYNC EXPERIMENTAL FOR WIN32 */
#ifndef CC_USE_VSYNC
#define CC_USE_VSYNC 1
#endif

/** Change Windows timer resolution to 1 ms */
#ifndef CC_WIN_TIMER1
#define CC_WIN_TIMER1 1
#endif

/** Sleep for 1 millisecond when waiting to draw frame. If not enabled game loop will busy wait full CPU usage WARNING */
#ifndef CC_SLEEP_1_MSEC
#define CC_SLEEP_1_MSEC 1
#endif

And in cocos\platform\win32\CCApplication-win32.cpp I check for these defines, so my run() method looks something like

int Application::run()
{
    PVRFrameEnableControlWindow(false);
#if CC_WIN_TIMER1 == 1
    UINT TARGET_RESOLUTION = 1; // 1-millisecond target resolution
    TIMECAPS tc;
    UINT wTimerRes;
    if (timeGetDevCaps(&tc, sizeof(TIMECAPS)) != TIMERR_NOERROR)
    {
        // Error; application can't continue.
    }
    wTimerRes = std::min(std::max(tc.wPeriodMin, TARGET_RESOLUTION), tc.wPeriodMax);
    timeBeginPeriod(wTimerRes);
#endif

    (……)

#if CC_USE_VSYNC == 1
    glview->setSwapInterval(1);
#endif

    while(!glview->windowShouldClose())
    {
        QueryPerformanceCounter(&nNow);
        if (nNow.QuadPart - nLast.QuadPart > _animationInterval.QuadPart)
        {
            nLast.QuadPart = nNow.QuadPart - (nNow.QuadPart % _animationInterval.QuadPart);
            director->mainLoop();
            //glview->pollEvents();
        }
        else
        {
#if CC_SLEEP_1_MSEC == 1
            Sleep(1);
#endif
        }
    }

    // Director should still do a cleanup if the window was closed manually.
    if (glview->isOpenGLReady())
    {
        director->end();
        director->mainLoop();
        director = nullptr;
    }
    glview->release();
#if CC_WIN_TIMER1 == 1
    wTimerRes = std::min(std::max(tc.wPeriodMin, TARGET_RESOLUTION), tc.wPeriodMax);
    timeEndPeriod(wTimerRes);
#endif

    return 0;
}

Feel free to post any questions you may have and please post your results if this helped you in any way.


Excellent analysis of cocos2d-x win32 stuttering issues! I've been wondering about this for months.

Will try this out in my game and post any findings.

Thanks for putting so much time and effort into this! I'm not working on a Windows project at the moment, but I've bookmarked this post so I can come back to it when I do :smile:

In the past I noticed playing sound effects was causing a lot of stuttering on Win32 builds. Windows 10 universal apps worked perfectly, and when I removed sound from my game it ran smooth on Win32 too. I ended up using SDL to handle my audio on Windows and it dramatically improved performance.


Waiting for a new cocos2d-x version that fixes all these bugs. The Cocos2d-x developers should pay attention to this someday… because win32 is an important platform when your mobile game becomes very popular and needs to go on Steam :stuck_out_tongue_winking_eye:

Likewise, I use FMOD for sound on win32 and this fixes a dramatic slowdown bug for some players.

This is great, thanks KyleK. I used your code to manually set the timer interval and it's made a huge difference; the stuttering issue has disappeared :smile:

Going forward it sounds like the cocos2d-x devs could have a look at improving their code with regards to this though.

This is a fantastically comprehensive write up. Thanks for the time you've put into sharing this research, I'm sure it'll help quite a few people.

I'd love to see some cocos2d-x dev responses here with a view to integrate your findings into future engine versions.

+2 cool points

Best post I have ever read here… how come nobody else noticed this before!