Porting the blur from the tutorial

There is a well-known Cocos2d-x tutorial on how to use RenderTexture and a 2-pass Gaussian filter to implement a blur. But it was written for Cocos2d-x 3.0, and if you try to use it in Cocos2d-x 3.1+ it will not work. Here is the ported code:

#include "TextureBlur.h"

USING_NS_CC;


TextureBlur::TextureBlur()
    : m_maxRadius(64)
    , m_weights(m_maxRadius)
    , m_blurProgramState(nullptr)
    , m_renderer(Director::getInstance()->getRenderer())
{
    std::string blurShaderPath = FileUtils::getInstance()->fullPathForFilename("shaders/blur_2_pass.fsh");
    const GLchar* blur_frag = String::createWithContentsOfFile(blurShaderPath.c_str())->getCString();
    auto blurProgram = GLProgram::createWithByteArrays(ccPositionTextureColor_vert, blur_frag);
    m_blurProgramState = GLProgramState::getOrCreateWithGLProgram(blurProgram);
    m_blurProgramState->retain();
}

void TextureBlur::calculateGaussianWeights(const int points)
{
    const float dx = 1.0f / static_cast<float>(points - 1);
    const float sigma = 1.0f / 3.0f;
    const float norm = 1.0f / (sqrtf(2.0f * static_cast<float>(M_PI)) * sigma);
    const float divsigma2 = 0.5f / (sigma * sigma);
    m_weights[0] = 1.0f;
    for (int i = 1; i < points; i++)
    {
        float x = static_cast<float>(i)* dx;
        m_weights[i] = norm * expf(-x * x * divsigma2) * dx;
        m_weights[0] -= 2.0f * m_weights[i];
    }
}

void TextureBlur::updateBlurShaderUniforms(const Size pixelSize, const Vec2 direction, const int radius)
{
    m_blurProgramState->setUniformVec2("pixelSize", pixelSize);
    m_blurProgramState->setUniformVec2("direction", direction);
    m_blurProgramState->setUniformInt("radius", radius);
    m_blurProgramState->setUniformCallback("weights", [=](GLProgram* program, Uniform* uniform){
        glUniform1fv(static_cast<GLint>(uniform->location), static_cast<GLsizei>(radius), static_cast<GLfloat*>(&m_weights[0]));
    });
}

cocos2d::Texture2D* TextureBlur::blur(cocos2d::Texture2D* target, const int radius, const int step /*= 1*/)
{
    assert(target != nullptr && "Null pointer passed as a texture to blur");
    assert(radius <= m_maxRadius && "Blur radius is too big");
    assert(radius > 0 && "Blur radius is too small");
    assert(step <= radius / 2 + 1 && "Step is too big");
    assert(step > 0 && "Step is too small");

    Size textureSize = target->getContentSizeInPixels() / Director::getInstance()->getContentScaleFactor();
    Size pixelSize = Size(float(step) / textureSize.width, float(step) / textureSize.height);
    int radiusWithStep = radius / step;

    calculateGaussianWeights(radiusWithStep);

    Sprite* stepX = Sprite::createWithTexture(target);
    stepX->setPosition(Point(0.5f*textureSize.width, 0.5f*textureSize.height));
    stepX->setFlippedY(true);

    updateBlurShaderUniforms(pixelSize, Vec2(1.0f, 0.0f), radiusWithStep);
    stepX->setGLProgramState(m_blurProgramState);

    RenderTexture* rtX = RenderTexture::create(textureSize.width, textureSize.height);
    rtX->begin();
    stepX->visit();
    rtX->end();

    m_renderer->render();

    Sprite* stepY = Sprite::createWithTexture(rtX->getSprite()->getTexture());
    stepY->setPosition(Point(0.5f*textureSize.width, 0.5f*textureSize.height));
    stepY->setFlippedY(true);

    updateBlurShaderUniforms(pixelSize, Vec2(0.0f, 1.0f), radiusWithStep);
    stepY->setGLProgramState(m_blurProgramState);

    RenderTexture* rtY = RenderTexture::create(textureSize.width, textureSize.height);
    rtY->begin();
    stepY->visit();
    rtY->end();

    m_renderer->render();

    return rtY->getSprite()->getTexture();
}

TextureBlur::~TextureBlur()
{
    m_blurProgramState->release();
}
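The weight calculation can be checked on its own, outside the engine. Below is a minimal standalone sketch of the same formula as calculateGaussianWeights, assuming a plain std::vector in place of the m_weights member. The names gaussianWeights and kernelSum are hypothetical, and the early return for a single point is an addition that is not in the code above (it avoids the division by zero in dx).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone version of the weight calculation (hypothetical helper name).
// Returns `points` weights: index 0 is the centre tap, indices 1..points-1
// are the taps on each side, which the shader samples twice (left and right).
std::vector<float> gaussianWeights(int points)
{
    assert(points >= 1);
    std::vector<float> weights(static_cast<std::size_t>(points), 0.0f);
    weights[0] = 1.0f;
    if (points == 1)
        return weights; // single tap: no blur, and dx below would divide by zero

    const float pi = 3.14159265358979f;
    const float dx = 1.0f / static_cast<float>(points - 1);
    const float sigma = 1.0f / 3.0f;
    const float norm = 1.0f / (std::sqrt(2.0f * pi) * sigma);
    const float divsigma2 = 0.5f / (sigma * sigma);
    for (int i = 1; i < points; ++i)
    {
        const float x = static_cast<float>(i) * dx;
        weights[i] = norm * std::exp(-x * x * divsigma2) * dx;
        // The centre tap takes whatever is left, so the kernel sums to 1.
        weights[0] -= 2.0f * weights[i];
    }
    return weights;
}

// Sum of the kernel as the shader applies it: centre once, other taps twice.
float kernelSum(const std::vector<float>& weights)
{
    float sum = weights.empty() ? 0.0f : weights[0];
    for (std::size_t i = 1; i < weights.size(); ++i)
        sum += 2.0f * weights[i];
    return sum;
}
```

Because the centre weight is defined as the remainder, kernelSum(gaussianWeights(n)) should come out as 1 up to floating-point error for any n, which means the blur should not brighten or darken the image.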

And the Fragment shader code:

#ifdef GL_ES
precision lowp float;
#endif

varying vec4 v_fragmentColor;
varying vec2 v_texCoord;

uniform vec2 pixelSize;
uniform vec2 direction;
uniform int radius;
uniform float weights[64];

void main()
{
    gl_FragColor = texture2D(CC_Texture0, v_texCoord)*weights[0];
    for (int i = 1; i < radius; i++) {
        vec2 offset = vec2(float(i)*pixelSize.x*direction.x, float(i)*pixelSize.y*direction.y);
        gl_FragColor += texture2D(CC_Texture0, v_texCoord + offset)*weights[i];
        gl_FragColor += texture2D(CC_Texture0, v_texCoord - offset)*weights[i];
    }

    gl_FragColor = vec4(gl_FragColor.rgb, 1.0);
}
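To make it easier to see what one pass of this shader computes, here is a rough CPU reference for a single row and a single colour channel. It is only a sketch: the name blurRow is hypothetical, samples outside the row are assumed to be clamped to the edge texel (the real behaviour depends on the texture's wrap mode), and the step scaling done through pixelSize is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for one blur pass along one row (one colour channel).
// weights[0] is the centre tap; weights[i] is applied to the samples
// i texels to the left and i texels to the right, as in the shader loop.
// Assumption: out-of-range samples are clamped to the edge texel.
std::vector<float> blurRow(const std::vector<float>& row,
                           const std::vector<float>& weights)
{
    assert(!weights.empty());
    const int n = static_cast<int>(row.size());
    const int radius = static_cast<int>(weights.size());
    std::vector<float> out(row.size(), 0.0f);
    for (int x = 0; x < n; ++x)
    {
        float acc = row[static_cast<std::size_t>(x)] * weights[0];
        for (int i = 1; i < radius; ++i)
        {
            const std::size_t left = static_cast<std::size_t>(std::max(x - i, 0));
            const std::size_t right = static_cast<std::size_t>(std::min(x + i, n - 1));
            acc += (row[left] + row[right]) * weights[static_cast<std::size_t>(i)];
        }
        out[static_cast<std::size_t>(x)] = acc;
    }
    return out;
}
```

Running this once over every row and then once over every column gives the same two-pass structure as the horizontal and vertical draws in TextureBlur::blur. The cost per texel is 2 * radius - 1 texture reads per pass.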

I needed this for my game: when a popup opens, it blurs whatever is in the background with an animation, so that within, say, 0.3 seconds the blur becomes stronger and stronger (I animate the blur radius to become larger). So I capture the screen with RenderTexture, pass the captured texture to TextureBlur::blur with the appropriate parameters, obtain the blurred texture and set it as my popup background. This is done several times to animate the blur.
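The radius animation described above could be driven by something like the following. This is only an illustrative sketch, assuming a linear ramp; blurRadiusAt and its parameters are hypothetical names and are not part of the TextureBlur class.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Blur radius to use `elapsed` seconds into an animation that lasts
// `duration` seconds, ramping linearly from startRadius to endRadius.
// Time outside [0, duration] is clamped to the nearest end of the ramp.
int blurRadiusAt(float elapsed, float duration, int startRadius, int endRadius)
{
    assert(duration > 0.0f);
    const float t = std::min(std::max(elapsed / duration, 0.0f), 1.0f);
    const float radius = static_cast<float>(startRadius)
                       + t * static_cast<float>(endRadius - startRadius);
    return static_cast<int>(std::lround(radius));
}
```

Each animation step would then capture the screen and call TextureBlur::blur with the radius returned for the current time, for example blurRadiusAt(elapsed, 0.3f, 1, 64), where 64 matches m_maxRadius above.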

All was great until I ported this code from Windows to iOS, where it turned out to have serious performance issues. At first I thought the shader was not optimised, then I suspected the shader parameters (blur radius too large, perhaps?). But further investigation showed that even if I comment out the lines that set the shader program state on stepX and stepY, the FPS on an iPhone 5S still drops to 7-9. Since the shaders are turned off, the cause must be RenderTexture. I investigated how RenderTexture and Texture2D work and came to the conclusion that RenderTexture renders into an FBO and that Texture2D objects are just references, while the actual texture data is stored on the GPU. That means it should be optimal and very fast, even if I do this with a big texture several times in a short period (say, 10 times in 0.3 seconds). So I don't understand what is wrong in the code above. Why does the following part become a bottleneck even with stepX(Y)->setGLProgramState(m_blurProgramState); commented out?

Size textureSize = target->getContentSizeInPixels() / Director::getInstance()->getContentScaleFactor();
Size pixelSize = Size(float(step) / textureSize.width, float(step) / textureSize.height);
int radiusWithStep = radius / step;

calculateGaussianWeights(radiusWithStep);

Sprite* stepX = Sprite::createWithTexture(target);
stepX->setPosition(Point(0.5f*textureSize.width, 0.5f*textureSize.height));
stepX->setFlippedY(true);

updateBlurShaderUniforms(pixelSize, Vec2(1.0f, 0.0f), radiusWithStep);
//stepX->setGLProgramState(m_blurProgramState);

RenderTexture* rtX = RenderTexture::create(textureSize.width, textureSize.height);
rtX->begin();
stepX->visit();
rtX->end();

m_renderer->render();

Sprite* stepY = Sprite::createWithTexture(rtX->getSprite()->getTexture());
stepY->setPosition(Point(0.5f*textureSize.width, 0.5f*textureSize.height));
stepY->setFlippedY(true);

updateBlurShaderUniforms(pixelSize, Vec2(0.0f, 1.0f), radiusWithStep);
//stepY->setGLProgramState(m_blurProgramState);

RenderTexture* rtY = RenderTexture::create(textureSize.width, textureSize.height);
rtY->begin();
stepY->visit();
rtY->end();

m_renderer->render();

return rtY->getSprite()->getTexture();

I should also mention that I tried using two different programs, one for vertical and one for horizontal blurring, to avoid calling m_renderer->render(); (I thought that, since I didn't clearly understand how RenderTexture and Texture2D work, it might be causing a stall), but this did not help either. I hope for some help understanding why Cocos2d-x has this kind of serious performance drop. Have I done something wrong, or is it an issue in the engine implementation?

Hey @naghekyan, have you found a solution?