@dabingnn wrote:
@Delka Thank you for your feedback.
Thank you! I see I have got a lot of attention.
Well, I know it is not your bottleneck. In debug builds our performance was affected mostly by the ‘visit’ function; in release builds I get a stable 60 fps (iPad 2) and have no idea where the bottleneck is. But I assume there will be one once our scene is large enough, because I see that culling (rejecting the addition of a render command) is commented out in the draw function.
As for ideas: I understand cocos2d-x is a generic engine for all cases. Our game, though, has MANY static sprites that never change. All of them are children of one sprite. And when it comes to optimization, all my brain power is spent on optimizing our specific case.
What I can really suggest is:
I see your Z-ordering logic is based on a fixed order of sprites defined by the dependency tree and the local/global Z order. But in games such a strict level of ordering is not needed. What I, as a game developer, care about in my games is:
An object is fully or partially covered by another object:
A by B
B by Z
C by K + Z
K by Z
In that case I can render A and C in any order; after A and C, I can render B and K in any order; after that goes Z. This freedom opens new options for efficient sorting by material to reduce material changes. In your case I have to think about which atlas has to hold each texture, given the possible rendering order.
Yes, I can make your ordering work, taking into account the real demands for ordering, but I have to play with the Z order property too much, and that algorithm has to know not the texture but the atlas of the sprite. That is not so easy.
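To make the idea concrete, here is a minimal sketch of the ordering I mean, not anything that exists in the engine. `Item`, `material` and `buildDrawOrder` are names I made up for illustration, and `material` just stands in for an atlas or shader id. It assumes the "covered by" relation has no cycles. Items are grouped into layers (an item always lands in a later layer than everything it covers), and inside a layer they are sorted by material:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical draw item; 'material' stands in for an atlas/shader id.
struct Item {
    std::string name;
    int material;
};

// coveredBy[i] lists the indices of the items that cover item i,
// i.e. the items that must be drawn after item i.
// Returns item indices in draw order: grouped by layer, and sorted by
// material inside each layer.
std::vector<std::size_t> buildDrawOrder(
    const std::vector<Item>& items,
    const std::vector<std::vector<std::size_t>>& coveredBy)
{
    const std::size_t n = items.size();
    assert(coveredBy.size() == n);

    // pending[j] = number of items that j covers and that are not placed yet.
    std::vector<std::size_t> pending(n, 0);
    for (const auto& covers : coveredBy)
        for (std::size_t upper : covers)
            ++pending[upper];

    std::vector<int> layer(n, 0);
    std::vector<std::size_t> ready;
    for (std::size_t i = 0; i < n; ++i)
        if (pending[i] == 0) ready.push_back(i);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t lower = ready.back();
        ready.pop_back();
        order.push_back(lower);
        for (std::size_t upper : coveredBy[lower]) {
            // A covering item sits at least one layer above what it covers.
            layer[upper] = std::max(layer[upper], layer[lower] + 1);
            if (--pending[upper] == 0) ready.push_back(upper);
        }
    }
    if (order.size() != n)
        throw std::invalid_argument("cyclic coverage relation");

    std::stable_sort(order.begin(), order.end(),
        [&](std::size_t a, std::size_t b) {
            if (layer[a] != layer[b]) return layer[a] < layer[b];
            return items[a].material < items[b].material;
        });
    return order;
}
```

With the A/B/C/K/Z example above, A and C end up in the first layer, B and K in the second, and Z in the last one, and only inside a layer does the material decide the order.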
Also, what I am thinking about for such games is two large enough static pairs of buffers that hold the real vertices (a Sprite requests its vertices from those buffers) and some kind of defragmenter in another process. Combined with the previous technique and with packing all textures into large atlases (also to reduce reconfiguring of the render pipeline), this may give interesting results. But this is a dream.
As an example of this kind of strange kung-fu, there was a developer diary many, many years ago from the developers of Age of Empires: they had a huge issue with memory fragmentation due to the large number of units being created and dying. So they redefined operator ‘new’ and used the DirectX memory allocation routines (used by DirectX for textures). Those functions have a heavy algorithm to prevent fragmentation, and it helped a lot.
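I do not have that diary's code, so the following is only a rough sketch of the general trick (a class-level operator ‘new’ backed by a pool), not what they actually did and not the DirectX routines. `BlockPool` and `Unit` are hypothetical names. Freed blocks go on a free list and are handed out again, so a create/die cycle never goes back to the general heap:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Fixed-size block pool: released blocks are kept on a free list and reused.
class BlockPool {
public:
    explicit BlockPool(std::size_t blockSize)
        : blockSize_(blockSize < sizeof(void*) ? sizeof(void*) : blockSize) {}

    ~BlockPool() {
        for (void* p : owned_) ::operator delete(p);
    }

    void* allocate() {
        if (freeList_ != nullptr) {
            void* p = freeList_;
            freeList_ = *static_cast<void**>(p);  // pop the free list
            return p;
        }
        void* p = ::operator new(blockSize_);     // grow only when empty
        owned_.push_back(p);
        return p;
    }

    void release(void* p) {
        *static_cast<void**>(p) = freeList_;      // push onto the free list
        freeList_ = p;
    }

    // Number of blocks ever requested from the general heap.
    std::size_t heapBlocks() const { return owned_.size(); }

private:
    std::size_t blockSize_;
    void* freeList_ = nullptr;
    std::vector<void*> owned_;
};

// Hypothetical game object that routes its allocations through the pool.
struct Unit {
    int hp = 100;

    static BlockPool& pool() {
        static BlockPool instance(sizeof(Unit));
        return instance;
    }
    static void* operator new(std::size_t size) {
        assert(size == sizeof(Unit));  // sketch: no derived classes
        (void)size;
        return pool().allocate();
    }
    static void operator delete(void* p) {
        if (p != nullptr) pool().release(p);
    }
};
```

Creating a Unit, deleting it and creating another one reuses the same block, which is the whole point when units are created and die all the time.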
As for triangles and use cases: well, you provide the use case yourself: I have to render a 3D object. I have to make a custom render command, because QuadCommand needs quads, but I have triangles… so I made a custom command and use: a static vertex buffer, a shader with MVP and…
You take care of the GPU: the default shaders do no matrix multiplication, but that puts a lot of work on the CPU, which could be used for other useful stuff. Also, I see you use dynamic buffers. On iPhone etc. that is fine, because memory is shared (am I right?). But on PC, I assume it will require sending the buffer to the video card.
As for the bug in GLProgram:
Renderer calls cmd->useMaterial(), which calls _shader->setUniformsForBuiltins();
which calls kmGLGetMatrix(KM_GL_MODELVIEW, &matrixMV);
which does kmMat4Assign(pOut, modelview_matrix_stack.top);
modelview_matrix_stack.top <==
But by the time rendering gets into the main rendering loop, you have already processed the matrices at:
if (_runningScene)
{
_runningScene->visit();
So what you get from the top of the matrix stack is (actually, what is it? The MV of the root?) whatever it is, the same for the whole rendering pass. It simply can’t be the valid MV for each set of vertices.
In Renderer:
memcpy(_quads + _numQuads, cmd->getQuads(), sizeof(V3F_C4B_T2F_Quad) * cmd->getQuadCount());
convertToWorldCoordinates(_quads + _numQuads, cmd->getQuadCount(), cmd->getModelView());
you use cmd->getModelView(), which is the correct MV of the cmd (its Sprite), but when you call the material you use the top of the matrix stack…
To get it working in my case, I commented out material->use() and wrote:
_shaderProgram->use();
_shaderProgram->setUniformsForBuiltins(_modelViewTransform); << ---- I pass the real MV matrix. It works like a charm. I have a nice and tasty 3D model with static buffers and all computations on the GPU.
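To show what I mean without the engine, here is a tiny stand-alone model of the problem. Nothing here is cocos2d-x code: `Mat`, `Command` and `MiniRenderer` are hypothetical stand-ins, and a "matrix" is reduced to a single x translation. The visit phase pushes and pops the stack and queues commands; the flush happens later, when the stack is back at the root:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: a "matrix" reduced to one x translation.
struct Mat {
    float tx;
};

struct Command {
    Mat modelView;  // captured while the node is being visited
};

struct MiniRenderer {
    std::vector<Mat> stack{Mat{0.0f}};  // model-view stack, root at the bottom
    std::vector<Command> queue;

    // Visit phase: push the node transform, queue a command, pop again.
    void visitNode(float localTx) {
        assert(!stack.empty());
        stack.push_back(Mat{stack.back().tx + localTx});
        queue.push_back(Command{stack.back()});
        stack.pop_back();
    }

    // Flush, variant 1: read the stack top at draw time.
    // Every command sees the same matrix, because visiting is over.
    std::vector<float> flushUsingStackTop() const {
        std::vector<float> used;
        for (std::size_t i = 0; i < queue.size(); ++i)
            used.push_back(stack.back().tx);
        return used;
    }

    // Flush, variant 2: use the matrix stored in each command.
    std::vector<float> flushUsingCommandMatrix() const {
        std::vector<float> used;
        for (const Command& cmd : queue)
            used.push_back(cmd.modelView.tx);
        return used;
    }
};
```

After visiting two nodes with different transforms, variant 1 gives the root matrix for both commands, while variant 2 gives each command its own matrix. That is the difference between reading the stack top and passing _modelViewTransform.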
Maybe I messed up somewhere, maybe your unit test is not complex enough to reproduce the case. Who knows? But I want peace for all. Cheers!