I found a blog post about a 2D sprite rendering technique that the authors claim is very efficient; the link is here
I may not have understood every detail of the procedure, but as I understand it, instead of drawing a quad they draw a textured mesh so that less of the texture's transparent area gets rendered. They also split their sprites into portions with opaque pixels and portions with transparent pixels, drawing the opaque portions first and the transparent portions afterwards. I think the author mentioned in a comment that they applied alpha test on the transparent portions, and that it was still a fairly fast technique. I don't understand this part: why was it faster in spite of using alpha test, which is a lot slower on mobile?
I have found orxDisplay_DrawMesh(...). I guess this function replaces the quad with a custom vertex list when drawing a sprite, am I right? If so, that's cool, but there is no way of batching draw calls for opaque and transparent sprites in orx at the moment. And without batching, this technique won't be very efficient, even if I try to reduce the transparent portions.
What do you think: can this technique be efficient on all other platforms, or is it just older iPad devices that have problems with transparent pixels?
Actually, even without rendering tight opaque meshes (which can become a nightmare for complex sprites), some games still try to minimize the completely transparent portions of their sprites by rendering not quads but convex polygons that fit the sprite's shape more closely.
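To give a rough idea of what a convex polygon can save over a quad, here is a minimal sketch in Python. It is not taken from the blog or from orx: the function names and the tiny triangular test sprite are made up for illustration, and it assumes the sprite's alpha channel is available as a list of rows. It builds the convex hull around every non-transparent pixel and compares its area with the full quad.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order, without collinear points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    total = 0
    for i in range(len(polygon)):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def hull_of_visible_pixels(alpha):
    """Convex hull around the corners of every pixel whose alpha is above zero."""
    corners = []
    for y, row in enumerate(alpha):
        for x, a in enumerate(row):
            if a > 0:
                corners += [(x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)]
    return convex_hull(corners)


# Hypothetical 4x4 sprite: a filled triangle, everything else fully transparent.
sprite_alpha = [
    [255, 0, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 255, 0],
    [255, 255, 255, 255],
]
hull = hull_of_visible_pixels(sprite_alpha)
quad_area = len(sprite_alpha) * len(sprite_alpha[0])
hull_area = polygon_area(hull)
```

For this toy sprite the hull has 5 vertices and covers 11.5 pixels instead of the quad's 16. A real tool would probably also cap the number of hull vertices, since every extra vertex costs something too.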
The optimization doesn't really come from the batching part (which mostly affects the CPU) but directly from the removal of alpha blending (which is purely a GPU cost).
That being said, orxDisplay_DrawMesh() will still do batching as long as you're not switching the "context" (i.e. the blending mode, the smoothing, the shader or the texture).
It's slightly less performant than writing your own mesh rendering function (which Lydesik has been doing for the background on his latest game, if I recall correctly); however, it allows batching together draw calls coming from regular rendering (quads) and meshes.
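To illustrate the "context" idea, here is a small sketch in Python. It is only a model of the rule described above, not orx code: the Context tuple and count_batches() are hypothetical names, and the texture and blend mode values are made up. It assumes a new batch starts whenever the context differs from the previous draw's.

```python
from collections import namedtuple

# Hypothetical stand-in for the render state that breaks a batch when it changes.
Context = namedtuple("Context", "texture shader blend_mode smoothing")


def count_batches(draws):
    """Count how many batches a sequence of draws needs if a batch ends on every context switch."""
    batches = 0
    previous = None
    for context in draws:
        if context != previous:
            batches += 1
            previous = context
    return batches


opaque = Context("atlas.png", None, "none", True)
blended = Context("atlas.png", None, "alpha", True)

# Alternating between the two contexts breaks the batch on every draw...
interleaved = [opaque, blended, opaque, blended]
# ...while grouping the draws by context needs one batch per context.
grouped = [opaque, opaque, blended, blended]

interleaved_batches = count_batches(interleaved)  # 4
grouped_batches = count_batches(grouped)          # 2
```

So drawing all opaque pieces first and all blended pieces afterwards costs one extra context switch in total, rather than one per sprite.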
As for efficiency, I think it'll mostly affect mobile devices, but it's better to actually test that theory. Also note that in the coming generations of mobile GPUs this is likely to disappear as well, as we can see with NVidia's new GPU (Kepler/Project Logan), which aims at both desktop and mobile.
Some other people have pushed this technique a bit further by only doing opaque rendering and using a post-processing AA method instead. But again, it all depends on the kind of sprites one is using in his/her game. I'm actually considering this for the tablet version of our current game, depending on what the result will be. But it's really just an optimization step so that's the last thing I'll be testing.
Thanks for the link anyway, I'm sure it'll be helpful for people who weren't aware of such optimization.
I can use a polygon instead of a quad here, which is good. Yes, generating a tight polygon would be tough, especially for animated sprites, but this kind of simple sprite can be optimized easily. And yes, since it's an optimization, it's better done later.
They have taken this a step further by using a few algorithms to create meshes or convex polygons automatically. As in the link I gave in the first post, they have also introduced two-pass rendering by separating their sprites into opaque and transparent portions, though their transparent portions contain very few alpha pixels. They draw the opaque meshes first, then the transparent portions.
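As a very simplified picture of that opaque/transparent split, here is a sketch in Python. It does not reproduce their mesh generation algorithms: it simply cuts the sprite into square tiles, which is my own simplification, and classify_tiles() and the sample data are hypothetical. Fully opaque tiles would go into the first pass (no blending), tiles with any partial transparency into the second pass, and fully transparent tiles would not be drawn at all.

```python
def classify_tiles(alpha, tile):
    """Split a sprite's alpha channel into opaque and blended tiles; empty tiles are dropped."""
    opaque, blended = [], []
    height, width = len(alpha), len(alpha[0])
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            values = [alpha[y][x]
                      for y in range(ty, min(ty + tile, height))
                      for x in range(tx, min(tx + tile, width))]
            if all(a == 255 for a in values):
                opaque.append((tx, ty))    # pass 1: drawn without blending
            elif any(a > 0 for a in values):
                blended.append((tx, ty))   # pass 2: drawn with blending
            # otherwise the tile is fully transparent and is skipped
    return opaque, blended


# Hypothetical 4x4 sprite: solid left half, one soft edge pixel, the rest empty.
sprite_alpha = [
    [255, 255, 128, 0],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
opaque_tiles, blended_tiles = classify_tiles(sprite_alpha, tile=2)
```

Here only one of the four 2x2 tiles needs blending, two are drawn as opaque and one is skipped entirely, which is the kind of fill rate saving being described.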
They also talk about fill rate and overdraw, and they show a performance comparison, which is astonishing. This technique is really great.
One more thing they mention: they did not use any sprite-based animation; their animation is node-based or skeletal. Even if someone tries to use tight meshes with sprite animations, it would be better to use the same mesh info for all the sprites, so some sprites might contain more or fewer alpha pixels.
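To make the shared mesh idea concrete, here is a tiny sketch in Python, again with hypothetical names and made-up frames. It assumes the shared mesh has to cover every pixel that is visible in at least one frame, so it takes the union of the frames' coverage and counts how many fully transparent pixels each frame ends up paying for.

```python
def union_mask(frames):
    """True wherever at least one animation frame has a non-transparent pixel."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[any(frame[y][x] > 0 for frame in frames) for x in range(width)]
            for y in range(height)]


def wasted_pixels(frame, mask):
    """Pixels covered by the shared mask that are fully transparent in this frame."""
    return sum(1
               for y, row in enumerate(mask)
               for x, covered in enumerate(row)
               if covered and frame[y][x] == 0)


# Two hypothetical 3x2 animation frames of the same sprite.
frame_a = [[255, 0, 0],
           [255, 255, 0]]
frame_b = [[0, 255, 0],
           [255, 255, 0]]

mask = union_mask([frame_a, frame_b])
waste_a = wasted_pixels(frame_a, mask)
waste_b = wasted_pixels(frame_b, mask)
```

Each frame wastes one pixel here, while the rightmost column is never covered at all. The more the silhouette changes between frames, the less a shared mesh saves.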
One thing I did not properly understand is what they said about layers; I don't know what they did with layers. But overall it is a good technique.
On a computer, it's likely that you'll become CPU-bound before becoming GPU-bound for 2D projects, though it all depends on one's specific use.
For that reason, it's always best to profile before doing any optimization, especially one as advanced and/or intrusive as this.