Hi,
I'm trying to use a color grading shader in my project, but when I apply the shader I get about half the frame rate compared to running with no shader. I'm not really sure how I'm supposed to apply the shader in the first place, so I tried 2 ways:
- From the config file only: instead of [Viewport] I use [Viewport@Shader] and define my shader further down in the config file (see the snippet after this list).
- Or I keep my shader in the config file, and in code I use orxViewport_GetTexture, apply that texture to an object parented to the camera, and apply the shader defined in config to this object.
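For reference, the config for the first way looks roughly like this (section names and shader code are simplified placeholders, not my exact project):

    [Viewport@Shader]
    Camera = Camera

    [Shader]
    ParamList = texture
    Code = "
    void main()
    {
      gl_FragColor = texture2D(texture, gl_TexCoord[0].xy);
    }
    "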
Without a shader I get 550 fps; if I apply the shader to the viewport in the config I get 210 fps, and if I code it with an object parented to the camera I get 260 fps.
I believe the frame rate difference between the shader applied to the viewport and the one applied to the object is that you can deactivate alpha blending on the object, while it's always active on the viewport (if that makes sense). If you put a white shader with 0.5 alpha on the viewport via the config file, you'll get your game mixed with white, whereas if you put the same white shader on an object with blending disabled, you'll get a fully white object. I've tried to turn off alpha blending on the viewport both in the config and by switching the alpha blending flags in code, but it didn't work. Maybe there's a way to get more fps in orx by switching viewport alpha blending off, or maybe it has to stay on.
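The white test shader I'm talking about is basically something like this (simplified):

    [WhiteShader]
    Code = "
    void main()
    {
      gl_FragColor = vec4(1.0, 1.0, 1.0, 0.5);
    }
    "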
The slowdown with the color grading shader is about the same as with a constant white shader, so the shader code itself isn't what slows down the game. I also tried applying a white shader to several of the tutorial examples via their config, and they also lost half their frame rate: from something like 1000 down to 500 fps.
Here are profiler screenshots of my project, in case you can find something useful:
- The 1st one is with no shader.
- The 2nd one is with the shader applied to the viewport from the config file.
- The 3rd one is with the shader on an object using the viewport texture.
The only thing I notice is that orxDisplay_Swap gets higher when the shader is there. Also, there's the hashtable remove in the last picture. The compression imgur applied makes some parts hard to read, but the red orxRenderObject says 96x; I don't think the other hard-to-read parts are important. There's just 1 more object when I use an object to hold the viewport texture and the shader.
I'm using the SVN version from last week. I can clean up the project and post an example with code if you want.
So what's the correct way to set up a color grading shader? Using [Viewport@Shader] in the config feels cleaner, but it's more efficient to make an object in front of the camera. And why does it get so slow when I apply a shader? Would I have to play with turning the rendering of objects on and off? The problem is I don't really understand what's happening, or how the viewport works under the hood. Maybe a solution could be to use orxViewport_GetTexture and make the color adjustments in software, but that feels tedious.
Thanks in advance
PS: A fun/weird thing: if you make a shader with 0.5 alpha (in order to check whether it matches what's behind), screenshots taken as PNG or TGA will have that 0.5 transparency/alpha baked into the picture. What's annoying is that if you take PNG screenshots of the profiler, they'll be hard to read because the pixels will be transparent. It makes sense to have transparency in screenshots, but these are 2 different uses of alpha, so it's a bit confusing at first.
Comments
First of all, thanks for the profiler screenshots: posting them is a nice habit and very helpful for us to determine the exact source of a performance problem.
However, in this case I already know the answer.
The issue is simply that you lose all parallelism when using the content of the viewport as an input: your machine has to wait for the whole frame to be rendered, then fetch all the content back into a texture (which is also very costly!), and finally use that as a texture input for a shader applied to a quad rendered on top of your viewport.
The best way to fix it is of course to remove the need for a fullscreen FX altogether.
The cases where that can't be done are when you have distortion that brings pixels of one object on top of another object that isn't directly behind it, if that makes sense.
If you really need the screen as input, I suggest using compositing (http://orx-project.org/wiki/en/orx/tutorials/community/iarwain/compositing). You'll still pay the cost of waiting for the scene to finish rendering, but you won't pay the cost of transferring pixels back from the display surface, as they'll already be in a texture.
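A minimal config sketch of that setup, with all the names being placeholders (the tutorial above has the real details):

    ; The scene is rendered to an intermediate texture instead of the screen
    [SceneViewport]
    Camera      = GameCamera
    TextureList = SceneTexture

    ; A fullscreen quad displaying that texture, carrying the color grading shader
    [CompositeGraphic]
    Texture = SceneTexture

    [CompositeQuad]
    Graphic    = CompositeGraphic
    ShaderList = ColorGrading

    ; A second viewport renders the quad to the screen
    [CompositeViewport]
    Camera = CompositeCamera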
Finally, you can do manual double buffering to hide the implicit synchronization cost. It works like compositing, but uses 2 separate intermediate targets/textures.
If on one frame you render to intermediate texture 1, you don't use it as a shader input during that frame but during the frame after; instead, you use the previous frame's texture as your shader input.
This can be achieved by having 2 viewports viewing the same thing, with only one enabled per frame, alternating between them. Same for the input texture: when receiving the event for the texture parameter, use the one bound to the target texture of the previous frame, etc. This of course uses more video memory, but will achieve much higher render performance.
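Here's a rough, untested sketch of the idea in code; all the names are made up and the two viewports are assumed to have been created from config beforehand:

    #include "orx.h"

    /* Two viewports viewing the same scene, each with its own target texture;
       only one of them is enabled on any given frame */
    static orxVIEWPORT *spstViewport[2];
    static orxTEXTURE  *spstTexture[2];
    static orxU32       su32Current = 0;

    /* Runs once per frame: flips which viewport is active */
    static void orxFASTCALL SwapTargets(const orxCLOCK_INFO *_pstClockInfo, void *_pContext)
    {
      su32Current ^= 1;
      orxViewport_Enable(spstViewport[su32Current], orxTRUE);
      orxViewport_Enable(spstViewport[su32Current ^ 1], orxFALSE);
    }

    /* Feeds the shader the texture that was rendered on the *previous* frame */
    static orxSTATUS orxFASTCALL ShaderParamHandler(const orxEVENT *_pstEvent)
    {
      if(_pstEvent->eID == orxSHADER_EVENT_SET_PARAM)
      {
        orxSHADER_EVENT_PAYLOAD *pstPayload = (orxSHADER_EVENT_PAYLOAD *)_pstEvent->pstPayload;

        if(orxString_Compare(pstPayload->zParamName, "texture") == 0)
        {
          pstPayload->pstValue = spstTexture[su32Current ^ 1];
        }
      }

      return orxSTATUS_SUCCESS;
    }

    /* Setup, after creating both viewports with orxViewport_CreateFromConfig():
       spstTexture[i] = orxViewport_GetTexture(spstViewport[i]);
       orxEvent_AddHandler(orxEVENT_TYPE_SHADER, ShaderParamHandler);
       orxClock_Register(orxClock_FindFirst(orx2F(-1.0f), orxCLOCK_TYPE_CORE),
                         SwapTargets, orxNULL, orxMODULE_ID_MAIN, orxCLOCK_PRIORITY_NORMAL); */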
I'd wait until your game is much more advanced before setting up anything beyond straightforward compositing (or removing the need for the screen as input), as you might end up CPU-bound in some cases, in which case waiting for the GPU might not have any big impact on your framerate.
I know it's unrelated, but what about the fact that the viewport seems to always be alpha blended? Couldn't orx run faster if there was a way to turn it off, or is that just the way it works?
So, the storage has an alpha component, but it's only there for faster pixel packing/unpacking, at the cost of some video memory.
I guess I can add an option for those who would rather use an RGB render target instead (it's already the case on Android, for example).
The blending itself is controlled per primitive (triangle strips) and is set to alpha by default, as we're in 2D, where alpha/transparency is needed on most sprites. It's not affected by the format of the target, unless the target is re-used as an input, as in your case.
That being said, you can change the blending mode, including turning it off, on a per-graphic or per-object basis.
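In config, that would look something like this (names here are placeholders):

    ; Disables blending for everything using this graphic
    [QuadGraphic]
    Texture   = pixel.png
    BlendMode = none

    ; ...or on the object itself
    [ScreenQuad]
    Graphic   = QuadGraphic
    BlendMode = none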