Hello community!
I noticed that orx performance on iPhone 4 is really bad in my game. I have 4 spawners and at most 20 active objects on the screen. These active objects have textures attached, but the textures are all in one big sprite sheet (so I assume OpenGL shouldn't have to do too much texture/context switching).
All active objects are scrolling (just by setting Speed to (100, 0, 0) at spawn time).
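For reference, here's roughly what happens at spawn time (just a sketch; "ScrollObject" is a placeholder config section name, not my actual one):

```
#include "orx.h"

/* Sketch of the spawn-time setup described above; "ScrollObject" is a
 * placeholder config section name. */
static void SpawnScrollingObject(void)
{
  orxOBJECT *pstObject;
  orxVECTOR  vSpeed;

  /* Creates an object from config (its Graphic points into the sprite sheet) */
  pstObject = orxObject_CreateFromConfig("ScrollObject");

  if(pstObject != orxNULL)
  {
    /* Gives it the constant scrolling speed of (100, 0, 0) */
    orxVector_Set(&vSpeed, orx2F(100.0f), orx2F(0.0f), orx2F(0.0f));
    orxObject_SetSpeed(pstObject, &vSpeed);
  }
}
```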
Why is the FPS so bad? It's around 45 most of the time and goes up a little bit if I zoom out my scene (which makes even more objects get rendered on screen). Might this be a fill-rate issue?
EDIT: Actually, at a certain zoom level, when all objects become relatively small on the screen, the FPS stays at a solid 60. So it only happens when the screen is filled with textures.
From your experience, have you ever had a solid 60 FPS in your iPhone games with orx?
Thanks in advance!
Alex
Comments
That sounds pretty low, indeed. With 20 objects you should obtain a solid 60 FPS, at least that's what I get on my iPod 3G.
I assume you're speaking of the release build and not the debug one.
The best way to see what's going on, CPU-wise, is to turn on the profiler (using the profiler build of orx + Render.ShowProfiler set to true).
If it's a fill-rate issue, you'll see all the time spent on the GPU, meaning the longest bar will be the orxRender_Swap() one. Ignore the orxRender_ShowProfiler() one, of course.
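If it's easier, the same thing can be done from code via the config API (a sketch; setting it in your .ini works just as well):

```
/* Sketch: turning the profiler display on from code, equivalent to
 * Render.ShowProfiler = true in the config file. */
static void EnableProfilerDisplay(void)
{
  orxConfig_PushSection("Render");
  orxConfig_SetBool("ShowProfiler", orxTRUE);
  orxConfig_PopSection();
}
```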
Let me know if you find any obvious issue.
So I rebuilt and started with the profiler. I couldn't see anything obvious, so I'm attaching some screenshots. I don't see the orxDisplay_Swap bar change much in size even when I disable/enable certain parts of the game. I know that I had a solid 60 FPS with just the background, and with the background plus two animated characters. In the profiler build, of course, the maximum FPS is around 35.
Tell me if you see anything interesting in these screenshots:
TIA,
Alex
That also means that the cost doesn't come from context switching or CPU-GPU buffer transfers, as those are part of the whole 1.95ms of orxRender_RenderViewport.
That leaves us with something really bad happening on the GPU. I'm afraid you'll have to use the new XCode4 OpenGL profiling tools to see exactly what's going on.
Really quick checks: for your background, are you disabling blending (BlendMode = none), and are you doing the same for any other non-transparent object/graphic?
Do you have much overdraw with alpha blending? How big are your textures?
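For instance, something along these lines (a sketch; "Background" is an assumed section name, and a simple BlendMode = none line in the config file does the same thing):

```
/* Sketch: disabling alpha blending for a fully opaque graphic before it gets
 * created; "Background" is an assumed config section name. */
static void DisableBackgroundBlending(void)
{
  orxConfig_PushSection("Background");
  orxConfig_SetString("BlendMode", "none");
  orxConfig_PopSection();
}
```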
The textures are 512x512 sprite sheets. I have one sprite sheet for each character (for animation) and one 512x512 for all other objects on screen. The background, however, is a separate 320x480 texture.
And no, I have almost zero overdraw; I'm cutting at the edges of the actual sprites, so overdraw always stays minimal.
Am I correct that you actually load the big texture (sprite sheet) as a GLTexture first and then use that along with texture coords to get the sprites out of the sheet? I mean, you are not creating a GLTexture for each sprite in the sheet, right? Just a guess, though.
I will try the Xcode profiler thingy and let you know.
Thanks for your time!
Alex
And yes, you're correct, every bitmap becomes a single texture stored on the GPU. We're using UV coordinates for defining sub-objects. Objects are batch rendered (indexed triangle strip) and we sort them by depth, texture, blend mode and smoothing mode first so as to minimize the number of context switches. Vertex buffers are interleaved (vertex coord, UV, color) for better locality.
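Just to illustrate that layout (the struct and field names here are mine, not the actual internal ones):

```
/* Illustration of an interleaved vertex layout (vertex coord, UV, color);
 * the struct and field names are mine, not orx's actual internal ones. */
typedef struct
{
  orxFLOAT fX, fY;    /* Vertex coordinates       */
  orxFLOAT fU, fV;    /* Texture (UV) coordinates */
  orxRGBA  stRGBA;    /* Packed vertex color      */
} InterleavedVertex;
```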
If an architecture doesn't support non-power-of-two textures, we create a power-of-two one that contains the original texture and update the UV coords accordingly.
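As an illustration of that fallback (not the actual code), this is what the padding and UV adjustment boil down to; for your 320x480 background padded into a 512x512 texture, the max UVs become 0.625 and 0.9375:

```
/* Illustration only, not the actual orx code: padding a non-power-of-two
 * texture into a power-of-two one and adjusting the max UVs accordingly. */
static orxU32 NextPowerOfTwo(orxU32 _u32Value)
{
  orxU32 u32Result = 1;

  while(u32Result < _u32Value)
  {
    u32Result <<= 1;
  }

  return u32Result;
}

static void ComputePaddedUVs(orxU32 _u32Width, orxU32 _u32Height,
                             orxFLOAT *_pfMaxU, orxFLOAT *_pfMaxV)
{
  orxU32 u32POTWidth  = NextPowerOfTwo(_u32Width);   /* 320 -> 512 */
  orxU32 u32POTHeight = NextPowerOfTwo(_u32Height);  /* 480 -> 512 */

  /* The original texture only occupies the top-left part of the padded one */
  *_pfMaxU = (orxFLOAT)_u32Width  / (orxFLOAT)u32POTWidth;   /* 320/512 = 0.625  */
  *_pfMaxV = (orxFLOAT)_u32Height / (orxFLOAT)u32POTHeight;  /* 480/512 = 0.9375 */
}
```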
Let me know if you find anything with the GL profiler that comes with XCode4, as it's not available with XCode3. I can do some regression testing over the last few months to see if I'm missing something obvious that has found its way into the codebase, probably next weekend.
And no worries for the time, I'm glad if I can help!
I also could answer my own question regarding GLTexture and binding: I can see in the trace that I only have ~7 GLTextures, which seems to be correct.
Any ideas?
EDIT: BlendMode didn't help either.
Cheers,
Alex
Yep, 7 textures isn't much at all; there are some internal ones too (screen backup for viewport/screen shader, internal font, pixel).
Mmh, there must be a way to know what the GPU is doing with some time estimation. I thought there was a tool added to XCode4 for that, I'll check that tonight.
Cheers!
EDIT: I actually had to change the Frustum sizes so they match the display size too. If I only change the display size but not the frustum, it has no effect and the FPS is still bad.
Now the question is WHY???!
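For the record, here's roughly how I'm matching the frustum to the display size now (just a sketch; "Camera" is the config name of my main camera and the 0/2 near/far values are placeholders):

```
/* Sketch of the workaround: make the camera frustum match the actual display
 * size at runtime. "Camera" and the near/far values are placeholders. */
static void MatchFrustumToDisplay(void)
{
  orxFLOAT   fScreenWidth, fScreenHeight;
  orxCAMERA *pstCamera;

  /* Asks the display plugin for the actual screen size */
  orxDisplay_GetScreenSize(&fScreenWidth, &fScreenHeight);

  pstCamera = orxCamera_Get("Camera");

  if(pstCamera != orxNULL)
  {
    orxCamera_SetFrustum(pstCamera, fScreenWidth, fScreenHeight, orx2F(0.0f), orx2F(2.0f));
  }
}
```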
HTH,
Alex
So, wait, let's recap.
Changing the display size on iOS shouldn't do anything, as orx will use the device's native resolution and replace the config values of Display.ScreenWidth/Height/Depth with the actual ones.
However, your camera frustum does matter, but I still don't understand why any scaling would kill the FPS this way!
What were your previous values for the camera frustum, and what are the ones that work? Is changing the frustum values enough to make the FPS behave nicely? If so, I'll investigate what's going on because, really, I don't understand: we simply change UV coord values to simulate scaling, that's about it. And it's done on the CPU side.
Also, let me know about the redundant calls; I'm getting curious, and it's going to be an itch I'll need to scratch before being able to get some sleep.
And this is the one which seems to give a much better FPS:
Note the Zoom on the main camera. This zoom just scales my 320x480 textures to match the 2x retina size, hence making everything look as it did before changing the frustum and display width/height. Sadly, I can't use a static Zoom for the camera because I'm calculating the zoom in code (to zoom the scene, as you remember from my other thread on this forum :P).
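The zoom itself is just something like this (a sketch; "Camera" is my camera's config name and 2.0 stands in for the value I actually compute):

```
/* Sketch of the zoom-in-code part; 2.0 stands in for the computed value. */
static void ApplyRetinaZoom(void)
{
  orxCAMERA *pstCamera = orxCamera_Get("Camera");

  if(pstCamera != orxNULL)
  {
    orxCamera_SetZoom(pstCamera, orx2F(2.0f));
  }
}
```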
I'll gather some info from the profiler and post them in 10 minutes or so.
Cheers!
Alex
Here are two screenshots which will give you an idea:
HTH,
Alex
Ah yeah, those calls. Well, actually they've been added willingly and weren't there at first. It's simply that orx provides rendering hooks for people to roll out their own rendering if need be (and that includes 3D rendering), and orx needs to make sure that the settings are set correctly when it does its own rendering.
I'm not exactly sure how to avoid that, as it'd be platform-specific and wouldn't fit well with orx's more generic philosophy.
It's not very clean, but at least the performance impact should be minimal, if any. However, I'll still see if I can't find a more elegant way of dealing with this.
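For reference, the hook is exposed through render events and looks roughly like this (a simplified sketch; please double-check the exact conventions against the orxRender documentation):

```
/* Simplified sketch of the rendering hook mentioned above; check the
 * orxRender event documentation for the exact conventions. */
static orxSTATUS orxFASTCALL RenderEventHandler(const orxEVENT *_pstEvent)
{
  orxSTATUS eResult = orxSTATUS_SUCCESS;

  if(_pstEvent->eID == orxRENDER_EVENT_OBJECT_START)
  {
    /* Custom rendering for this object could happen here; returning
       orxSTATUS_FAILURE would tell orx to skip its own rendering of it. */
  }

  return eResult;
}

/* Registered at init time with:
   orxEvent_AddHandler(orxEVENT_TYPE_RENDER, RenderEventHandler); */
```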
As for using VBOs, that sounds like a generic recommendation given the fact that we mainly push quads.
I'm more curious about the mipmapping usage line though. What does it mean in this context? Is there more info when you click it?
I noticed that changing the Display settings won't affect anything. You are right. So it's the frustum which makes a big difference on my iPhone 4, for some reason.
I took the time (just in the past 45 min) to build my game for Android 2.1+ using the non-native version. I ran it on an HTC device (I don't know exactly which one, but it's really powerful). What do you think? NO problems. 59-60 FPS at both 320x480 and 640x960; it just works. It's pretty ugly, yes, because it scales a lot to match the native resolution. But it works and gives a decent FPS.
Hope to hear any ideas/suggestions soon.
Cheers!
Alex
godexsoft wrote:
I sure hope it's that and not detecting some weird mipmaps being computed somewhere for god knows which reason.
That still baffles me. There's virtually no reason why scaling up would kill the framerate that way; as the device resolution is the same anyway, it shouldn't matter whether textures have been scaled up or not. Even stranger, by adding a 2X zoom you're basically doing the same thing the render plugin does when computing the scale factor, so in the end the vertex coords and UV coords should be exactly the same. Actually, that's a good starting point for an investigation; I'll look at it either tonight or this weekend...
I don't have access to a retina display, unfortunately, but I'll check to see if, by any chance, the simulator shows a similar issue.
That's good news. It means that it's not a more general OpenGL ES issue with the way orx has been written (the Android display plugin being based on the iOS one, I'll check for any obvious differences).
Same here!
Cheers!
Also, I reread what you posted; here's what I think:
First of all, you won't need this one, and it's probably for the best if you don't have it, as it's going to tamper with your naturally VSync-aligned frames, which already give you 60 FPS.
Ok, I have to admit that this one puzzles me. Orx uses the device's natural orientation, which is Portrait. That means that the real display size should be 640x960 and not the contrary. When you query the screen size with orxDisplay_GetScreenSize(), what do you get, these numbers (640x960) or rather the contrary (960x640)?
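Something as quick as this would tell us (just a sketch):

```
/* Quick check: log what orx reports as the physical screen size. */
static void LogScreenSize(void)
{
  orxFLOAT fScreenWidth, fScreenHeight;

  orxDisplay_GetScreenSize(&fScreenWidth, &fScreenHeight);

  orxLOG("Screen size: %g x %g", fScreenWidth, fScreenHeight);
}
```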
That also means that, with your camera settings, rendering shouldn't fill the viewport, as it's going to try to fit the camera's 960 width into the actual 640 of screen width, which should result in huge letterboxing on screen.
The height of the rendered image on screen should then be 640 * (640 / 960) = ~426 pixels out of the physical display's actual 960, hence (960 - 426) / 2 = 267-pixel-high black bars at the top and the bottom of the screen.
I tried it here with the simulator, and that's exactly what I'm getting, but somehow that's not what you're seeing, is it?
My question is: why/how did you get orx to consider the device's native orientation to be Landscape instead of Portrait? Adding something like UIInterfaceOrientation = UIInterfaceOrientationLandscapeLeft/Right shouldn't affect the way orx perceives the physical device.
I'm curious.
Thanks,
Alex
That still doesn't explain the FPS drop you experienced with a 320x480 camera, though. Without access to a physical retina device on my side, I think it's going to be hard for me to find the actual problem.
Also, I just checked in proper NPOT texture support detection for iOS; it was broken, as we were querying the GL extensions on the thread before creating its GL context, so the returned string was null. It shouldn't matter in your case, as you're using POT textures, but still.
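Roughly, the detection boils down to something like this (an illustration, not the actual plugin code), which is why it only works once a GL context is current on the calling thread:

```
#include <string.h>
#include <OpenGLES/ES1/gl.h>

/* Illustration of the kind of check described above, not the actual plugin
 * code; glGetString() only returns a valid pointer once a GL context is
 * current on the calling thread, hence the null string mentioned above. */
static int HasLimitedNPOTSupport(void)
{
  const char *zExtensions = (const char *)glGetString(GL_EXTENSIONS);

  return (zExtensions != NULL)
      && (strstr(zExtensions, "GL_APPLE_texture_2D_limited_npot") != NULL);
}
```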
I tried TapMania on my iPad. It was really nice; too bad there wasn't native iPad display support!
I'm glad you liked TapMania. There is no iPad support because I never had an iPad back in the days of TapMania development.
TapMania is open source (terrible code, though) too: sources
Cheers!
Alex