OPEN TASKS
1. Rewrite Renderer::renderWater() and Renderer::renderSurface() using Vertex Arrays.
2. More sophisticated state management
3. Sort objects/units before rendering, to minimise calls to glBindTexture()
4. Smarter texture management
5. Use all available texture units
1. Renderer::renderWater() and Renderer::renderSurface()
These functions are currently implemented with immediate mode OpenGL. Immediate mode makes lots of function calls and is very inefficient, and the more tiles you render (the further back you zoom), the more pronounced this inefficiency becomes. I'll be reorganising the map data soon to remove duplication of information between Map and Pathfinder. At that time I will also split out the data that the renderer is interested in and put it all in neat, compact, video card friendly arrays.
The immediate mode code then needs to be gutted. Instead of drawing the quads in the while loop, we build a vector of indices into the vertex array. When our array (err, vector) is built, we set up our vertex, normal, colour & texture arrays and call:
glDrawElements( GL_QUADS, 4 * numTiles, GL_UNSIGNED_SHORT, &ourIndices[0] );
Thus replacing hundreds of function calls (potentially thousands) with one.
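As a rough illustration of the index-building step, here is a minimal sketch. It assumes a layout that the renderer does not have yet: every tile owns four consecutive vertices in the arrays, so nothing is shared between neighbouring tiles. The names Index and buildQuadIndices are made up for the example, and the GL calls are only shown in comments so the snippet stands on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout: every tile owns 4 consecutive vertices in the arrays,
// so tile t uses vertices 4t .. 4t+3 (no sharing, since each tile needs its
// own texture coordinates).
typedef unsigned short Index; // matches GL_UNSIGNED_SHORT

// Appends the four corner indices of each visible tile.
// visibleTiles holds tile numbers (position of the tile in the vertex arrays).
std::vector<Index> buildQuadIndices(const std::vector<unsigned> &visibleTiles) {
    std::vector<Index> indices;
    indices.reserve(visibleTiles.size() * 4);
    for (std::size_t i = 0; i < visibleTiles.size(); ++i) {
        const unsigned first = visibleTiles[i] * 4;
        // 16-bit indices can address at most 65536 vertices.
        assert(first + 3 <= 65535u);
        for (unsigned corner = 0; corner < 4; ++corner) {
            indices.push_back(static_cast<Index>(first + corner));
        }
    }
    return indices;
}

// With the client arrays enabled and pointed at the data (glVertexPointer(),
// glNormalPointer(), glColorPointer(), glTexCoordPointer()), the draw is then:
//   glDrawElements(GL_QUADS, (GLsizei)indices.size(), GL_UNSIGNED_SHORT, &indices[0]);
```

Note that 16-bit indices cap this layout at 16384 tiles per draw call. A larger visible area would need GL_UNSIGNED_INT indices or more than one call.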
Note that this task may require elements of task 4, specifically to do with the tileset textures... but I'm not too sure at this stage what the current TextureManager does do for us.
This is what I would be doing with my level of knowledge of, and experience with, OpenGL. If any OpenGL Gurus happen to wander through this way and would like to rewrite this using even fancier VBOs or whatever new fandangled things are available, please do so!
2. What's your state?
State management in OpenGL is important. In particular, changing state variables on the 'server' can be expensive. We need to minimise state changes if possible, or at the very least keep track of the state ourselves and not set it to the value it already has. I'm not sure if modern implementations are smart enough to do this for you, but I know this has historically been an issue, so we should do it ourselves in either case.
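Keeping track of the state ourselves could look something like the sketch below. StateCache, ApplyFn and fakeApply are hypothetical names invented for this example and nothing like them exists in the code base yet. The real version would forward to glEnable()/glDisable() where the stand-in function is used.

```cpp
#include <cassert>
#include <map>

// Hypothetical wrapper: remembers the last value requested for each
// capability and only forwards real changes to the driver.
class StateCache {
public:
    typedef void (*ApplyFn)(unsigned cap, bool enabled);

    explicit StateCache(ApplyFn fn) : applyFn(fn), changes(0) {}

    void set(unsigned cap, bool enabled) {
        std::map<unsigned, bool>::iterator it = known.find(cap);
        if (it != known.end() && it->second == enabled) {
            return; // already in the requested state, skip the call
        }
        known[cap] = enabled;
        ++changes;
        applyFn(cap, enabled);
    }

    // Number of changes actually forwarded so far.
    int changeCount() const { return changes; }

private:
    ApplyFn applyFn;
    std::map<unsigned, bool> known; // capabilities not listed are "unknown"
    int changes;
};

// Stand-in so the sketch runs without a GL context. In the renderer this
// would be: if (enabled) glEnable(cap); else glDisable(cap);
static void fakeApply(unsigned, bool) {}
```

The first request for any capability always goes through, because the cache cannot know what the driver's state is until it has set it once.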
An assessment of the use of glPushAttrib()/glPopAttrib() should be performed here too, with the aim of minimising if possible.
Ordering our rendering so we push and pop as little as possible would be the goal, and if only a couple of states are changed for something, apparently it's better to just change them individually and restore them when you're done, 'by hand'.
Trying to group renderables that need the same state would be desirable, but this would conflict with the aims of task 3.
3. Form an orderly queue! By team colour and then unit type please.
Renderer::renderObjects() was where this all started. I was doing some profiling and noticed it was eating up a rather sizeable chunk of the 3D rendering time (about 40% on my system). An attempted quick-fix and some more investigating revealed the problem: glBindTexture(). Some of the default tileset objects are using 5 meshes, all with different textures, and object rendering uses one texture unit, causing 5 calls to glBindTexture() for each one rendered.
Before I discovered that some of the objects have so many textures, I tried to fix it by sorting the objects, and then rendering them in order...
So the old while loop that plucked out objects and rendered them becomes a preprocessing loop:
    vector<Vec2i> toRender;
    PosQuadIterator pqi(visibleQuad, Map::cellScale);
    // find all renderable objects...
    while(pqi.next()){
        const Vec2i &pos= pqi.getPos();
        if(map->isInside(pos)){
            Tile *t= map->getTile(Map::toTileCoords(pos));
            Object *o= t->getObject();
            // only objects on tiles this team has explored are drawn
            if(o && t->isExplored(thisTeamIndex)){
                toRender.push_back ( pos );
            }
        }
    }
    // sort them by model (key is the model pointer, value is every position using it)
    std::map<const void*, vector<Vec2i> > renderTable;
    for ( vector<Vec2i>::const_iterator it = toRender.begin(); it != toRender.end(); ++it ) {
        Object *o = map->getTile( Map::toTileCoords( *it ) )->getObject();
        assert( o && o->getModel() );
        renderTable[o->getModel()].push_back( *it );
    }
Then the rendering iterates over the renderTable map; each element is a vector of positions of objects with the same model. This was meant to reduce the calls to glBindTexture(), because the code does actually check whether the currently bound texture is the same as the new one and doesn't bind if it is (but it only does this for the base texture unit... more on that in a minute...).
It didn't reduce the calls to glBindTexture(), because some of the models were changing the texture up to five times each...
The possible solutions for objects all involve tasks 4 & 5.
Units we probably can speed up using this method, and also by not always binding the team texture, as is currently done. So renderable units need to be sorted first by team colour, then by unit type. Some units do use more than one texture, though 2 seems to be a typical max and most models seem to use only one, so this should give a noticeable improvement (if there are lots of the same unit type on screen, that is!).
We will need to add a 'lastTeamTexture' to MeshCallbackTeamColor, and check before binding textures, much like the model renderer currently does with the base texture unit and 'lastTexture' [see ModelRendererGl::renderMesh()].
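To make the ordering and the 'lastTeamTexture' check concrete, here is a small sketch. RenderUnit, renderOrder and countTeamBinds are hypothetical names for illustration only. The real code would work on Unit pointers and texture handles rather than plain ints, and would call glBindTexture() where the counter is bumped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, pared-down record of a unit queued for rendering.
struct RenderUnit {
    int teamIndex; // decides the team colour texture
    int typeId;    // decides the model (and so its textures)
};

// Orders by team first, then by unit type.
bool renderOrder(const RenderUnit &a, const RenderUnit &b) {
    if (a.teamIndex != b.teamIndex) return a.teamIndex < b.teamIndex;
    return a.typeId < b.typeId;
}

// Counts how many team texture binds a render pass would issue when the
// bind is skipped if that texture is already current ('lastTeamTexture').
int countTeamBinds(const std::vector<RenderUnit> &units) {
    int binds = 0;
    int lastTeamTexture = -1; // nothing bound yet
    for (std::size_t i = 0; i < units.size(); ++i) {
        if (units[i].teamIndex != lastTeamTexture) {
            lastTeamTexture = units[i].teamIndex; // glBindTexture() would go here
            ++binds;
        }
    }
    return binds;
}
```

Calling std::sort( units.begin(), units.end(), renderOrder ) before the render loop means the team texture is bound at most once per team, however the units were ordered to begin with.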
4. Texture Management
I shouldn't say too much here, as I'm not that familiar with what our current TextureManager does for us, but I have noticed a few things it doesn't do:
It's not grouping small textures into bigger textures. This is a must for tileset textures in order to complete task 1.
SurfaceAtlas seems to provide an interface to do just this, via addSurface(), but then it just creates a new texture for each needed tile texture, rather than grouping them all in one big texture, and setting the texture coordinates for each accordingly.
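Setting the texture coordinates for one big texture could be sketched as follows. This assumes a square atlas of equally sized tiles numbered row by row, which is an assumption made for the example and not how SurfaceAtlas is laid out today. TexRect and atlasCoords are hypothetical names.

```cpp
#include <cassert>

// Texture coordinates of one tile inside a grid atlas.
struct TexRect { float u0, v0, u1, v1; };

// Assumes a square atlas holding tilesPerRow x tilesPerRow equally sized
// tiles, numbered row by row starting at texture coordinate (0, 0).
TexRect atlasCoords(int tileIndex, int tilesPerRow) {
    assert(tilesPerRow > 0);
    assert(tileIndex >= 0 && tileIndex < tilesPerRow * tilesPerRow);
    const float step = 1.0f / tilesPerRow;
    const int col = tileIndex % tilesPerRow;
    const int row = tileIndex / tilesPerRow;
    TexRect r;
    r.u0 = col * step;
    r.v0 = row * step;
    r.u1 = r.u0 + step;
    r.v1 = r.v0 + step;
    return r;
}
```

With linear filtering or mipmaps, neighbouring tiles can bleed into each other at the edges, so a real atlas would probably need some padding around each tile.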
The same approach might overcome the problem of objects/units with multiple textures, or...
5. Texture Units
We're using one texture unit for all our model rendering... one for the fog of war, and one for shadow mapping (team colour uses the fog of war unit).
That's 3. I'm willing to wager most of the games of Glest played today are on OpenGL implementations offering more than 3 texture units. We should check how many there are, and use them ALL!
Indeed, if sufficient texture units are available, the problem of tileset objects with 5 textures is solved: use a texture unit for each, and then my ugly sorting code from above might actually be beneficial.
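A quick sketch of the bookkeeping involved, assuming the fog of war and shadow mapping units stay reserved as described above. The number of units would come from the driver, for instance via glGetIntegerv( GL_MAX_TEXTURE_UNITS, &maxUnits ) on the fixed-function path. Here it is passed in as a plain int so the snippet runs without a GL context, and reservedUnits, freeModelUnits and fitsInUnits are hypothetical names.

```cpp
#include <cassert>

// Units the text says are already taken: fog of war (shared with team
// colour) and shadow mapping.
const int reservedUnits = 2;

// How many units are left over for model textures.
int freeModelUnits(int maxUnits) {
    assert(maxUnits >= 0);
    const int freeUnits = maxUnits - reservedUnits;
    return freeUnits > 0 ? freeUnits : 0;
}

// True if every texture of a model can sit in its own unit at once.
bool fitsInUnits(int textureCount, int maxUnits) {
    return textureCount <= freeModelUnits(maxUnits);
}
```

On an implementation reporting 8 units, that leaves 6 for models, enough for the 5 texture tileset objects. With only 4 units they would not fit, and the renderer would have to fall back to rebinding.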