This is an impressive improvement! Out of curiosity, how large does the texture 
atlas end up being?

I have mostly looked at the actual drawing code (I did notice a merge conflict 
marker in world.cc though, and bunnybot probably did, too ;)). As I already 
said today on IRC, for desktop OpenGL you should really use instancing. You can 
copy the Arguments structure almost 1:1 into a buffer with per-instance data 
and use a simple vertex shader to reduce the CPU load. I'd understand if you 
don't want separate paths for OpenGL ES 2.0 though.

The next-best thing I noticed is using glMapBuffer to avoid an unnecessary 
copy. Right now you fill the vertices_ vector (first "copy"). Then glBufferData 
copies the vertex data into the buffer object (second copy). And finally the 
vertex data gets uploaded to/pulled by the GPU.

Since you know the number of vertices up-front, it would be better to use 
glBufferData(..., NULL, ...) to reserve the required amount of space, then use 
glMapBuffer and write the vertex data directly into the buffer object, thus 
avoiding the second copy. (This would also apply to 

(And now that I've written this, I notice that GLES 2.0 apparently doesn't have 
glMapBuffer... *sigh*)

Third, it would almost certainly be beneficial to use an element array. That 
would reduce the amount of vertex data you need to write by one third, and 
would also reduce the vertex shader load on the GPU by up to one third 
(Widelands almost certainly isn't shader bound, but hey... perhaps it saves 
some power). The element array can even be completely static - it just needs to 
grow as needed during application warm-up.

For the buffer handling, the following is actually best for the kind of 
streaming draw that Widelands is doing:
1. Determine a reasonably large buffer size N.
2. Use a single buffer object, initialized with glBufferData(..., N, NULL, 
GL_STREAM_DRAW);
3. As you go along drawing a frame, fill the buffer object from front to back. 
For all vertex data that you write, use glMapBufferRange to map _exactly_ the 
range that you want to write for the next draw call, then unmap, then do the 
draw call.
4. When the vertex data for the next draw call wouldn't fit in the remaining 
space of the buffer, start from the front, but (and this is VERY important, 
I've seen plenty of applications screw this up) add the 
GL_MAP_INVALIDATE_BUFFER_BIT.

What happens when you do this is that the driver magically maintains a pool of 
underlying buffers, all of size N. Whenever you start from 0 with the 
MAP_INVALIDATE_BUFFER_BIT, the driver switches the underlying buffer to one 
from the pool that is no longer in-flight. As far as I know, all desktop GL 
drivers do this.

Now you see that the patch I sent isn't actually optimal. It would be even 
better to do something like this:

GLsizeiptr bytes = items.size() * sizeof(T);

if (bytes > size_) {
  glBufferData(GL_ARRAY_BUFFER, bytes, items.data(), GL_STREAM_DRAW);
  size_ = bytes;
  filled_ = bytes;
} else {
  GLbitfield access = GL_MAP_WRITE_BIT;

  if (filled_ + bytes > size_) {
    filled_ = 0;
    access |= GL_MAP_INVALIDATE_BUFFER_BIT;
  }

  void *map = glMapBufferRange(GL_ARRAY_BUFFER, filled_, bytes, access);
  memcpy(map, items.data(), bytes);
  glUnmapBuffer(GL_ARRAY_BUFFER);

  filled_ = (filled_ + bytes + 3) & ~3; // for the memcpy a larger alignment 
might be better; the GPU doesn't really care that much as long as you're dword 
aligned
}

This is better because then the underlying buffer is always the same size and 
the re-use mechanism can work more efficiently. But then you need to check for 
support of GL_ARB_map_buffer_range, and apparently GLES 2.0 doesn't have it. 
Also, buffers don't have to be the exact same size to be re-used by the driver, 
so the impact is probably not too bad.

And of course, while all of the above should reduce CPU load, I have no idea by 
how much.
-- 
https://code.launchpad.net/~widelands-dev/widelands/use_image_cache/+merge/282106
Your team Widelands Developers is requested to review the proposed merge of 
lp:~widelands-dev/widelands/use_image_cache into lp:widelands.

_______________________________________________
Mailing list: https://launchpad.net/~widelands-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~widelands-dev
More help   : https://help.launchpad.net/ListHelp

Reply via email to