Monday, January 07, 2008

OpenGL and Threads: What's Wrong

So what's wrong with this code? The answer is that OpenGL is asynchronous, so traditional threading models tend to blow up with OpenGL for reasons that are hard to see: the APIs look synchronous, and they act synchronous often enough that it's easy to forget they aren't.

Here's the fine print:
  • Calls to OpenGL from the same thread on the same context are executed in-order.
  • Calls from multiple contexts (and multiple threads always means multiple contexts) can execute in a different order than they were issued.
  • All calls before a flush are executed before all calls after a flush, even across contexts, as long as the renderer is the same. (At least I think this is true.)
In the first example, a worker thread is preparing VBOs (buffers of geometry) and queueing them to a rendering thread that inserts them into the scene graph every now and then. But remember that both the VBO creation and its use are asynchronous, and they happen on different threads.

What ends up happening is that the VBO create code is executed after the VBO draw code, because in-order execution is not guaranteed across threads! This usually causes the driver to complain that the VBO isn't fully built. The solution is to flush on the worker thread after the VBO is created and before it is queued, which forces VBO creation to happen before any future calls.
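Here is a minimal sketch of that fix on the worker side. Everything named here (GlCalls, VboQueue, build_and_publish) is hypothetical and made up for illustration; GlCalls stands in for the real entry points so the sketch compiles without a GL context. In a real app create_vbo would wrap something like glGenBuffers, glBindBuffer and glBufferData, and flush would be glFlush, all issued on the worker's own context.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <vector>

// Hypothetical stand-in for the GL calls the worker makes, so this sketch
// builds without a GL context. Not a real OpenGL API.
struct GlCalls {
    std::function<unsigned(const std::vector<float>&)> create_vbo;
    std::function<void()> flush;
};

// Thread-safe hand-off queue from the worker thread to the render thread.
class VboQueue {
public:
    void push(unsigned vbo) {
        std::lock_guard<std::mutex> lock(mutex_);
        ready_.push(vbo);
    }

    // Returns false when nothing is ready yet; never blocks the renderer.
    bool try_pop(unsigned& vbo) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (ready_.empty()) return false;
        vbo = ready_.front();
        ready_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<unsigned> ready_;
};

// Worker side: build the VBO, flush, and only then publish it.
void build_and_publish(const GlCalls& gl,
                       const std::vector<float>& vertices,
                       VboQueue& out) {
    unsigned vbo = gl.create_vbo(vertices);
    gl.flush();     // the creation commands are submitted before the hand-off
    out.push(vbo);  // the renderer can only see the VBO after the flush
}
```

The point of the ordering is that the renderer cannot even learn the VBO exists until the flush has been issued, so any draw call it makes comes after that flush.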

In the second example, a PBO (buffer of image data) is filled on the rendering thread, then sent to a worker thread to extract and process. We have the same problem: because we're across threads, the OpenGL extract operation can execute before the fill operation. Once again, flushing after the first thread-op synchronizes us.
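The failure in both examples can be pictured with a toy model. This is only a sketch under a big simplifying assumption: each context buffers its commands privately, and the shared "GPU" sees them only when that context is flushed. ToyGpu and ToyContext are invented for illustration and are not how any real driver is written.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only. The shared "GPU" executes commands in the order they are
// submitted to it, whichever context they came from.
struct ToyGpu {
    std::vector<std::string> executed;
};

// Each context buffers its own commands until it is flushed.
struct ToyContext {
    std::vector<std::string> pending;

    void call(const std::string& command) { pending.push_back(command); }

    // Stand-in for glFlush: submit everything buffered so far, in order.
    void flush(ToyGpu& gpu) {
        gpu.executed.insert(gpu.executed.end(), pending.begin(), pending.end());
        pending.clear();
    }
};

// The renderer fills the PBO but does not flush. Here the worker's commands
// happen to be submitted first, so the extract runs before the fill.
std::vector<std::string> run_without_flush() {
    ToyGpu gpu;
    ToyContext renderer, worker;
    renderer.call("fill PBO");    // still sitting in the renderer's buffer
    worker.call("extract PBO");
    worker.flush(gpu);            // one possible submission order
    renderer.flush(gpu);
    return gpu.executed;
}

// The renderer flushes right after the fill, before handing the PBO over.
std::vector<std::string> run_with_flush() {
    ToyGpu gpu;
    ToyContext renderer, worker;
    renderer.call("fill PBO");
    renderer.flush(gpu);          // the fill is submitted before the hand-off
    worker.call("extract PBO");
    worker.flush(gpu);
    return gpu.executed;
}
```

In this model run_without_flush() ends up with the extract ahead of the fill, while run_with_flush() always gets the fill first, which is the guarantee the flush is buying us.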

Generally speaking, OpenGL isn't a great API for threading. It has very few options for synchronization - glFlush has real (and often quite negative) performance implications, and it's a very blunt tool.

There is a 'fence' operation that allows a thread to block until part of the command-stream has finished, but it isn't really useful because:
  • It's vendor-specific (all Macs, NVIDIA PCs), so it's not available everywhere. It's no fun to maintain two threading models in one app because the synchronization primitive only exists on some of the hardware.
  • It's not cross-thread! The fence is not shared between contexts, so we can't block a worker thread based on completion in a rendering thread - we have to block the renderer itself, which sucks.
In X-Plane we work around these things by trying to defer reading asynchronous results until a frame after they were requested, e.g.:

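while (1)
{
    draw_a_lot();             // render the current frame
    read_old_async_data();    // collect results that were queued last frame
    queue_new_async_data();   // issue requests to be read next frame
}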

The result is that async queries (PBO framebuffer readbacks, occlusion queries) have a full frame to complete before we ask for their results, which helps prevent blocking.
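To make the one-frame delay concrete, here is a small sketch of the same loop with the bookkeeping spelled out. PendingRequest and run_frames are hypothetical names, and no real GL query is issued: the "result" of a request is simply the number of the frame that queued it, so you can see which frame's data each read picks up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an asynchronous request (occlusion query, PBO
// readback). In this sketch its "result" is the frame that issued it.
struct PendingRequest {
    bool active;
    int issued_on_frame;
};

// Runs frame_count frames and returns, for each frame, which frame's result
// was read (-1 when nothing had been queued yet).
std::vector<int> run_frames(int frame_count) {
    std::vector<int> read_log;
    PendingRequest pending = {false, 0};
    for (int frame = 0; frame < frame_count; ++frame) {
        // draw_a_lot() would go here.

        // read_old_async_data(): collect what was queued on the last frame.
        read_log.push_back(pending.active ? pending.issued_on_frame : -1);

        // queue_new_async_data(): issue this frame's request. It is not read
        // until the next iteration, after a full frame of drawing.
        pending.active = true;
        pending.issued_on_frame = frame;
    }
    return read_log;
}
```

Every read trails its request by exactly one frame; the price is that the data is always one frame old, and the very first frame has nothing to read.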

1 comment:

  1. Soon X will allow you to bind one GL context to multiple threads, which I think means that you will be able to assume ordering, but you will have to handle synchronization yourself. In other words, GL will act like you'd expect rather than biting you :P
