Great. I'll try to quickly go over it again when you're done.

I think i found another issue or two:

In DRI2SwapBuffers() you have this code for glXSwapBuffersMscOML handling:

        ...
    } else {
        /* glXSwapBuffersMscOML could have a 0 target_msc, honor it */
        *swap_target = target_msc;
    }

Now while my previous patch was to restrictive and didn't allow to honor a 0 target_msc, this one is too liberal. It would allow a client to schedule many swaps for the same target_msc value, e.g., for target_msc = 1000:

while (1) glXSwapBuffersMscOML(dpy, drawable, 1000, 0, 0);

-> This would cause the DRM at vblank 1000 to deliver many swap- vblank events during the same vblank, which i assume can cause trouble with timestamping of the swaps and probably server responsiveness? In the pageflipped case, you'd schedule one pageflip successfully and then if i understand correctly, on the 2nd call for this vblank the pageflip ioctl would fail with something like EAGAIN or EBUSY, which would cause the ddx to fall back to blits. If i understand correctly from I830DRI2FrameEventHandler -> I830DRI2CopyRegion, blit requests get submitted to the command fifo, drmCommandNone(intel->drmSubFD, DRM_I915_GEM_THROTTLE); gets called, then DRI2SwapComplete gets called with the 'msc' value of delivery of the vblank event - in this case with 1000.

Not sure if the GEM_THROTTLE'ing somehow takes care of such issues - or the fact that the gpu command fifo must be full at some time, but is there something that would prevent many calls to DRI2SwapComplete with exactly the same 'msc' / vblank value and 'ust' timestamp despite the fact that most of the swaps won't happen at that msc but later? Would any of the throttling here - or something like a full fifo - cause the server to stall or get sluggish? I don't know enough about the implementation, but allowing to deliver many identical vblank swap events sounds problematic to me.

Maybe we could change above code to something like:

        ...
    } else {
        /* glXSwapBuffersMscOML could have a 0 target_msc, honor it */
        if (target_msc <= pPriv->last_swap_target && pPriv->swap_interval)
                target_msc = pPriv->last_swap_target + 1;
        *swap_target = target_msc;
    }

This would not constrain a given target_msc to honor the exact swap_interval setting - what my previous -wrong- patch did. For a target_msc of zero etc., i believe this updated target_msc would result in the same behaviour as the current code. It wouldn't push the target_msc into the "far future", but only make sure it doesn't "get stuck" in the past or at a fixed value in the future, basically only ensuring the "only at most one swap per msc increment" requirement of the OML_sync_control spec.

A similar error case that could be kind'a solved by the above would be wrong ordering of swaprequests:

Assume we're at msc 500, last_swap_target something smaller than e.g., 1000:

glutSolidTeapot(....);
glXSwapBuffersMscOML(dpy, drawable, 10000, 0, 0);
glXSwapBuffersMscOML(dpy, drawable, 9000, 0, 0);
glXSwapBuffersMscOML(dpy, drawable, 8000, 0, 0);
glXSwapBuffersMscOML(dpy, drawable, 7000, 0, 0);

If you follow the current code it would cause roughly the following sequence, due to scheduling vblank events for execution at 7000, 8000, 9000, 10000:

First swap with teapot at 7000, not at 10000 as mandated by the spec (which requires execution of requests in order of submission, ie. fifo order).
Successive swaps probably at 8000, 9000, 10000.

With above code it would be the more correct (less wrong?) order 10000, 10001, 10002, 10003.


Another potential related problem is that calls to DRI2ScheduleSwap() while there are already swaps pending do not throttle the client. So this...

while (1) {
        glutSolidTeapot();
        glXSwapBuffers();
}

...is fine, because the drawing in glutSolidTeapot() will require new buffers (DRI2GetBuffers() etc.) and that routine will call DRI2ThrottleClient() if swaps are pending.

This otoh...

while (1) glXSwapBuffers();

... i think won't throttle - at least i didn't find code that would do this. So you'd probably request large number of vblank events in the DRM for swaps at successive vblank's. At some point i assume the kernel would run out of memory for new vblank event structs or the list of pending events inside the DRM would be full -> drmWaitVblank () would fail with ENOMEM -> goto blitfallback; would trigger some kind of behaviour and you'd have some weirdness in display - an intermix of immediate swaps from the blitfallback and "zombie swaps" as the vblank events get scheduled.

As weird as above example looks - swapping without drawing inbetween - it is something that gets e.g., used for a quick & dirty implementation of frame-sequential stereo (with 3d shutter glasses etc.) on systems that don't support OpenGL quadbuffered stereo contexts. So its not just something totally far fetched.

So i think you should have some call to DRI2Throttleclient at the beginning of DRI2ScheduleSwap() as well for more determinstic behaviour?


I thought quite a bit about such effects while looking at your code. I think a really cool and more bullet-proof implementation of the OML_sync_control spec that would allow a client to render multiple frames ahead / have multiple swaps pending without getting into trouble with a few race-conditions in the current code is possible, but it would require a more dynamic scheduling of what happens when. Maybe we can discuss such refinements when your first iteration of the dri2 swap bits is finished and out there.

-mario




On Mar 7, 2010, at 6:10 PM, Jesse Barnes wrote:

On Sun, 7 Mar 2010 08:44:51 +0100
Mario Kleiner <[email protected]> wrote:

On Mar 5, 2010, at 6:50 PM, Jesse Barnes wrote:


Ok pushed fixes for these issues to the repo above. I know you've put a lot of time into this already, but can you take another look to make
sure I got everything?

No problem.

  Note I left the swap interval default at 1
since GLX_MESA_swap_control should let us set it to 0 to get
unthrottled behavior, which I just fixed and tested.


Agreed, that's ok. GLX_MESA_swap_control is a way to do it.

The current state looks almost good, esp. your fixes to
DRI2SwapBuffers for swap_interval == 0 and for target_msc == 0.

But a few more things:

In DRI2SwapBuffers, in the body of ...

     /* Old DDX or no swap interval, just blit */
     if (!ds->ScheduleSwap || !pPriv->swap_interval) {
        ...

... it calls DRI2SwapComplete and passes target_msc as the 'msc'
value of swap completion. Would be probably better to pass a zero
value, just as it is done in the Intel DDX when the swap doesn't
happen at all at the requested target_msc, but rather immediately due
to some problems. If ds->ScheduleSwap isn't available then the 'msc'
is basically undefined and unknown, so zero is a better return value.
In the !pPriv->swap_interval case one could call a *ds->getMSC() to
get a more meaningful return value for 'msc'. Don't know if it's
worth the effort, but a zero return would be better than 'target_msc'
imho.

Right, good idea.  That way software using the new event won't get
bogus values with old DDXes.

We need to add another patch to Mesa. Mesa translates a glXSwapBuffers
() call into a DRI2SwapBuffers() call with target_msc, divisor and
remainder all zero and DRI2SwapBuffers takes this as hint that
glXSwapBuffers behaviour is requested. However these are also legal
arguments to a glXSwapBuffersMscOML(..., ,0,0,0) call, which has to
behave differently than a glXSwapBuffers call  --> In this case,
DRI2SwapBuffers would confuse this with a glXSwapBuffers request and
do the wrong thing.

I'd propose adding this line to the __glXSwapBuffersMscOML() routine
in mesa's /src/glx/x11/glxcmds.c file:

if (target_msc == 0 && divisor == 0 && remainder == 0) remainder = 1;

-> if target_msc and divisor are zero, then the remainder is ignored
inside the DRI implementation, so the actual value of remainder
doesn't matter for swap behaviour, but can be used to disambiguate
glXSwapBuffers vs. glXSwapBuffersMsc...  for a glXSwapBuffersMscOML
(0,0,0) call.

Ok, I'll have to check the behaviors again, but this sounds reasonable.

Your deferred pPriv deletion logic looks now good to me. But if
deletion is deferred then i think almost all "public" functions
inside dri2.c should return 'BadDrawable' not only for pPriv == NULL
but also for pPriv->refCount == 0, e.g., DRI2SwapBuffers,
DRI2GetMSC, ... From the viewpoint of client code, a refCount == 0 is
the same as a destroyed drawable. For all practical purposes it is
already gone. Maybe it would make sense to consolidate the (pPriv ==
NULL || pPriv->refCount == 0) check into a macro or function,
something like DRI2IsDrawableAlive(pPriv); and then call that in most
DRI2xxx functions?

Yeah, that would make things cleaner, I'll do that.

I believe there's also another problem with the pPriv->blockedClient
logic. The implementation only keeps track of blockedClient, but not
for the reason why the client blocked. This can cause trouble in
DRI2WakeClient. Consider this example which i think would cause the
client to hang:

1. Assume current msc is 1000 and there are 2 threads in the client.

// Thread 1: Request swap at target_msc 1010:
2. glXSwapBuffersMscOML(dpy, drawable, 1010, 0, 0);

// Thread 2: Request wait until target_msc 2000:
3. glXWaitForMscOML(dpu, drawable, 2000, 0, 0, ....);

4. Thread 1: Calls XDestroyWindow(dpy, drawable);

Step 2 will schedule a swap for msc = 1010.
Step 3 will pPriv->blockedClient = client and IgnoreClient(client)
the client and schedule a wakeup at msc >= 2000.
The xconnection dpy is now blocked.

Step 4 doesn't execute yet, because the xconnection is blocked.

Now at msc = 1010 -> swap completes -> DRI2SwapComplete ->
DRI2WakeClient -> executes the elseif branch:

...
     } else if (pPriv->target_sbc == -1) {
         if (pPriv->blockedClient)
             AttendClient(pPriv->blockedClient);
         pPriv->blockedClient = NULL;
...

-> this will AttendClient() the client and release pPriv-
blockedClient, i.e., release at msc 1010 although the block was
meant to be released at msc 2000. Thread 2 still waits for a _XReply
before it can return from the glXWaitForMscOML...

Now step 4 can execute due to the "unfrozen" xconnection dpy.
DRI2DestroyDrawable() executes and DRI2FreeDrawable() gets called
becuase there are no pending flips or blockedClients anymore. pPriv
is gone.

At msc = 2000, DRI2WaitMSCComplete() gets called due to the received
vblank event. It finds that pPriv == NULL and returns immediately.

-> ProcDRI2WaitMSCReply() doesn't ever get called and thread 2 hangs
(forever?) waiting for a _XReply.


In short, i think the implementation should keep track of why a
client is blocked in pPriv->blockedClient and only unblock it for the
reason it was blocked, otherwise there are a race conditions.

Yeah, this could also happen without OML I think, if a few swaps were
queued (resulting in a block) and then a SGI_video_sync call occurred.
I'll fix it up.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: [email protected]
office: +49 (0)7071/601-1623
fax:    +49 (0)7071/601-616
www:    http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)

_______________________________________________
xorg-devel mailing list
[email protected]
http://lists.x.org/mailman/listinfo/xorg-devel

Reply via email to