D3D performance debugging report
Emanuele Oriani
emaentra at ngi.it
Sun May 1 07:34:53 CDT 2011
Indeed, I've written a spinlock using GCC extensions and used it to
replace the EnterCriticalSection call in the x11 drv file.
The lock has to be recursive, though, so I implemented a quick
(but incorrect) recursive spinlock just for running SC2, and the
difference was negligible.
The biggest issue imho is that in this case we have to call a
function... it would be great to inline all that code, but again,
probably the best thing is to limit the number of calls.
I can try a spinlock for the wined3d lock, the BKL-like one. I hope this
one doesn't have to be recursive, right?
I'm asking because with a recursive lock I'm performing an
extra syscall:
static volatile pid_t x11_lock = 0;
static volatile int x11_lock_cnt = 0;

/***********************************************************************
 *		wine_tsx11_lock   (X11DRV.@)
 */
void CDECL wine_tsx11_lock(void)
{
    // This might be expensive!
    // I don't like recursive locks for this reason!
    pid_t th_id = syscall(SYS_gettid);

    while (th_id != __sync_val_compare_and_swap(&x11_lock, 0, th_id));
    ++x11_lock_cnt;
    asm volatile("lfence" ::: "memory");
}

/***********************************************************************
 *		wine_tsx11_unlock   (X11DRV.@)
 */
void CDECL wine_tsx11_unlock(void)
{
    if(!--x11_lock_cnt)
        x11_lock=0;
    asm volatile("sfence" ::: "memory");
}
Please keep in mind this is test code, but apparently it works.
Again, the performance gain with SC2 isn't much... but I should probably
test it more thoroughly, and with other games?
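
A minimal sketch (not the code tested above) of one way the per-call
gettid syscall could be avoided: cache the thread id in a thread-local
variable, so the syscall happens once per thread rather than once per
lock. All sketch_* names are hypothetical, and it assumes Linux plus
GCC's __thread and __sync builtins. It also issues the release barrier
before clearing the owner, whereas wine_tsx11_unlock above fences after
the store.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names, for illustration only; not the winex11 code. */
static volatile pid_t sketch_lock_owner = 0;  /* 0 means unlocked */
static int sketch_lock_count = 0;             /* touched only by the owner */
static __thread pid_t sketch_cached_tid = 0;  /* per-thread cache of gettid */

/* Return this thread's id, doing the syscall only on first use. */
static inline pid_t sketch_self(void)
{
    if (!sketch_cached_tid)
        sketch_cached_tid = (pid_t)syscall(SYS_gettid);
    return sketch_cached_tid;
}

void sketch_lock(void)
{
    pid_t self = sketch_self();

    /* If we already own the lock, this is a recursive entry: no spinning. */
    if (sketch_lock_owner != self)
    {
        /* Spin until we swap 0 -> self; the builtin is a full barrier. */
        while (!__sync_bool_compare_and_swap(&sketch_lock_owner, 0, self))
            ;
    }
    ++sketch_lock_count;
}

void sketch_unlock(void)
{
    /* Only the owning thread may unlock. */
    assert(sketch_lock_owner == sketch_self());

    if (!--sketch_lock_count)
        __sync_lock_release(&sketch_lock_owner);  /* release barrier, stores 0 */
}
```

Calling sketch_lock() twice from the same thread leaves the count at 2,
and the lock is only handed back after the matching second
sketch_unlock(). Whether this measurably beats the version above would
still have to be tested.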
Let me know,
Cheers,
On 01/05/11 09:33, Stefan Dösinger wrote:
> On Saturday 30 April 2011 18:26:04 Emanuele Oriani wrote:
>> Hi Stefan,
>>
>> What do you think about using inline spinlocks (in asm code maybe) to
>> implement locks?
>> Clearly an optimized spinlock would mean different code for different
>> compilers/architectures, but shouldn't it be the best solution?
> I am usually pessimistic about hand-written assembler optimizations. You can
> give it a try, but compilers are pretty clever these days.
>
> I think trying to optimize the lock calls is a more promising way. We can't
> simply drop the ENTER_GL/LEAVE_GL calls, as you found out in SC2. We may be
> able to reduce the number of those calls by moving blocks of opengl calls
> closer together.
>
> There's also the wined3d lock, which is somewhat like the big kernel lock.
> There's room for improvement there as well, if we soften the "you must call
> wined3d under lock" rule. However the wined3d lock is the smaller problem
> compared to the X11 lock.