ntdll: Use lockfree implementation for get_cached_fd.

Sebastian Lackner sebastian at fds-team.de
Sat May 31 10:28:34 CDT 2014


First the plots! (latest wine git version) :)
https://dl.dropboxusercontent.com/u/21447213/fd-cache-patch.pdf

As Daniel Horn has already discovered, the global lock in get_cached_fd()
can have a significant performance impact, especially if

* applications do a lot of small reads/writes
* multiple threads are involved (-> they block each other)

This patch proposes a possible solution, which solves this issue by
modifying the cache only in a "safe" manner that shouldn't introduce
new race conditions. It would be nice if some more people could take a
look at it and test that it doesn't break other stuff.

My original idea was to use interlocked_cmpxchg64, but this had several
disadvantages: we don't have any implementation for PowerPC, and at
least on my computer it is still a bit slower than the lock-free approach
proposed in this patch.
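
For comparison, a 64-bit compare-and-swap variant could look roughly
like the sketch below. This is only an illustration: it uses C11
atomics as a stand-in for interlocked_cmpxchg64, and the packing of
(fd + 1, attributes) into one 64-bit word as well as the names
packed_entry, pack_entry, packed_set and packed_get are assumptions,
not code from any actual patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical packed entry: the low 32 bits hold fd + 1 (0 = not set),
 * the high 32 bits hold the attribute bits. */
static _Atomic uint64_t packed_entry;

static uint64_t pack_entry( int fd, uint32_t attrs )
{
    return ((uint64_t)attrs << 32) | (uint32_t)(fd + 1);
}

/* Publish an entry only if the slot is still empty; returns 1 on success. */
static int packed_set( int fd, uint32_t attrs )
{
    uint64_t expected = 0;
    return atomic_compare_exchange_strong( &packed_entry, &expected,
                                           pack_entry( fd, attrs ) );
}

/* Read fd and attributes with a single atomic 64-bit load.
 * Returns the fd, or -1 if the slot is not set. */
static int packed_get( uint32_t *attrs )
{
    uint64_t value = atomic_load( &packed_entry );
    *attrs = (uint32_t)(value >> 32);
    return (int)(uint32_t)value - 1;
}
```

Since both halves travel in one word, readers can never observe a
half-updated entry, but every platform then needs a 64-bit atomic
primitive.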

Explanation: The basic idea of this patch is that we have a global
counter 'fd_cache_epoch' which is incremented whenever the file
descriptor cache is updated. get_cached_fd() first reads the counter,
then the fd entry, and afterwards the counter again - if the counter has
changed, we know that another thread may have modified the content, and
we have to read it again. The memory barrier is there to ensure that
compilers don't optimize away the two reads of 'fd_cache_epoch'. Please
note that:

* the order of the write instructions when updating the content is
important - this way we ensure that threads only see either (0, *) = not
set, or a valid entry.

* I assume that reading/writing up to the pointer size can be done as an
atomic operation (threads either see the old or new value, but no
partial updates). This assumption is also used in a lot of other parts of
wine, for example here:
http://source.winehq.org/source/dlls/ntdll/sync.c#L1295
[ It would probably be safer to introduce special
interlocked_get/interlocked_set macros/functions for that, but not sure
what other people think about that... ]
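
The read/validate/retry scheme described above can be sketched as
follows. This is a minimal standalone illustration and not the code
from the attached patch: it uses C11 atomics instead of plain reads
plus a barrier, it assumes that writers are already serialized among
themselves, and the names cache_entry, cache_epoch, cache_set,
cache_remove and cache_get as well as CACHE_SIZE are made up for the
example.

```c
#include <assert.h>
#include <stdatomic.h>

#define CACHE_SIZE 64   /* hypothetical size, for illustration only */

struct cache_entry
{
    atomic_int  fd;     /* stored as fd + 1, so 0 means "not set" */
    atomic_uint attrs;  /* stand-in for whatever else the entry carries */
};

static struct cache_entry cache[CACHE_SIZE];
static atomic_uint cache_epoch;   /* plays the role of 'fd_cache_epoch' */

/* Writers are assumed to be serialized among themselves (e.g. by a lock). */
static void cache_remove( unsigned int idx )
{
    assert( idx < CACHE_SIZE );
    atomic_store( &cache[idx].fd, 0 );       /* hide the entry first */
    atomic_fetch_add( &cache_epoch, 1 );     /* then invalidate in-flight reads */
}

static void cache_set( unsigned int idx, int fd, unsigned int attrs )
{
    cache_remove( idx );                     /* readers now see (0, *) */
    atomic_store( &cache[idx].attrs, attrs );
    atomic_store( &cache[idx].fd, fd + 1 );  /* publish the fd last */
    atomic_fetch_add( &cache_epoch, 1 );
}

/* Lock-free reader: returns the fd, or -1 if the slot is not set. */
static int cache_get( unsigned int idx, unsigned int *attrs )
{
    unsigned int before, after;
    int fd;

    assert( idx < CACHE_SIZE );
    do
    {
        before = atomic_load( &cache_epoch );
        fd     = atomic_load( &cache[idx].fd );
        *attrs = atomic_load( &cache[idx].attrs );
        after  = atomic_load( &cache_epoch );
    } while (before != after);               /* cache changed: read again */

    return fd - 1;
}
```

A reader that returns a non-zero fd together with attributes from a
later update must have spanned at least one epoch increment, so it
retries instead of returning a mixed entry. The sketch ignores counter
wrap-around.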

For those that want to try out the performance difference: You can use
this program for testing: http://ix.io/cIz - This is also the program
which was used to create the plots at the beginning.

---
 dlls/ntdll/server.c |   57 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 17 deletions(-)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-ntdll-Use-lockfree-implementation-for-get_cached_fd.patch
Type: text/x-patch
Size: 4691 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-patches/attachments/20140531/f6e1f906/attachment-0001.bin>

