[PATCH 0/5] wined3d: Rework resource fencing (v2).

Stefan Dösinger stefan at codeweavers.com
Wed Mar 9 04:48:28 CST 2022


Changes in version 2 of this patchset:

*) Rename resource_acquire to resource_reference.
*) Simplify the wrap-around logic (thanks Jan!).

Some updated benchmarking information: Keeping a separate counter for fencing instead of re-using
head and tail has a measurable performance impact in my draw overhead microbenchmark.

In World of Tanks the full 32 bit head/tail numbers wrap around within 3-4 minutes rather than a
few hours as I concluded earlier from my microbenchmarks. In a way this is welcome - the wrap-around
logic is actually used rather than untested dead code. It might make the phantom waits described in
patch 2 a bit more likely though. Should this become an issue I believe we can change head and tail
to SIZE_T or ULONGLONG. I could not measure any performance impact of a 64 bit counter vs a 32 bit
counter. I tested it in a 32 bit client on a 64 bit CPU. I don't have a multicore 32 bit CPU available
for testing in a pure 32 bit setup.
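
As a rough back-of-the-envelope check of the 64 bit option: a 64 bit counter
advancing at the same rate as the 32 bit one takes 2^32 times as long to wrap.
The sketch below only does that arithmetic. The helper names are hypothetical
and the 180 to 240 second inputs are simply the "3-4 minutes" quoted above.

```c
/* Seconds until a 64 bit counter wraps, given the observed wrap period t32 of
 * a 32 bit counter advancing at the same rate:
 *   rate = 2^32 / t32, so 2^64 / rate = 2^32 * t32. */
static double wrap_seconds_64(double wrap_seconds_32)
{
    return wrap_seconds_32 * 4294967296.0; /* 2^32, exactly representable */
}

/* Convert seconds to (Julian) years for a human-readable figure. */
static double seconds_to_years(double seconds)
{
    return seconds / (365.25 * 24.0 * 60.0 * 60.0);
}
```

With a 32 bit wrap period of 180 to 240 seconds, seconds_to_years(wrap_seconds_64(t))
comes out at roughly 24,000 to 33,000 years, so a true 64 bit counter would
not wrap in practice.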

This is the patchset described in https://www.winehq.org/pipermail/wine-devel/2022-January/204020.html .
It simplifies and speeds up d3d resource tracking in a few ways:

*) Completely remove any burden on the CS thread.
*) Replace interlocked ops on the client thread with a plain assignment.
*) Piggy-back onto the queue's head and tail counters, which we already
   increment with interlocked ops.
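
The idea in the list above can be modelled with a small, single-threaded toy.
This is a sketch under assumptions, not the code from the patches: every name
(toy_queue, toy_resource, access_time and the functions) is hypothetical, and
the real queue updates head and tail with interlocked operations on two
threads.

```c
#include <stdint.h>

/* Toy model, not wined3d code: all names here are hypothetical. */
struct toy_queue
{
    uint32_t head; /* advanced by the client thread when it submits */
    uint32_t tail; /* advanced by the CS thread when it finishes a command */
};

struct toy_resource
{
    uint32_t access_time; /* queue head at the time of the last reference */
};

/* Client thread: a plain store, no interlocked operation on the resource. */
static void toy_resource_reference(struct toy_resource *res, const struct toy_queue *q)
{
    res->access_time = q->head;
}

/* Client thread: publish a command of the given size. A real queue would
 * use an interlocked store here; the toy model is single threaded. */
static void toy_queue_submit(struct toy_queue *q, uint32_t size)
{
    q->head += size;
}

/* CS thread: retire a command of the given size. Nothing is done per
 * resource, only the queue's own counter moves. */
static void toy_queue_retire(struct toy_queue *q, uint32_t size)
{
    q->tail += size;
}

/* The resource is busy while access_time lies in the in-flight window
 * [tail, head). Unsigned subtraction keeps this valid across wrap-around. */
static int toy_resource_is_busy(const struct toy_resource *res, const struct toy_queue *q)
{
    return (uint32_t)(res->access_time - q->tail) < (uint32_t)(q->head - q->tail);
}
```

In this toy, a resource referenced just before a submit reads as busy until
the tail moves past the recorded head value. A very stale access_time can fall
back into the window once the counters have wrapped, in which case an idle
resource would look busy for a while.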

I tested the impact with a microbenchmark:
https://github.com/stefand/perftest/blob/main/resource_tracking_d3d11/resource_tracking_d3d11.cpp

Depending on the CPU it doubles or triples draw speed in that microbenchmark. In real games the
effect is much less pronounced, but I do see about a 2% gain in World of Tanks. I also see a gain
in Rocket League, but only if I hack away other known issues with Rocket League (UpdateSubresource
in particular).

I have further improvements to resource tracking in mind that can be done on top of these patches:
*) Separate read and write access times.
*) Remove draw and compute tracking for d3d10+ clients and only track staging resources.

Matteo had some ideas to make the queue multi-writer thread safe to further reduce the use of
wined3d_cs. This patchset makes this a bit more complicated because the head value cannot be inferred
from the return value of require_space() and thus needs to be passed around separately to submit().
This can be done either with thread local storage or via a separate parameter to require_space() and
submit().
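
To illustrate the "separate parameter" option, here is a minimal sketch. The
signatures are hypothetical and are not wined3d's actual require_space() and
submit(). It assumes a power-of-two ring buffer whose head and tail are
free-running counters, so the returned pointer only encodes the low bits of
head and the full value has to travel through an out parameter.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 256u /* power of two, so masking yields the offset */

/* Toy model, not wined3d code: all names here are hypothetical. */
struct toy_ring
{
    uint32_t head; /* free-running; only the low bits index into data */
    uint32_t tail;
    uint8_t data[TOY_RING_SIZE];
};

/* Returns a pointer for the caller to fill and reports, through new_head,
 * the counter value to publish afterwards. Returns NULL if the ring lacks
 * space or the packet would straddle the end of the buffer (a real queue
 * would pad and wrap instead; that is left out to keep the sketch short). */
static void *toy_require_space(struct toy_ring *r, uint32_t size, uint32_t *new_head)
{
    uint32_t used = r->head - r->tail;
    uint32_t offset = r->head & (TOY_RING_SIZE - 1);

    if (size > TOY_RING_SIZE - used || size > TOY_RING_SIZE - offset)
        return NULL;

    *new_head = r->head + size;
    return &r->data[offset];
}

/* Publishes the packet. The full counter value is passed in explicitly
 * because it cannot be recovered from the data pointer alone. */
static void toy_submit(struct toy_ring *r, uint32_t new_head)
{
    r->head = new_head;
}
```

The thread local storage variant would keep new_head in a per-thread variable
between the two calls instead of widening the signatures.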

Stefan Dösinger (5):
  wined3d: Use extra bits in the queue head and tail counters.
  wined3d: Use the default queue index for resource fencing.
  wined3d: Remove the no-op wined3d_resource_release.
  wined3d: Remove the resource_acquire call in resource_cleanup.
  wined3d: Rename resource_acquire to resource_reference.

 dlls/wined3d/cs.c              | 276 +++++++++------------------------
 dlls/wined3d/resource.c        |   2 -
 dlls/wined3d/wined3d_private.h |  68 ++++++--
 3 files changed, 123 insertions(+), 223 deletions(-)

-- 
2.34.1



