[PATCH 0/5] wined3d: Rework resource fencing (v2).
Stefan Dösinger
stefan at codeweavers.com
Wed Mar 9 04:48:28 CST 2022
Changes in version 2 of this patchset:
*) Rename resource_acquire to resource_reference.
*) Simplify the wrap-around logic (thanks Jan!).
Some updated benchmarking information: Keeping a separate counter for fencing instead of re-using
head and tail has a measurable performance impact in my draw overhead microbenchmark.
In World of Tanks the full 32 bit head/tail numbers wrap around within 3-4 minutes rather than a
few hours as I concluded earlier from my microbenchmarks. In a way this is welcome - the wrap-around
logic is actually used rather than untested dead code. It might make the phantom waits described in
patch 2 a bit more likely though. Should this become an issue I believe we can change head and tail
to SIZE_T or ULONGLONG. I could not measure any performance impact of a 64 bit counter vs a 32 bit
counter. I tested it in a 32 bit client on a 64 bit CPU. I don't have a multicore 32 bit CPU available
for testing in a pure 32 bit setup.
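
For readers unfamiliar with wrap-around comparisons, here is a minimal sketch of how two free-running 32 bit counters can be compared safely. It is not taken from the patches: counter_reached() is a hypothetical helper and the half-range window is an assumption. It also shows one way a spurious wait could arise: a recorded value more than 2^31 behind the tail is misread as still pending. Whether that is the same effect as the phantom waits in patch 2 is a guess.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the patchset: returns true once the
 * consumer position 'tail' has reached or passed the recorded position
 * 'fence'. Both are treated as free-running 32 bit counters. */
static bool counter_reached(uint32_t tail, uint32_t fence)
{
    /* Unsigned subtraction is modulo 2^32, so the difference stays small
     * across a wrap as long as the values are less than 2^31 apart. */
    return (uint32_t)(tail - fence) < 0x80000000u;
}
```

With 64 bit counters the same comparison would apply, but the window would be too large to be exceeded in practice.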
This is the patchset described in https://www.winehq.org/pipermail/wine-devel/2022-January/204020.html .
It simplifies and speeds up d3d resource tracking in a few ways:
*) Completely remove any burden on the CS thread.
*) Replace interlocked ops on the client thread with a plain assignment.
*) Piggy-back onto the queue's head and tail counters, which we already
increment with interlocked ops.
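
The three points above can be illustrated with a simplified sketch. It is not the wined3d code: the struct and function names are hypothetical, the counters count commands rather than bytes, C11 atomics stand in for the interlocked ops, and a single client thread is assumed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the command queue and a tracked resource. */
struct fence_queue
{
    _Atomic uint32_t head; /* advanced by the client thread */
    _Atomic uint32_t tail; /* advanced by the CS thread */
};

struct fence_resource
{
    uint32_t access_time; /* head value of the last command using it */
};

/* Client thread: queue one command that uses the resource. The resource
 * is tagged with a plain assignment; only the head update is atomic. */
static void fence_queue_submit(struct fence_queue *q, struct fence_resource *r)
{
    uint32_t new_head = atomic_load(&q->head) + 1;

    r->access_time = new_head;
    atomic_store(&q->head, new_head);
}

/* CS thread: finish one command. No per-resource work is needed here. */
static void fence_queue_execute_one(struct fence_queue *q)
{
    atomic_fetch_add(&q->tail, 1);
}

/* Client thread: the resource is idle once tail has passed its tag. */
static bool fence_resource_idle(struct fence_queue *q, const struct fence_resource *r)
{
    return (uint32_t)(atomic_load(&q->tail) - r->access_time) < 0x80000000u;
}
```

In this sketch a client that wants to map a resource would poll fence_resource_idle() instead of waiting on a per-resource counter that the CS thread has to decrement.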
I tested the impact with a microbenchmark:
https://github.com/stefand/perftest/blob/main/resource_tracking_d3d11/resource_tracking_d3d11.cpp
Depending on the CPU it doubles or triples draw speed in that microbenchmark. In real games the
effect is much less pronounced, but I do see about a 2% gain in World of Tanks. I also see a gain
in Rocket League, but only if I hack away other known issues with Rocket League (UpdateSubResource
in particular).
I have further improvements to resource tracking in mind that can be done on top of these patches:
*) Separate read and write access times.
*) Remove draw and compute tracking for d3d10+ clients and only track staging resources.
Matteo had some ideas to make the queue multi-writer thread safe to further reduce the use of
wined3d_cs. This patchset makes this a bit more complicated because the head value cannot be inferred
from the return value of require_space() and thus needs to be passed around separately to submit().
This can be done either with thread local storage or via a separate parameter to require_space() and
submit().
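
A toy sketch of the separate-parameter variant follows. The names and signatures are hypothetical and do not match the real require_space() and submit(). It assumes that head is a free-running counter whose low bits select the buffer offset, and it ignores queue-full checks and packets that straddle the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_QUEUE_SIZE 64u /* hypothetical size, must be a power of two */

struct ring_queue
{
    uint32_t head; /* free-running, not reduced modulo the size */
    uint8_t data[RING_QUEUE_SIZE];
};

/* Reserve 'size' bytes. The returned pointer only encodes the low bits
 * of head, so the full head value to publish is reported via head_out. */
static void *ring_require_space(struct ring_queue *q, uint32_t size, uint32_t *head_out)
{
    void *ptr = &q->data[q->head & (RING_QUEUE_SIZE - 1)];

    *head_out = q->head + size;
    return ptr;
}

/* Publish the reserved space using the head value handed out above. */
static void ring_submit(struct ring_queue *q, uint32_t new_head)
{
    q->head = new_head;
}
```

With head at 0x100 the returned pointer is the start of the buffer, the same as for head 0, which is why the caller has to carry the head value separately.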
Stefan Dösinger (5):
wined3d: Use extra bits in the queue head and tail counters.
wined3d: Use the default queue index for resource fencing.
wined3d: Remove the no-op wined3d_resource_release.
wined3d: Remove the resource_acquire call in resource_cleanup.
wined3d: Rename resource_acquire to resource_reference.
dlls/wined3d/cs.c | 276 +++++++++------------------------
dlls/wined3d/resource.c | 2 -
dlls/wined3d/wined3d_private.h | 68 ++++++--
3 files changed, 123 insertions(+), 223 deletions(-)
--
2.34.1