wined3d performance patches
Stefan Dösinger
stefandoesinger at gmx.at
Mon Jun 6 10:55:43 CDT 2011
Hi,
This is intended mostly for the other d3d developers, but since we have quite
a number of them now, individual CCs are a lot of work :-)
I attached the patches I currently have in my tree to give an update on what
I've been working on recently. The main aim of those patches is to reduce draw
overhead a bit, thus improving game performance. The patches need some
cleanup, but for that I first need a patch Matteo is working on.
Feedback is welcome. I'm also interested in test results, e.g. whether the
changes break a game, or what the performance impact is. If those patches
yield a 5% performance increase I'll be happy.
Patches 1-3: Mostly unrelated. I haven't sent them yet because patch 3 breaks
Unigine Heaven, and patches 1 and 2 make little sense without 3.
Patch 4: This removes a hack for a driver bug workaround. I have to do more
testing on my old machines to find out if the bug is really fixed in newer
Nvidia drivers.
Patches 5, 6: They keep track of changes to the framebuffer setup so we don't
have to run through the code that figures out which FBO to bind every draw.
Patch 5 gets rid of the ordering assumption. Patch 6 applies the FBO only when
needed.
They aren't ready yet. In patch 6 the FBO may have to be reapplied when the
pixel shader changes. To implement that I need some draw buffer tracking
infrastructure Matteo is working on. Clears can also be integrated;
fbo-clear.diff is a half-baked attempt to do this. I dropped it when I realized
I was duplicating Matteo's work. After that I have to double-check that I took
care of all situations where the FBO may have to be updated.
Furthermore, Matteo says that not calling context_apply_draw_buffers every
time framebuffer() is run is a noticeable performance improvement too. Matteo,
did you test this with just patch 0005, or both 0005 and 0006?
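A minimal sketch of the idea behind patches 5 and 6, not the actual wined3d
code: changes to the framebuffer setup set a dirty flag, and the draw path only
runs the expensive "figure out which FBO to bind" step when that flag is set.
All names here (fb_state, fb_apply, draw, demo_apply_count) are hypothetical
and chosen for illustration only.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the framebuffer setup; not a wined3d structure. */
struct fb_state
{
    unsigned int render_target;  /* placeholder for a render target handle */
    unsigned int depth_stencil;  /* placeholder for a depth/stencil handle */
    bool dirty;                  /* setup changed since the last apply */
    unsigned int apply_count;    /* how often the expensive path ran */
};

static void fb_set_render_target(struct fb_state *fb, unsigned int rt)
{
    if (fb->render_target == rt) return;  /* redundant set: nothing to do */
    fb->render_target = rt;
    fb->dirty = true;
}

static void fb_set_depth_stencil(struct fb_state *fb, unsigned int ds)
{
    if (fb->depth_stencil == ds) return;
    fb->depth_stencil = ds;
    fb->dirty = true;
}

/* Stands in for the FBO lookup and bind that used to run on every draw. */
static void fb_apply(struct fb_state *fb)
{
    ++fb->apply_count;
    fb->dirty = false;
}

static void draw(struct fb_state *fb)
{
    if (fb->dirty) fb_apply(fb);  /* apply the FBO only when needed */
    /* the actual draw call would be issued here */
}

/* Runs six draws with three real state changes and returns the apply count. */
unsigned int demo_apply_count(void)
{
    struct fb_state fb = {0, 0, true, 0};  /* start dirty: nothing bound yet */

    draw(&fb);
    draw(&fb);
    fb_set_render_target(&fb, 1);
    draw(&fb);
    draw(&fb);
    fb_set_render_target(&fb, 1);  /* same target again, stays clean */
    draw(&fb);
    fb_set_depth_stencil(&fb, 2);
    draw(&fb);

    return fb.apply_count;
}
```

In this toy sequence the expensive path runs three times for six draws. The
sketch ignores the cases mentioned above where the FBO may have to be reapplied
for other reasons, such as a pixel shader change or a clear.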
Patch 0007: Sampler map optimization. It has a lengthy description in the
patch file.
Patch 0008: A tiny fix, it results in a pretty small improvement on OSX. On
Linux+Nvidia it is not noticeable.
Patch 0009: At first I tried to skip the render target dirtification entirely
via a flag in the d3ddevice, but it was pretty tricky and ugly. Just making it
cheaper gets us about two thirds of the way too. (Draw overhead tester
performance without this: 259 fps. Completely disabling the dirtification calls
via a hack: 275 fps. With this patch: 269 fps, i.e. 10 of the possible 16 fps.)
Patch 0010: An unrelated cleanup.
Patches 11, 12: Preparation for including clears in the fbo dirtification
patches. See fbo-clear.diff.
More work on performance is obviously required, for example:
*) Separate vertex declaration, vertex shader and pixel shader states
*) Speed up sampler preloading. This will be easier once we have a tree-like
state structure.
*) Write more tests for other common operations, like clears, blits, shader
changes, texture changes, vertex buffer changes, dynamic resource loading
*) Test our shaders' GPU-side execution performance
*) See if we can do something about locking
*) Isolate bottlenecks in the GPU drivers and get them fixed.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patches.tar.bz2
Type: application/x-bzip-compressed-tar
Size: 13345 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20110606/7f79909e/attachment-0001.bin>