D3D performance debugging report
Stefan Dösinger
stefan at codeweavers.com
Wed Apr 27 12:27:17 CDT 2011
Hi,
I spent a few hours debugging wined3d performance today. No, I found no magic
fix for the slowness, just some semi-usable data.
First I wrote a hacky patch to avoid redundant FBO applications. This gave a
tiny, tiny performance increase, see
http://www.winehq.org/pipermail/wine-devel/2011-April/089832.html.
The main investigation concerned redundant shader applications. The aim was to
find out how many of our glBindProgramARB calls are re-binding the same
program, and how much this costs. Depending on the game, between 20% and 90% of
all BindProgram calls are redundant. I'll attach my debug hack so others can
test their own apps. I used ARB shaders for testing because they can apply
vertex and fragment programs separately.
This brings up two questions:
(a) How much does this cost?
(b) Why does this happen?
The costs: In my draw overhead tester, hacking out the redundant apply calls
improved performance a lot, from about 101 fps to 157 fps. The biggest part of
that is the GL calls themselves. With only the GL calls removed and the
remaining shader logic still in place I get 144 fps.
Unfortunately this does not translate to any performance gains in real apps. I
tried to filter out the redundant apply calls in the simplest way possible:
Track the current value per wined3d_context and check before calling
glBindProgramARB. This gave the 144 fps in the draw overhead tester, but no
measurable increase in any other apps (I tested StarCraft 2, HL2, Team Fortress
2, World in Conflict and a few others).
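
The per-context filter described above can be sketched roughly as follows. This
is not the attached patch, only a minimal illustration: struct bind_cache,
bind_program_cached() and fake_bind_program() are made-up names, and the stub
stands in for glBindProgramARB so the sketch runs without a GL context.

```c
#include <assert.h>

/* Hypothetical targets; the real code would use the ARB vertex/fragment
 * program targets passed to glBindProgramARB. */
enum program_target { TARGET_VERTEX = 0, TARGET_FRAGMENT = 1, TARGET_COUNT };

/* Hypothetical per-context cache of the last program bound per target. */
struct bind_cache {
    unsigned int bound[TARGET_COUNT]; /* last program id bound per target */
    int valid[TARGET_COUNT];          /* 0 until the first bind on this context */
    unsigned long gl_calls;           /* binds actually issued */
    unsigned long skipped;            /* redundant binds filtered out */
};

/* Stub standing in for the GL call; it only counts invocations. */
static void fake_bind_program(struct bind_cache *cache,
                              enum program_target target, unsigned int id)
{
    (void)target;
    (void)id;
    cache->gl_calls++;
}

/* Issue the bind only if the requested program differs from the cached one. */
static void bind_program_cached(struct bind_cache *cache,
                                enum program_target target, unsigned int id)
{
    assert(target < TARGET_COUNT);
    if (cache->valid[target] && cache->bound[target] == id) {
        cache->skipped++;
        return;
    }
    fake_bind_program(cache, target, id);
    cache->bound[target] = id;
    cache->valid[target] = 1;
}
```

The skipped / (skipped + gl_calls) ratio is the kind of number behind the 20% to
90% figure; the cache has to live per context because each GL context has its
own program bindings.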
Given the number of redundant apply calls and their cost in the draw overhead
tester I would have expected at least some improvement. Certainly not a 50%
performance increase (the draw overhead tester performs no shader changes at all
in the draw loop), but at least a 2-3% gain. So far I have no explanation why
I didn't see that.
But why do those redundant apply calls happen? It seems like the state
dirtification comes all the way from the stream sources and/or vertex
declaration. STREAMSRC is linked to VDECL, which is linked to VERTEXSHADER,
which in turn reapplies the pixel shader. This means redundant vertex and
pixel shader applications. Separating those states will be a major challenge.
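
To make the chain concrete, here is a toy model of the linkage. It is a
simplification and not how the wined3d state table is actually laid out: the
enum, the linked_to table and dirty_mask_for() are hypothetical names, and each
state is assumed to have at most one linked state.

```c
#include <assert.h>

/* Hypothetical state ids, one per state named in the text. */
enum state {
    STATE_STREAMSRC,
    STATE_VDECL,
    STATE_VERTEXSHADER,
    STATE_PIXELSHADER,
    STATE_COUNT
};

/* linked_to[s] is the state that gets reapplied when s is dirtied;
 * STATE_COUNT means "no further link". */
static const enum state linked_to[STATE_COUNT] = {
    [STATE_STREAMSRC]    = STATE_VDECL,
    [STATE_VDECL]        = STATE_VERTEXSHADER,
    [STATE_VERTEXSHADER] = STATE_PIXELSHADER,
    [STATE_PIXELSHADER]  = STATE_COUNT,
};

/* Follow the links from the changed state and return a bitmask of every
 * state that ends up being reapplied. */
static unsigned int dirty_mask_for(enum state changed)
{
    unsigned int mask = 0;
    enum state s = changed;

    assert(changed < STATE_COUNT);
    while (s != STATE_COUNT) {
        mask |= 1u << s;
        s = linked_to[s];
    }
    return mask;
}
```

In this model a plain stream source change dirties all four states, which is
where the redundant vertex and pixel shader applications come from.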
The vdecl<->vshader link shouldn't be needed any more, except in rare cases
where GL_ARB_vertex_array_bgra is not supported and the application switches
one attribute from D3DDECL_D3DCOLOR to a non-d3dcolor attribute. If the vertex
shader changes we still have to reparse the vertex declaration and reapply the
stream sources because the vshader determines the stream numbers. Maybe we can
reduce the number of times this happens by ordering stream usages and indices
to make sure shaders with compatible input get the same stream ordering.
vdecl and streamsrc are closely related. If the vdecl is changed we have to
reapply the stream sources. The other way around shouldn't cause problems
though. When a stream source changes there's no need to reapply any streams
other than the changed ones, and there's no need to reapply the vertex shader.
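
One possible shape for this, purely as a sketch: keep a per-stream dirty
bitmask, let a vdecl change dirty every stream, and let a stream source change
dirty only its own bit. The names here are hypothetical, MAX_STREAMS = 16 is an
assumption, and the "applied" counter replaces the real attribute setup.

```c
#include <assert.h>

#define MAX_STREAMS 16 /* assumed stream limit for this sketch */

/* Hypothetical tracking structure. */
struct stream_state {
    unsigned int dirty_streams; /* bit i set = stream i needs reapplying */
    unsigned long applied;      /* number of stream applications, for the demo */
};

/* A stream source change only dirties that one stream. */
static void stream_source_changed(struct stream_state *s, unsigned int idx)
{
    assert(idx < MAX_STREAMS);
    s->dirty_streams |= 1u << idx;
}

/* A vertex declaration change forces every stream to be reapplied. */
static void vertex_declaration_changed(struct stream_state *s)
{
    s->dirty_streams = (1u << MAX_STREAMS) - 1;
}

/* Reapply only the dirty streams, then clear the mask. */
static void apply_streams(struct stream_state *s)
{
    for (unsigned int i = 0; i < MAX_STREAMS; ++i) {
        if (s->dirty_streams & (1u << i))
            s->applied++; /* real code would set up the attribute pointers here */
    }
    s->dirty_streams = 0;
}
```

Note that nothing in stream_source_changed() touches the vertex shader state,
which is the decoupling argued for above.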
The vertex and pixel shader are linked for a few reasons: The shader backend
API offers only a function to set both. Basic GLSL only offers a function to set
both at once (GL_ARB_separate_shader_objects changes that). And even in ARB the
pixel shader input may require some changes in the vertex shader output to get
Shader Model 3.0 varyings right.
The shader backend API can be changed, but it has to be done in a way that
doesn't hurt GLSL without ARB_separate_shader_objects. If we have classic GLSL
we have to keep the link. With ARB we can conditionally reapply the vertex
shader if the ps_input_signature is changed.
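
A rough sketch of what the conditional reapply for the ARB case could look like.
The struct layout, SIG_REGS and select_pixel_shader() are invented for
illustration; the only thing taken from the above is the idea of comparing the
pixel shader input signature before touching the vertex shader.

```c
#include <assert.h>
#include <string.h>

#define SIG_REGS 12 /* hypothetical number of input registers */

/* Hypothetical signature: one usage/index pair per input register.
 * Only unsigned char members, so there is no padding and memcmp is safe. */
struct ps_input_signature {
    unsigned char usage[SIG_REGS];
    unsigned char usage_idx[SIG_REGS];
};

/* Hypothetical per-context shader tracking. */
struct shader_ctx {
    struct ps_input_signature current_sig; /* signature the bound vshader matches */
    int have_sig;                          /* 0 until the first pixel shader is set */
    unsigned long vs_applies;              /* vertex shader reapplications */
    unsigned long ps_applies;              /* pixel shader applications */
};

/* Apply a new pixel shader; reapply the vertex shader only if the input
 * signature differs from the one the current vertex shader was set up for. */
static void select_pixel_shader(struct shader_ctx *ctx,
                                const struct ps_input_signature *sig)
{
    ctx->ps_applies++;
    if (!ctx->have_sig || memcmp(&ctx->current_sig, sig, sizeof(*sig)) != 0) {
        ctx->current_sig = *sig;
        ctx->have_sig = 1;
        ctx->vs_applies++;
    }
}
```

With classic GLSL this shortcut does not apply, since both shaders are part of
one linked program and have to be set together.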
To complicate matters, there are additional states that affect the shaders, such
as fog, textures and clipping. We don't keep track of those dependencies.
So it's a lot of work to clean up these state dependencies and we don't know
how much it'll gain us :-(
Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shaderdebug.diff
Type: text/x-patch
Size: 8099 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20110427/1c49c883/attachment.bin>