WineD3D: Use the shader backend to enable / disable atifs and nvts

Sun Apr 13 14:26:19 CDT 2008

On Saturday, 12 April 2008 04:27:01, Ivan Gyurdiev wrote:
> Stefan Dösinger wrote:
> > Alexandre didn't commit the patch, I think we should come to an agreement
> > on this issue, otherwise it is going to come up again and again.
>
> The fundamental issue is pretty straightforward - not sure why it's so
> difficult to come to an agreement.
>
>     - You want to mix and match vertex and fragment GL backends
>     - The only maintainable way to do that is to define an interface
> between vertex and fragment objects
I certainly see the advantages of a constrained interface. I just didn't see
(and still don't see) how it can be designed cleanly without greatly
limiting the functionality of the pipeline / shader implementation.

I discussed the topic with Henri on IRC again (@Henri: please correct me if I
misunderstood you), and he explained that his plan is to make GLSL vertex
shaders, GLSL vertex replacement, GLSL fragment shaders and GLSL fragment
replacement one object with different interfaces. That way we can have
backchannel communication between the various interface implementations
(e.g. flags or private data), which keeps everything flexible. It's not
exactly nice (backchannel communication somewhat defeats the point of
interfaces), but my design has ugliness as well, so I can live with that.

What's most important to me about this is that we don't have to close any bug 
as WONTFIX due to design constraints. So I can stop feeling strongly against 
splitting up the interface since the implementations remain the same.

The main issue I see with not splitting up the interfaces is that it is
unclear which code in state.c (or the shader backend's state table) changes
which GL state. If state_something changes both vertex and fragment GL
states, I can't override it properly in the ATIFS/NVRC code without messing
with the vertex side as well.

I have a few remaining issues though:
-> With the state table split up there are now 5 "root" states which have
to be polled for changes in the CTXUSAGE_DRAWPRIM setting. The
GL_TEXTURE_SHADER_NV and GL_FRAGMENT_SHADER_ATI states also need to be
polled for enabling/disabling. Is there no way to avoid that?

-> Some communication issues between pipeline parts still remain, see
below

-> Increased state dirtification complexity: now each Set*State has to find
out which part of the pipeline it has to dirtify (a switch-case statement or
probably a table lookup), and has to dirtify up to 3 pipelines. That's not
exactly going to help performance.

I know that Ivan doesn't really care about that, especially since it's not an 
algorithmic complexity change. However, performance is a top priority issue, 
and no gamer will accept the next-gen hardware excuse for inefficient code.
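
To illustrate the dirtification cost mentioned above, here is a minimal
sketch of the table-lookup variant. All names (the pipeline parts, the
example states, example_dirtify) are hypothetical and not taken from
wined3d, and the assignment of states to parts is assumed purely for
illustration. It only shows that a state touching several parts costs one
lookup plus one write per part:

```c
#include <assert.h>

/* Hypothetical pipeline parts, encoded as bit positions. */
#define PART_COUNT 3
#define PART_VERTEX   (1u << 0)
#define PART_FRAGMENT (1u << 1)
#define PART_MISC     (1u << 2)

/* Hypothetical D3D states; the ownership below is assumed, not audited. */
enum example_state {
    EX_STATE_FOG,         /* assumed to touch vertex and fragment parts */
    EX_STATE_COLOROP,     /* assumed fragment only */
    EX_STATE_VERTEXBLEND, /* assumed vertex only */
    EX_STATE_SCISSOR,     /* assumed to belong to neither shader stage */
    EX_STATE_COUNT
};

/* Table lookup instead of a switch-case: which parts own each state. */
static const unsigned int state_owner[EX_STATE_COUNT] = {
    [EX_STATE_FOG]         = PART_VERTEX | PART_FRAGMENT,
    [EX_STATE_COLOROP]     = PART_FRAGMENT,
    [EX_STATE_VERTEXBLEND] = PART_VERTEX,
    [EX_STATE_SCISSOR]     = PART_MISC,
};

/* One dirty table per pipeline part. */
struct example_context {
    unsigned char dirty[PART_COUNT][EX_STATE_COUNT];
};

/* Marks the state dirty in every part that owns it and returns the
 * number of per-part tables that had to be written. */
static unsigned int example_dirtify(struct example_context *ctx,
                                    enum example_state state)
{
    unsigned int mask, part, writes = 0;

    assert((int)state >= 0 && (int)state < EX_STATE_COUNT);
    mask = state_owner[state];
    for (part = 0; part < PART_COUNT; ++part) {
        if (mask & (1u << part)) {
            ctx->dirty[part][state] = 1;
            ++writes;
        }
    }
    return writes;
}
```

With a single shared table the same call would be one write and no lookup,
which is the difference in question.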

I mainly want to avoid more bad PR like this:
http://www.phoronix.com/scan.php?page=article&item=938
http://www.phoronix.com/scan.php?page=article&item=crossover_games

The second article would be good for me if it were due to a tune-up in
cxgames, but it is a regression in Wine instead. Yes, that's phoronix and
all, and the first article's regressions are technically perfectly
explainable. Still, new features don't have to come with performance costs.
I see a ~5% performance regression myself, and I don't know where it comes
from.

So I propose the following plan:

1) We commit the patch to fix fglrx

2) We keep the shader / ffp interface as it is for Wine 1.0. We freeze in two 
weeks. I am away for one week now, so if anyone wants any shader interface 
changes in 1.0 he'll have to do it himself

3) We investigate where the recent performance regression(s) came from

4) We audit the state handlers in state.c and find out which D3D state handler 
changes which parts of the GL pipeline and find states that touch more than 
one part and why.

5) Build a battle plan for how to separate the following D3D states into the
various GL pipeline parts:

-> Vertex shaders - streams. The tricky part here is that the fixed function
GL vertex pipeline needs named arrays, while an ARB/GLSL vertex pipeline
needs numbered arrays (otherwise no vertex blending emulation). How do we
communicate the need for numbered arrays, and the chosen assignment?

-> Vertex decl - loaded pointers. We're currently checking the vertex shader
and fog states when a vertex buffer offset is changed; that is not needed

-> Samplers - GL_TEXTURE_xD enable - colorop. That's a major pain for ATIFS
and even more so for NVTS. I haven't found a nicer implementation using
split-up interfaces

-> How do we deal with the depth blit shaders? Do they belong to the shader 
backend, or to something else?

-> Should we move SetupForBlit into some of the shader code? The blitting
and state switching might be more efficient if we use shaders for it and
just set an ARB / GLSL shader instead of falling back to the absolute
lowest level and killing all states

-> Texture transform flags, clipping, more?

My ultimate hope is to have a clean assignment of each piece of code
referenced by the state table to a pipeline part. Then, instead of splitting
up the state table in code, we have programming guidelines about which part
of the state table may be changed by which pipeline replacement, which
avoids the additional run-time costs. If that turns out to be impossible, we
need a clear assignment anyway for a split-up.
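
One way such a guideline could be checked mechanically is sketched below.
This is only an illustration under assumptions: the owner tags, the
override list and every name in it are hypothetical, and nothing like this
exists in wined3d. The idea is that the state table stays a single table,
each entry carries a tag naming the pipeline part it belongs to, and a
pipeline replacement may only swap handlers of entries tagged with its own
part. The check runs once at setup time, so it adds nothing to the
per-draw cost:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical owner tag for each state table entry. */
enum example_owner { OWNER_VERTEX, OWNER_FRAGMENT, OWNER_SHARED };

typedef void (*example_handler)(void);

struct example_entry {
    enum example_owner owner; /* which pipeline part may replace this entry */
    example_handler apply;    /* handler currently installed */
};

struct example_override {
    size_t state;             /* index into the state table */
    example_handler apply;    /* replacement handler */
};

/* Stand-in handlers; the counters only make them distinguishable. */
static int example_ffp_calls, example_replacement_calls;
static void example_ffp_handler(void) { ++example_ffp_calls; }
static void example_replacement_handler(void) { ++example_replacement_calls; }

/* Installs the overrides of one pipeline replacement. Returns the number
 * of overrides applied, or -1 (leaving the table untouched) if any
 * override targets an entry that the given part does not own. */
static int example_apply_overrides(struct example_entry *table,
                                   size_t table_len,
                                   enum example_owner part,
                                   const struct example_override *ov,
                                   size_t ov_len)
{
    size_t i;

    assert(part != OWNER_SHARED);
    /* First pass: validate everything before changing anything. */
    for (i = 0; i < ov_len; ++i) {
        if (ov[i].state >= table_len || table[ov[i].state].owner != part)
            return -1;
    }
    /* Second pass: install the replacement handlers. */
    for (i = 0; i < ov_len; ++i)
        table[ov[i].state].apply = ov[i].apply;
    return (int)ov_len;
}
```

Entries tagged as shared are exactly the states that touch more than one
part, i.e. the ones the audit in step 4 is supposed to find.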

6) Investigate the performance implications of the state management, state 
polling, the current conditional state linking state dirtification checks and 
the driver side cost of fixed function state changing while a 

What are your opinions? (btw, I don't think I can implement that stuff alone
anytime soon; I will be pretty busy over the next few months)

> Even the D3D programmable pipeline is broken up 
> this way (there are Pixel and Vertex shader objects) - and the fixed 
> pipeline is going away, so if anything we should move away from its 
> interface.
One could argue that the name "pixel shader" already shows that Direct3D does
not separate the pipeline parts properly (GL_NV_texture_shader, issue 1).
Ironically, pixel processing is one of the few remaining parts of Direct3D10
that is not programmable.