WineD3D: Use the shader backend to enable / disable atifs and nvts

Sat Apr 12 08:06:38 CDT 2008

On 12/04/2008, Stefan Dösinger <stefan at codeweavers.com> wrote:
> On Saturday, 12 April 2008 04:27:07, H. Verbeet wrote:
>  > Anything that gets ignored when a vertex shader is active gets put in
>  > the vertex states, anything that gets ignored when a fragment shader
>  > is active should be part of the fragment states. Resource loading
>  > would be part of the "other" states.
>
> GL named arrays get ignored when a vertex shader is in use, unless the shader
>  explicitly uses them...
>
They don't get ignored; you still do the upload and the data is still
available should the shader choose to use it. Still, I probably
should've phrased it as "functionality that gets replaced by a vertex
/ fragment shader".

>  > Most of the connections you
>  > mention appear to be connections on the D3D side, these would have no
>  > consequences for a separation on the GL side of things. Iow, it's
>  > perfectly valid for a state in the vertex block and a state in the
>  > fragment block to read from the same D3D state.
>
> So if e.g. the vertex declaration is changed you would dirtify many states:
>  -> misc stream sources
>  -> vertex shader(use it or not?)
>  -> Fog
>  -> Fixed function vertex processing matrices(rhw vertices or not)
>  -> texture transforms
>  -> (a few others as well)
>  I have no problem with doing that, changing the vdecl is an expensive business
>  no matter what we do, just asking to make sure I understand what you mean.
>  How do you control which gl states are dirtified by which d3d state? This
>  will depend on the combination of backends you use.
>
I'm not sure about the exact splitup you're using here, but it would
mean potentially dirtifying multiple states, yes. I could imagine it
as simply dirtifying the vertexdeclaration state on both the vertex
and fragment state tables.
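
A minimal sketch of that idea in C. All names here (d3d_state,
state_table, context_dirtify) are hypothetical and only illustrate
marking one D3D state dirty in both tables; they are not existing
wined3d code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical D3D-side state identifiers, for illustration only. */
enum d3d_state { D3D_STATE_VDECL, D3D_STATE_FOG, D3D_STATE_COUNT };

/* One dirty table per GL-side block. */
struct state_table { bool dirty[D3D_STATE_COUNT]; };

struct context
{
    struct state_table vertex_states;
    struct state_table fragment_states;
};

/* Mark a D3D state dirty in both tables. Each block decides later
 * whether it has any work to do for that state. */
static void context_dirtify(struct context *ctx, enum d3d_state state)
{
    assert(ctx != NULL && state < D3D_STATE_COUNT);
    ctx->vertex_states.dirty[state] = true;
    ctx->fragment_states.dirty[state] = true;
}
```

The caller never needs to know which backend combination is in use;
it only names the D3D state that changed.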

>  There are quite a few opengl connections as well, although they work
>  differently. It's more the various interactions between shader extensions. In
>  quite a few cases the fragment processing implementation has to configure the
>  vertex processing correctly to feed it in the right way, and also the other
>  way round.
>
>  For example, to stick to the texture transform flags. Let's consider we're
>  using fixed function D3D vertex processing in whatever GL extension. Now
>  enter fragment processing and D3DTTFF_PROJECTED:
>
>  -> With fixed function GL or a NVTS fixed function replacement we have to make
>  sure that the 4th coordinate is 1.0 to disable the GL division if
>  TTFF_PROJECTED is not set, and if it is set with TTFF_COUNT3 make sure that
>  the 3rd coord is copied to the 4th
>  -> With GLSL or ARB fixed function replacement we can handle the lack of
>  TTFF_PROJECTED properly, but not COUNT3
>  -> With ATIFS we can handle everything properly in the replacement shader
>  -> With an ARB, GLSL or ATIFS D3D shader we don't need any special texture
>  transform fixups
>  -> With an NVTS D3D shader we have to take care about disabling projected
>  textures in vertex processing again
>
>  That means different fragment processing implementations have different vertex
>  processing requirements. Now you could make that a flag in the fragment
>  processing and pixel shader implementation. You'd need 4
>  flags(nonshader_unprojected, shader_unprojected, nonshader_count3,
>  shader_count3). Are you sure the flags won't grow out of control?
>
You only need two. One to toggle writing 1.0 to the 4th coordinate
when needed, and one to toggle copying the 3rd coordinate to the 4th
when needed. It would certainly beat doing an extension check for
every possible backend. Right now we always do the fixup, so in that
respect it would be an improvement as well.
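
Roughly, in C, and assuming hypothetical flag names that the fragment
backend would report. The fixup is applied to a single coordinate
here for brevity; the real code would presumably fold it into the
texture matrix setup instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags reported by the fragment processing backend. */
#define FRAG_NEEDS_W_ONE    0x1u /* write 1.0 to the 4th coord if not projected */
#define FRAG_NEEDS_COUNT3_W 0x2u /* copy 3rd coord to 4th for projected COUNT3 */

/* Apply only the fixups the active fragment backend asks for. */
static void fixup_texcoord(float coord[4], unsigned int backend_flags,
        bool projected, unsigned int count)
{
    assert(coord != NULL);

    if (!projected)
    {
        if (backend_flags & FRAG_NEEDS_W_ONE)
            coord[3] = 1.0f;
    }
    else if (count == 3 && (backend_flags & FRAG_NEEDS_COUNT3_W))
    {
        coord[3] = coord[2];
    }
}
```

A backend that handles everything itself would report no flags and
the vertex side would leave the coordinate untouched.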

>  Another example is fogging. Fog is overwritten by ARB and GLSL, but not ATIFS
>  and NVTS(as far as I can see). Is fog a vertex or fragment state? How do you
>  share the quite complex fog applying code between the ATIFS, NVTS and GL
>  fixed function implementation if you make it a fragment state?
>
That depends on the fog type. Vertex fog is a vertex state, fragment
fog is a fragment state. Changing the type would obviously have
interactions with both parts of the pipeline. As for applying the
state, there's no reason different implementations can't call a common
function in eg. utils.c to calculate things like the fog mode, type,
start, end, etc. One could argue it doesn't belong in state_fog() in
the first place.
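
Something along these lines, as a sketch only. The struct and
function names are made up, and the rule that a table fog mode other
than NONE selects fragment fog is my assumption about the D3D
behaviour. The case where both modes are NONE is simplified away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fog_source { FOG_SOURCE_NONE, FOG_SOURCE_VERTEX, FOG_SOURCE_FRAGMENT };
enum fog_mode { FOG_MODE_NONE, FOG_MODE_EXP, FOG_MODE_EXP2, FOG_MODE_LINEAR };

/* Hypothetical snapshot of the D3D fog render states. */
struct fog_input
{
    bool enabled;
    enum fog_mode vertex_mode, table_mode;
    float start, end;
};

/* Backend-independent result every implementation can consume. */
struct fog_params
{
    enum fog_source source;
    enum fog_mode mode;
    float start, end;
};

/* Shared helper: decide fog type and mode once, outside any backend. */
static void compute_fog_params(const struct fog_input *in, struct fog_params *out)
{
    assert(in != NULL && out != NULL);

    out->start = in->start;
    out->end = in->end;
    out->source = FOG_SOURCE_NONE;
    out->mode = FOG_MODE_NONE;

    if (!in->enabled)
        return;

    if (in->table_mode != FOG_MODE_NONE)
    {
        /* Assumption: table fog wins when both modes are set. */
        out->source = FOG_SOURCE_FRAGMENT;
        out->mode = in->table_mode;
    }
    else if (in->vertex_mode != FOG_MODE_NONE)
    {
        out->source = FOG_SOURCE_VERTEX;
        out->mode = in->vertex_mode;
    }
}
```

The ATIFS, NVTS and GL fixed function code would each call this and
then apply the result in their own way.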

>  > From the shader's point of view a pipeline replacement should be
>  > indistinguishable from real fixed function processing, other than that
>  > in case of GLSL you have to link it together into a single program.
>
> PS 3.0 shaders read texcoords and colors from custom varyings, and D3DCOLOR0,
>  1, TEXCOORD0-7 are linked to their predecessors. That way a 3.0 pshader can
>  interact with fixed function vertex processing and rhw drawing.(fixed
>  function vertex processing is broken on ATI on Windows, but I don't want to
>  justify the design with an ATI driver bug. RHW+3.0 is used by e.g. Age of
>  Empires 3).
>
>  Where would you write the TEXCOORD0-7 and D3DCOLOR0 and 1 varyings from a GLSL
>  vertex shader, and where do you read them from in the pixel shader? Keep
>  indirect varying addressing in the pshader in mind.
>
Just to be clear, with "GLSL vertex shader" you mean "GLSL vertex
processing replacement", right? A vertex processing replacement shader
would write to the regular fixed function output, ie gl_FrontColor,
gl_FrontSecondaryColor, gl_TexCoord[], etc. The fragment shader would
read them the same way as it does when paired with fixed function or
pre-3.0 vertex shaders.

>  > It's a concern when mixing different shader backends, but I'm not sure
>  > there's a lot we can do about it there. For fixed function
>  > replacements it shouldn't matter though, because you always write to
>  > the predefined fixed function outputs. In that case linking is only an
>  > issue for GLSL FFP + GLSL shader, which is trivial to implement.
>
> When you're not able to link ARBVP+ATIFS pixel shaders or ATIVP+NVTS then that
>  defeats the point of implementing pixel shaders with them, so we have to do
>  something about them.
>
Actually, I'm not sure what I was thinking here. It should only be a
concern when mixing two SM3.0+ shaders between different backends,
which atifs and nvts are never going to support. The only combination
that would need some kind of linking would be GLSL SM3.0 vertex
shaders with GLSL SM3.0 fragment shaders, but we currently handle that
and it could pretty much continue to keep working the same way. ARBVP
+ ATIFS PS or ATIVP + NVTS PS should work without problem, if you
split up the shader backend. I still think it's premature to worry too
much about that at the moment, since it's a separate issue from how
you split up the state management.

>  From the mail from yesterday:
>
> > The private data between the FFP replacement and the shader backend
>  > can be shared. That means our general GLSL management stuff can see
>  > you've got eg. a FFP vertex shader and a pixel shader and link them
>  > together.
>
> I am using GLSL for pixel shaders. Can I share my private data with a GLSL
>  fragment replacement, or do I have to bother that the fixed function fragment
>  processing might be done using ATIFS? Not using GLSL pshaders + ATIFS was one
>  of the symptoms for you that "the shader backend isn't the right place to
>  implement this".
>
Pixel shaders + fragment processing replacement doesn't make sense.
Either a GLSL vertex processing replacement + GLSL pixel shader or a
GLSL vertex shader + GLSL fragment processing replacement would work
though. The "GLSL pipeline object" would know if it's being used as
vertex and/or fragment replacement and link everything together. In
case atifs is used no linking is required.

Basically you should think of the GLSL private data + vertex
processing replacement + fragment processing replacement + shader
backend as a single object exposing three different interfaces. Call
it a "pipeline object" if you like.
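
To illustrate, a rough C sketch of one private data struct behind
three function tables. Every name below is hypothetical, and the
"link" step is reduced to a counter just to show where it would
happen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared GLSL private data, visible to all three interfaces. */
struct glsl_priv
{
    bool ffp_vertex;         /* vertex processing replacement active */
    bool ffp_fragment;       /* fragment processing replacement active */
    unsigned int link_count; /* stand-in for linking a GLSL program */
};

struct vertex_pipeline_ops { void (*enable)(void *priv, bool enable); };
struct fragment_pipeline_ops { void (*enable)(void *priv, bool enable); };
struct shader_backend_ops { void (*select)(void *priv); };

static void glsl_vertex_enable(void *priv, bool enable)
{
    struct glsl_priv *p = priv;
    assert(p != NULL);
    p->ffp_vertex = enable;
}

static void glsl_fragment_enable(void *priv, bool enable)
{
    struct glsl_priv *p = priv;
    assert(p != NULL);
    p->ffp_fragment = enable;
}

/* The backend can see which replacements are active and would link
 * them with the application's shaders at this point. */
static void glsl_select(void *priv)
{
    struct glsl_priv *p = priv;
    assert(p != NULL);
    p->link_count++;
}

/* Three interfaces, one object. */
static const struct vertex_pipeline_ops glsl_vertex_pipeline = { glsl_vertex_enable };
static const struct fragment_pipeline_ops glsl_fragment_pipeline = { glsl_fragment_enable };
static const struct shader_backend_ops glsl_shader_backend = { glsl_select };
```

An atifs fragment table would simply carry its own private data and
never take part in the link step.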

>  > I'm currently not too bothered about Intel cards, although that might
>  > change in the future. Either way, it's certainly possible to create an
>  > ARB implementation, it's more a matter of priority.
>
> They are pretty widespread and are even used in the EEEPC, so I think dealing
>  with these cards will become a priority soon, at least for me. Unfortunately
>  the driver sucks in terms of stability and performance.
>
Are those cards powerful enough to support a fixed function replacement?