wined3d performance patches

Henri Verbeet hverbeet at
Tue Jun 14 09:15:35 CDT 2011

On 14 June 2011 15:26, Stefan Dösinger <stefandoesinger at> wrote:
>> As far as I'm concerned you can just submit this. I was going to do
>> this myself, looks like you got there first.
> I still haven't gotten around to testing this on GeForce 7 GPUs. It's possible
> that the bug this was supposed to fix is still around.
Yes, but I think that by now GeForce 7 GPUs are marginal enough that the
code isn't worth keeping around for them. The Steam HW survey, for example,
reports over 90% D3D10+ cards. Even if this does regress something, I
think it makes more sense to tell people to either file a bug with
NVIDIA or help improve the nouveau driver for that card.

> Besides, it is probably not necessary for the other patches. The consideration
> was that we'd have to verify the filter each draw, but I don't think setting a
> texture as sampler and render target simultaneously is allowed in d3d.
I'm not so sure. E.g. the docs for the INTZ format say you can have an
INTZ texture bound as both depth buffer and texture as long as depth
writes are disabled. (This makes some sense, since in that case there
aren't any read/write conflicts.)
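
To illustrate the condition, here is a rough sketch of what a per-draw
check might look like. The struct and function names are made up for
this example and are not the actual wined3d structures; it only assumes
that a conflict exists when the depth buffer is also bound to a sampler
while depth writes are enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified draw state; not the real wined3d state table. */
struct sketch_state
{
    const void *depth_stencil;        /* surface bound as depth buffer, or NULL */
    const void *sampler_texture[16];  /* textures bound to the sampler stages */
    bool depth_write_enabled;         /* mirrors the depth write render state */
};

/* Returns true if the depth buffer would be sampled and written in the
 * same draw. With depth writes disabled there is no read/write conflict,
 * so binding it as a texture at the same time is considered fine. */
static bool sketch_depth_feedback_conflict(const struct sketch_state *state)
{
    size_t i;

    if (!state->depth_stencil || !state->depth_write_enabled)
        return false;

    for (i = 0; i < sizeof(state->sampler_texture) / sizeof(*state->sampler_texture); ++i)
    {
        if (state->sampler_texture[i] == state->depth_stencil)
            return true;
    }

    return false;
}
```

The early return on disabled depth writes is the cheap part; the sampler
scan only runs in the case that could actually conflict.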

>> > @@ -1913,6 +1928,10 @@ void surface_set_texture_name(struct
>> > ...
>> > +    if (surface_is_framebuffer(surface))
>> > +    {
>> > +        IWineD3DDeviceImpl_MarkStateDirty(surface->resource.device,
>> > ...
>> What are these for?
> The texture name one is not needed, I've removed that from the patch already.
> The allocate_surface check is needed in case a ddraw app changes the
> pixelformat via SetSurfaceDesc.
I'm not sure that can actually happen. wined3d_surface_set_format()
insists that the surface's current format is WINED3DFMT_UNKNOWN, so the
surface can't be part of a working FBO entry before the format is
changed. That probably also means clearing the allocation flags there is
a bit silly.

>> You may also have to handle an active RT getting unloaded, though I'm
>> not entirely sure if that's allowed or not.
> It shouldn't be. RTs must be in the default pool, which can't be unloaded.
Even in ddraw?

>> I wonder if
>> the speedup is mostly for load_location(), modify_location() or both
>> though? Maybe we can improve those functions themselves.
> It's caused by both, I think it's plain call overhead. I'll double check that
> though. It may also be hyper-sensitivity of the draw overhead test. 260->270
> fps isn't a lot when you consider that native gets ~1100 fps. But right now I
> have to take what I can get.
Maybe surface_load_location() could do an initial location check a bit
earlier. (And some of the code before the current check could also be
removed when we get rid of texture == drawable.) For
surface_modify_location(), the overlay code probably doesn't belong in
there. We could also do a similar early check for the case where the
flags already match, and maybe we should split it into two functions.
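
Roughly what I have in mind, as a sketch only. The names, the location
bits and the split into "validate" and "invalidate" are assumptions for
illustration, not the existing wined3d code; the counter exists purely
to make the slow path visible.

```c
/* Hypothetical location bits; not the real SFLAG_* values. */
#define SKETCH_LOC_SYSMEM   0x1u
#define SKETCH_LOC_TEXTURE  0x2u
#define SKETCH_LOC_DRAWABLE 0x4u

struct sketch_surface
{
    unsigned int locations;        /* bitmask of up-to-date locations */
    unsigned int slow_path_count;  /* how often the expensive path ran */
};

/* Early check first: if the requested location is already current,
 * return before doing any other work. */
static void sketch_load_location(struct sketch_surface *surface, unsigned int location)
{
    if (surface->locations & location)
        return;

    ++surface->slow_path_count;    /* the real code would copy data here */
    surface->locations |= location;
}

/* One half of a split modify function: mark a location as up to date. */
static void sketch_validate_location(struct sketch_surface *surface, unsigned int location)
{
    if ((surface->locations & location) == location)
        return;                    /* flags already match, nothing to do */

    surface->locations |= location;
}

/* The other half: mark a location as stale. */
static void sketch_invalidate_location(struct sketch_surface *surface, unsigned int location)
{
    if (!(surface->locations & location))
        return;                    /* flags already match, nothing to do */

    surface->locations &= ~location;
}
```

With two separate functions neither needs a "persistent" style
parameter, and each early check is a single mask test.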
