wined3d performance patches

Stefan Dösinger stefandoesinger at gmx.at
Tue Jun 14 08:26:58 CDT 2011


Hi,

Thanks for the comments, I've some replies inlined below. The bigger concern I 
have is that the patches don't improvement by a lot yet, about 1% on AMD GPUs 
and 1.5% to 2.5% on Nvidia GPUs(in real apps, my benchmarks are a different 
issue). Because of that my plan is to do more testing and experiment with 
things like separating the vdecl-streamsrc-vshader states et cetera and see 
how much impact this has.

All in all I am not yet convinced that my approach with lots of very targeted 
tests is the way to salvation, and I want to wait a bit before I start 
screwing up the code. Unfortunately it is still the best way I have.

I'm sending in some of the patches that make sense even without performance 
improvements. The MINFILTER stuff is a case like that, but I want to test it 
on pre-dx10 GPUs first.

On Tuesday 14 June 2011 14:20:12 Henri Verbeet wrote:
> I only looked over about half of this, doesn't look too crazy. You do
> have trailing spaces in a couple of places though. Here are some
Ya, I spotted them myself a while later, and didn't look for style issues 
before sending them to wine-devel. Though maybe I should look for better tools 
to watch out for slopiness like this.

> As far as I'm concerned you can just submit this. I was going to do
> this myself, looks like you got there first.
Still didn't get around to test this on geforce 7 GPUs. It's possible that the 
bug this was supposed to fix is still around.

Besides, it is probably not necessary for the other patches. The consideration 
was that we'd have to verify the filter each draw, but I don't think setting a 
texture as sampler and render target simultaneously is allowed in d3d.

> > @@ -1913,6 +1928,10 @@ void surface_set_texture_name(struct
> > ...
> > +    if (surface_is_framebuffer(surface))
> > +    {
> > +        IWineD3DDeviceImpl_MarkStateDirty(surface->resource.device,
> > STATE_FRAMEBUFFER); +    }
> > ...
> What are these for?
The texture name one is not needed, I've removed that from the patch already. 
The allocate_surface check is needed in case a ddraw app changes the 
pixelformat via SetSurfaceDesc.

> You may also have to handle an active RT getting unloaded, though I'm
> not entirely sure if that's allowed or not.
It shouldn't be. RTs must be in the default pool, which can't be unloaded. Of 
course that doesn't stop our broken location management from screwing up, but 
that would be a separate issue.

> > Subject: [PATCH 09/14] wined3d: Make render target dirtification in draws
> > cheaper
> I've been thinking about getting rid of the texture == drawable stuff
> for FBOs, and instead just making texture <-> drawable loads a nop in
> load_location(). At the very least this would allow the check to
> always be against SFLAG_INDRAWABLE instead of
> surface->rt_location_flags, which IMHO looks more sane.
Yes, this sounds much better.

> I wonder if
> the speedup is mostly for load_location(), modify_location() or both
> though? Maybe we can improve those functions themselves.
It's caused by both, I think it's plain call overhead. I'll double check that 
though. It may also be hyper-sensitivity of the draw overhead test. 260->270 
fps isn't a lot when you consider that native gets ~1100 fps. But right now I 
have to take what I can get.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part.
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20110614/caa68539/attachment.pgp>


More information about the wine-devel mailing list