[PATCH 13/18] ntdll: Use syscall frame for YMM context in x86_64 NtGetContextThread.

Jacek Caban jacek at codeweavers.com
Mon Jan 25 13:05:16 CST 2021


On 24.01.2021 11:54, Stefan Dösinger wrote:
> On Saturday, 23 January 2021, 14:40:08 EAT, Jacek Caban wrote:
>> On 22/01/2021 17:21, Paul Gofman wrote:
>>> I think we still support processors which don't have AVX and thus don't
>>> have xsave instruction (which is reported as a separate cpuid bit).
>> Looking closer at this, we indeed need a feature check here. I will work
>> on a new version (patches 1-11 in the series should not be affected).
> I am curious if you have tested the performance impact of this. While I agree
> that this change is the right thing to do, it would be nice to know the
> downsides.


It's not exactly clear to me what results you'd like to see. This is
similar to what Windows has to do in its syscalls, so real
applications already take that into account and avoid unneeded syscalls
on hot paths. That leaves us with micro benchmarks. I came up with the
attached benchmark, which tries to show the impact on three types of Nt*
functions in Wine. It calls NtQueryInformationProcess with different
arguments. Depending on the argument:

- ProcessIoCounters: Wine quickly returns some data. This is a typical 
thing that stubs do, but some implemented functions are like that as well.

- ProcessVmCounters: Wine does its work on the client side, which
includes making Linux syscalls.

- ProcessBasicInformation: Wine uses a server call to implement it.
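
The attached syscallbench.c is not reproduced in the archive, so the
following is only a minimal sketch of how such a micro benchmark could be
structured, not the attached code. The names bench_fn, now_ns,
bench_avg_ns, bench_mean_of_runs and count_call are hypothetical, and it
assumes a POSIX clock_gettime is available. On Wine the callback would
wrap a NtQueryInformationProcess call with one of the three information
classes above; here a trivial stand-in callback is used so the timing
harness can be checked on its own.

```c
#define _POSIX_C_SOURCE 199309L  /* assumption: POSIX clock_gettime */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type: one invocation of the call being measured. */
typedef void (*bench_fn)(void *arg);

/* Monotonic wall-clock time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Average cost of one call, in nanoseconds, over `iterations` calls. */
double bench_avg_ns(bench_fn fn, void *arg, unsigned iterations)
{
    uint64_t start;
    unsigned i;

    assert(fn && iterations > 0);
    start = now_ns();
    for (i = 0; i < iterations; i++)
        fn(arg);
    return (double)(now_ns() - start) / iterations;
}

/* Mean of several runs, to smooth out run-to-run variation. */
double bench_mean_of_runs(bench_fn fn, void *arg,
                          unsigned iterations, unsigned runs)
{
    double sum = 0.0;
    unsigned i;

    assert(runs > 0);
    for (i = 0; i < runs; i++)
        sum += bench_avg_ns(fn, arg, iterations);
    return sum / runs;
}

/* Stand-in workload: counts invocations instead of making a syscall. */
void count_call(void *arg)
{
    ++*(unsigned *)arg;
}
```

Calling bench_mean_of_runs(count_call, &counter, 1000, 3) invokes the
callback 3000 times and returns the mean per-call time. Wall-clock time
is used rather than CPU time because a server call spends part of its
time waiting on another process.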


Here are my averaged results of a few runs, but I really don't want to
read too much into them. I originally planned to send the results of a
single random run, but it showed patched Wine as notably faster on server
calls, so the variation between runs was higher than the impact:

Current Wine:    310    17692    4748

Patched Wine:    2910    18243    4898


For the patched version, I used my local tree, which has this series with
additional runtime cpuid checks to use fxsave/xsavec/xsave depending on
CPU capabilities. As expected, the impact on a plain stub call is large,
but compared to a real load the impact seems marginal.
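
The runtime checks themselves are not part of this message. As a rough
illustration only, a selection between the three instructions could look
like the sketch below. It assumes GCC or Clang on x86 and their <cpuid.h>
helpers; the names save_insn, pick_from_bits and pick_save_insn are
hypothetical and not taken from the series. The bit positions are the
architectural ones: CPUID leaf 1 ECX bit 26 (XSAVE) and bit 27 (OSXSAVE),
and CPUID leaf 0xD subleaf 1 EAX bit 1 (XSAVEC).

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helpers: __get_cpuid, __get_cpuid_count */
#endif

/* Hypothetical names, for illustration only. */
enum save_insn { SAVE_FXSAVE, SAVE_XSAVE, SAVE_XSAVEC };

/* Pure decision logic, split out so it can be checked without CPUID. */
enum save_insn pick_from_bits(unsigned leaf1_ecx, unsigned leafd_sub1_eax)
{
    const unsigned xsave_bit   = 1u << 26;  /* CPU supports XSAVE */
    const unsigned osxsave_bit = 1u << 27;  /* OS has enabled XSAVE */
    const unsigned xsavec_bit  = 1u << 1;   /* compacted form available */

    /* Without XSAVE enabled by both CPU and OS, fall back to FXSAVE. */
    if (!(leaf1_ecx & xsave_bit) || !(leaf1_ecx & osxsave_bit))
        return SAVE_FXSAVE;
    if (leafd_sub1_eax & xsavec_bit)
        return SAVE_XSAVEC;
    return SAVE_XSAVE;
}

/* Query the running CPU and pick an instruction. */
enum save_insn pick_save_insn(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    unsigned leaf1_ecx, xsave_features = 0;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return SAVE_FXSAVE;
    leaf1_ecx = ecx;  /* keep leaf 1 ECX before it is overwritten */

    if (__get_cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx))
        xsave_features = eax;
    return pick_from_bits(leaf1_ecx, xsave_features);
#else
    /* Not meaningful off x86; present only so the sketch compiles. */
    return SAVE_FXSAVE;
#endif
}
```

Doing the check once and caching the result would keep the cpuid cost
off the syscall path itself.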


For comparison, Windows results are something like 2200, 140, 140.


Jacek

-------------- next part --------------
A non-text attachment was scrubbed...
Name: syscallbench.c
Type: text/x-csrc
Size: 1333 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20210125/3bd9b38c/attachment.c>

