[PATCH 13/18] ntdll: Use syscall frame for YMM context in x86_64 NtGetContextThread.
Jacek Caban
jacek at codeweavers.com
Mon Jan 25 13:05:16 CST 2021
On 24.01.2021 11:54, Stefan Dösinger wrote:
> On Saturday, 23 January 2021, 14:40:08 EAT, Jacek Caban wrote:
>> On 22/01/2021 17:21, Paul Gofman wrote:
>>> I think we still support processors which don't have AVX and thus don't
>>> have xsave instruction (which is reported as a separate cpuid bit).
>> Looking closer at this, we indeed need a feature check here. I will work
>> on a new version (patches 1-11 in the series should not be affected).
> I am curious if you have tested the performance impact of this. While I agree
> that this change is the right thing to do, it would be nice to know the
> downsides.
It's not exactly clear to me what results you'd like to see. This is
similar to what Windows has to do in its syscalls, so real
applications already take that into account and avoid unneeded syscalls
on hot paths. That leaves us with micro benchmarks. I came up with the
attached benchmark, which tries to show the impact on three types of Nt*
functions in Wine. It calls NtQueryInformationProcess with different
arguments. Depending on the argument:
- ProcessIoCounters: Wine quickly returns some data. This is a typical
  thing that stubs do, but some implemented functions are like that as well.
- ProcessVmCounters: Wine does some work on the client side, including
  Linux syscalls.
- ProcessBasicInformation: Wine uses a server call to implement it.
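
For readers without the attachment, a micro benchmark along these lines could look roughly like the sketch below. It is not the attached syscallbench.c: the names time_calls, count_call, query_once, struct query_ctx and ITERATIONS are made up for illustration, the info class values (0, 2, 3) and buffer sizes are written out by hand as assumptions about the usual NT definitions, and clock() is used only because it is portable, not because it is precise.

```c
/* Sketch only, NOT the attached syscallbench.c. A small timing harness plus
 * a Windows/Wine-only driver for three NtQueryInformationProcess classes. */
#include <stdio.h>
#include <time.h>

#define ITERATIONS 100000UL  /* arbitrary choice for this sketch */

typedef void (*bench_fn)(void *ctx);

/* Call fn(ctx) `iterations` times; return elapsed clock() time in ms. */
static double time_calls(bench_fn fn, void *ctx, unsigned long iterations)
{
    unsigned long i;
    clock_t start = clock();
    for (i = 0; i < iterations; i++) fn(ctx);
    return (double)(clock() - start) * 1000.0 / CLOCKS_PER_SEC;
}

/* Baseline callback: only counts calls, to estimate the loop overhead. */
static void count_call(void *ctx)
{
    ++*(unsigned long *)ctx;
}

#ifdef _WIN32
#include <windows.h>
#include <winternl.h>

typedef NTSTATUS (NTAPI *query_fn)(HANDLE, PROCESSINFOCLASS, PVOID, ULONG, PULONG);

struct query_ctx
{
    query_fn    query;
    int         info_class;  /* numeric PROCESSINFOCLASS value (assumed) */
    ULONG       size;        /* output size expected for that class (assumed) */
    const char *name;
    NTSTATUS    status;      /* status of the most recent call */
};

static void query_once(void *p)
{
    struct query_ctx *ctx = p;
    ULONG_PTR buffer[32];    /* pointer-aligned scratch space */
    ctx->status = ctx->query(GetCurrentProcess(), (PROCESSINFOCLASS)ctx->info_class,
                             buffer, ctx->size, NULL);
}

int main(void)
{
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");
    query_fn query = ntdll ? (query_fn)GetProcAddress(ntdll, "NtQueryInformationProcess") : NULL;
    struct query_ctx cases[] =
    {
        { NULL, 2, sizeof(IO_COUNTERS),               "ProcessIoCounters",       0 },
        { NULL, 3, 11 * sizeof(SIZE_T),               "ProcessVmCounters",       0 },
        { NULL, 0, sizeof(PROCESS_BASIC_INFORMATION), "ProcessBasicInformation", 0 },
    };
    unsigned long calls = 0;
    size_t i;

    if (!query) return 1;
    printf("%-24s %8.1f ms\n", "empty callback", time_calls(count_call, &calls, ITERATIONS));
    for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
    {
        double ms;
        cases[i].query = query;
        ms = time_calls(query_once, &cases[i], ITERATIONS);
        printf("%-24s %8.1f ms (last status 0x%08lx)\n",
               cases[i].name, ms, (unsigned long)cases[i].status);
    }
    return 0;
}
#endif
```

The empty-callback line gives a rough idea of how much of each figure is loop overhead rather than syscall cost. The printed status is worth checking before comparing timings, since a call that fails early on a size mismatch would not measure the intended path.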
Here are my averaged results of a few runs, but I really don't want to
read too much into them. I originally planned to send the result of a random
run, but it showed patched Wine as notably faster on server calls,
so the run-to-run variation was higher than the impact:
Current Wine: 310 17692 4748
Patched Wine: 2910 18243 4898
For the patched version, I used my local tree which has this series with
additional runtime cpuid checks to use fxsave/xsavec/xsave depending on
CPU capabilities. As expected, the impact on a plain stub call is large,
but compared to a real load the impact seems marginal.
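
To illustrate what a runtime cpuid check of that kind might look like, here is a minimal sketch. It is not the code from the local tree mentioned above: the names choose_save_method, detect_save_method and enum save_method are hypothetical, it relies on the GCC/Clang <cpuid.h> helpers, and it assumes an x86_64 CPU where fxsave is always available as the fallback.

```c
/* Sketch only: pick a register-save instruction from CPUID feature bits. */
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper: __get_cpuid, __get_cpuid_count */
#endif

enum save_method { SAVE_FXSAVE, SAVE_XSAVE, SAVE_XSAVEC };

/* Pure selection logic, separated from detection so it is easy to test.
 * xsave_enabled: CPU supports XSAVE and the OS has enabled it.
 * has_xsavec:    CPU supports the compacted XSAVEC form. */
static enum save_method choose_save_method(int xsave_enabled, int has_xsavec)
{
    if (!xsave_enabled) return SAVE_FXSAVE;
    return has_xsavec ? SAVE_XSAVEC : SAVE_XSAVE;
}

/* Query CPUID and return the preferred method; falls back to fxsave
 * when CPUID is unavailable or this is not an x86 build. */
static enum save_method detect_save_method(void)
{
    int xsave_enabled = 0, has_xsavec = 0;
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    {
        int xsave   = (ecx >> 26) & 1;  /* CPUID.1:ECX bit 26, XSAVE */
        int osxsave = (ecx >> 27) & 1;  /* CPUID.1:ECX bit 27, OSXSAVE */
        xsave_enabled = xsave && osxsave;
    }
    if (xsave_enabled && __get_cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx))
        has_xsavec = (eax >> 1) & 1;    /* CPUID.(EAX=0DH,ECX=1):EAX bit 1 */
#endif
    return choose_save_method(xsave_enabled, has_xsavec);
}

static const char *save_method_name(enum save_method m)
{
    switch (m)
    {
    case SAVE_XSAVEC: return "xsavec";
    case SAVE_XSAVE:  return "xsave";
    default:          return "fxsave";
    }
}
```

Calling detect_save_method() once at startup and caching the result would keep the cpuid cost off the syscall path; save_method_name() is there only for logging.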
For comparison, Windows results are something like 2200, 140, 140.
Jacek
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syscallbench.c
Type: text/x-csrc
Size: 1333 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20210125/3bd9b38c/attachment.c>