[PATCH 13/18] ntdll: Use syscall frame for YMM context in x86_64 NtGetContextThread.

Paul Gofman pgofman at codeweavers.com
Tue Jan 26 04:31:08 CST 2021


On 1/26/21 13:10, Stefan Dösinger wrote:
>
>> On 25.01.2021 at 22:05, Jacek Caban <jacek at codeweavers.com> wrote:
>>
>> It's not exactly clear to me what results you'd like to see. This is similar to what Windows has to do in its syscalls, so real applications already take that into account and avoid unneeded syscalls on hot paths. That leaves us with micro benchmarks. I came up with the attached benchmark, which tries to show the impact on three types of Nt* functions in Wine. It calls NtQueryInformationProcess with different arguments. Depending on the argument:
>>
>> - ProcessIoCounters: Wine quickly returns some data. This is a typical thing that stubs do, but some implemented functions are like that as well.
>>
>> - ProcessVmCounters: Wine does some stuff on client side, including Linux syscalls, to do its work.
>>
>> - ProcessBasicInformation: Wine uses a server call to implement it.
>>
>>
>> Here are my averaged results of a few runs, but I really don't want to read too much into them. I originally planned to send the result of a random run, but it showed that patched Wine is notably faster on server calls, so the variation was higher than the impact:
>>
>>                 ProcessIoCounters  ProcessVmCounters  ProcessBasicInformation
>> Current Wine:                 310              17692                     4748
>> Patched Wine:                2910              18243                     4898
>>
>>
>> For the patched version, I used my local tree which has this series with additional runtime cpuid checks to use fxsave/xsavec/xsave depending on CPU capabilities. As expected, the impact on a plain stub call is large, but compared to a real load the impact seems marginal.
> What I had in mind was running any kind of game benchmark to see if it has a noticeable impact, but I think your microbenchmark largely rules that out - thanks for looking into that. I am getting concerned that we're replacing something that used to be a regular call with a way more complicated process. Though I guess that's OK for ntdll, where applications expect expensive syscalls. We'd have to think twice about applying the same kind of syscall thunks for e.g. GL calls.
>
>> For comparison, Windows results are something like 2200, 140, 140.
> I am surprised Windows makes syscalls cheaper than our original call-based Wine stub. Something seems odd.
>
Our functions (even stubs with a debug trace) currently tend to save the
non-volatile xmm registers (often redundantly, but not always), which
adds some overhead. Also, I wouldn't be surprised if Windows has some
fast syscall path which doesn't save the full context, for instance for
syscalls that simply query some information available in memory and are
not expected to change the task's state. Given how many syscalls Windows
has and how they are used, I think that would be a natural (if not
necessary) thing to have.
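
To make the fast path idea a bit more concrete, here is a rough sketch in
C. It is only a simulation under my own assumptions: the names
(full_context, syscall_entry, query_only, dispatch) are hypothetical and
are taken neither from Wine nor from Windows, and the "register state" is
just a byte buffer. The point is only that a per-syscall flag could let
the dispatcher skip the full context save for calls that merely query
information.

```c
#include <string.h>

/* Hypothetical stand-in for the extended register state (xmm/ymm) that a
 * full syscall entry would have to save. 512 bytes is the size of an
 * fxsave area; a real xsave area can be larger. */
struct full_context { unsigned char xstate[512]; };

static struct full_context live_regs;      /* pretend CPU register state */
static struct full_context syscall_frame;  /* where the slow path saves it */
static unsigned long full_saves;           /* number of full saves performed */

/* Hypothetical syscall table entry: a handler plus a flag saying that the
 * call only reads information and cannot change the task's state. */
struct syscall_entry
{
    int (*handler)(void *arg);
    int query_only;
};

/* Query-style call: just returns a value that is already in memory. */
static int query_counter(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* State-changing call: overwrites the pretend register state. */
static int set_context(void *arg)
{
    memset(&live_regs, *(int *)arg, sizeof(live_regs));
    return 0;
}

static const struct syscall_entry table[] =
{
    { query_counter, 1 },
    { set_context,   0 },
};

/* Dispatcher: only calls not marked query_only pay for the full save. */
static int dispatch(unsigned int nr, void *arg)
{
    if (nr >= sizeof(table) / sizeof(table[0])) return -1;
    if (!table[nr].query_only)
    {
        memcpy(&syscall_frame, &live_regs, sizeof(syscall_frame));
        full_saves++;
    }
    return table[nr].handler(arg);
}
```

With this sketch, dispatch(0, &value) fills in value without touching
full_saves, while dispatch(1, &fill) goes through the slow path and
increments it. Whether Windows actually does anything like this is a
guess on my side.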