[PATCH 13/18] ntdll: Use syscall frame for YMM context in x86_64 NtGetContextThread.

Jacek Caban jacek at codeweavers.com
Fri Jan 22 11:18:25 CST 2021


Hi Paul,

On 22.01.2021 17:21, Paul Gofman wrote:
> On 1/22/21 18:51, Jacek Caban wrote:
>> Signed-off-by: Jacek Caban <jacek at codeweavers.com>
>> ---
>>   dlls/ntdll/unix/signal_x86_64.c | 32 ++++++++++++++++++++++++++------
>>   1 file changed, 26 insertions(+), 6 deletions(-)
>>
>>
> This (together with saving all the basic and XMM registers) looks like
> a big overhead on every Nt function call. Is it maybe possible to do
> that when explicitly requested only (some option)?


If you mean a user-configurable option, I do not think that's the right 
direction (just like any BreakXXXFeature option). I'd rather optimize 
this solution to make sure that its performance is acceptable. This 
series is what I considered good enough for the first iteration.


We may want an extension enabling a debugger to get a context inside a 
syscall. I've been thinking about a flag in the PEB that winedbg could 
set when desired.


> I think we still support processors which don't have AVX and thus don't
> have xsave instruction (which is reported as a separate cpuid bit).


xsave is a separate feature with its own cpuid bit, not part of AVX, and 
it ignores requested features that are not enabled, so the patch should 
be fine as is on hardware without AVX. xsave does, however, need to be 
enabled by the OS, so we may need a feature check if we want to support 
OSes without xsave enabled.
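
For illustration, such a feature check could look roughly like the 
sketch below. The helper names are made up for this example and are not 
part of the patch, and it assumes a GCC or Clang toolchain that provides 
<cpuid.h>. CPUID leaf 1 reports xsave support in ECX bit 26 and OS 
enablement (OSXSAVE) in ECX bit 27.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define CPUID1_ECX_XSAVE   (1u << 26)  /* CPU implements xsave/xrstor */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled xsave (CR4.OSXSAVE) */

/* Hypothetical helper: decide from the CPUID.1:ECX value alone.
 * Both bits are needed: the CPU must have xsave and the OS must enable it. */
int xsave_enabled_from_ecx( uint32_t ecx )
{
    uint32_t need = CPUID1_ECX_XSAVE | CPUID1_ECX_OSXSAVE;
    return (ecx & need) == need;
}

/* Hypothetical helper: query the running CPU. Returns 0 on non-x86 hosts
 * or when CPUID leaf 1 is not available. */
int xsave_enabled(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid( 1, &eax, &ebx, &ecx, &edx )) return 0;
    return xsave_enabled_from_ecx( ecx );
#else
    return 0;
#endif
}
```

Note that AVX (ECX bit 28) is deliberately not part of the check, since 
xsave does not depend on it.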


> Also, to save at least this part, it is possible to use xsavec which
> won't be saving anything (aside from the mask) if the ymm high part is
> zero (that is, in initial state, which is quite the common case when ymm
> regs were not used before the call; compilers even tend to reset higher
> part of ymm when done with them). There is
> user_shared_data->XState.EnabledFeatures which tells if xsave supported
> at all and user_shared_data->XState.CompactionEnabled tells if xsavec is
> available.


If I read the documentation right (and my testing confirms that), xsave 
does what you described as well. It will not store the high ymm part if 
it's in the initial state. The difference between xsave and xsavec is 
about storage format, but that doesn't make a difference here. xsavec is 
also not exactly free: in addition to a feature check, it also requires 
the entire xsave header to be initialized.
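
To make the header point concrete, here is a minimal sketch of how one 
could inspect an xsave area after the fact. The helper names are 
hypothetical and this is not code from the patch. The layout assumed is 
the architectural one: a 512-byte legacy region followed by a 64-byte 
header, with XSTATE_BV at offset 512 and XCOMP_BV at offset 520. Bit 2 
of XSTATE_BV says whether the high ymm halves are marked as stored, and 
bit 63 of XCOMP_BV marks the compacted format that xsavec writes.

```c
#include <stdint.h>
#include <string.h>

#define XSAVE_HDR_XSTATE_BV_OFFSET 512          /* first 8 bytes of the header */
#define XSAVE_HDR_XCOMP_BV_OFFSET  520          /* next 8 bytes of the header */
#define XSTATE_BV_AVX              (1ull << 2)  /* state component 2: ymm high halves */
#define XCOMP_BV_COMPACTED         (1ull << 63) /* set for the compacted format */

/* Hypothetical helper: read one 64-bit header field without alignment
 * or aliasing assumptions about the buffer. */
uint64_t xsave_area_read_u64( const unsigned char *area, size_t offset )
{
    uint64_t value;

    memcpy( &value, area + offset, sizeof(value) );
    return value;
}

/* Nonzero if the area records ymm high halves as present. When this bit
 * is clear, the ymm part of the area must be treated as initial state. */
int xsave_area_has_ymm( const unsigned char *area )
{
    return (xsave_area_read_u64( area, XSAVE_HDR_XSTATE_BV_OFFSET ) & XSTATE_BV_AVX) != 0;
}

/* Nonzero if the area uses the compacted format. */
int xsave_area_is_compacted( const unsigned char *area )
{
    return (xsave_area_read_u64( area, XSAVE_HDR_XCOMP_BV_OFFSET ) & XCOMP_BV_COMPACTED) != 0;
}
```

A reader of the saved context only needs the XSTATE_BV check; which of 
the two instructions produced the area matters only for locating 
components beyond the header.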


What seems to be more interesting is xsaveopt, which I think could make 
a difference. That would, however, need the xsave area to be at a 
constant address. I've been thinking about storing it next to the TEB, 
but we can't do that as long as winsock is called on the signal stack, 
so I left experimenting with it for the future.
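
If we go that way, availability could be detected roughly as in the 
sketch below. Again, the helper names are invented for illustration and 
a GCC or Clang <cpuid.h> with __get_cpuid_count is assumed. CPUID leaf 
0xD, sub-leaf 1 reports xsaveopt in EAX bit 0 and xsavec in EAX bit 1.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define CPUID_D1_EAX_XSAVEOPT (1u << 0)  /* xsaveopt instruction available */
#define CPUID_D1_EAX_XSAVEC   (1u << 1)  /* xsavec and compacted xrstor available */

/* Hypothetical helpers: decode a CPUID.(EAX=0xD,ECX=1):EAX value. */
int has_xsaveopt_from_eax( uint32_t eax )
{
    return (eax & CPUID_D1_EAX_XSAVEOPT) != 0;
}

int has_xsavec_from_eax( uint32_t eax )
{
    return (eax & CPUID_D1_EAX_XSAVEC) != 0;
}

/* Hypothetical helper: fetch the value from the running CPU. Returns 0
 * (no extended xsave features) on non-x86 hosts or when the leaf is
 * not supported. */
uint32_t query_xsave_ext_features(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count( 0xd, 1, &eax, &ebx, &ecx, &edx )) return 0;
    return eax;
#else
    return 0;
#endif
}
```

The check itself is cheap and only needs to run once at startup; the 
open question remains where to keep the save area.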


Thanks,

Jacek



