[PATCH] ntdll: Set rcx and r11 on exit from syscall dispatcher on x64.

Jinoh Kang jinoh.kang.kr at gmail.com
Tue Nov 30 23:34:47 CST 2021


On 12/1/21 08:43, Paul Gofman wrote:
> I am sorry, I actually somehow missed the actual test program thus all of my prior questions.

No worries.

> 
> While you are probably right in your assessment it still seems to me that these aspects are a bit orthogonal to what I am trying to do.

I agree.  In fact, I find these results surprising too; I had no prior knowledge of this behaviour.

> Wine does not handle page faults, host OS does that directly thus we can't track the influence of that on the context,

Yes, we can't achieve that without severe performance penalties.

> nor do I know of any application which depends on that.

Hopefully...

> That suggests though that trying to simulate r11 the way my patch does probably doesn't make much sense.
> 
> I think I will resend the patch without touching r11 at all.

Thanks!

> 
> Then, if I read your test correctly, WRT rcx handling, you show that the value in rcx stored in the context is not the one my test shows on syscall exit (the rcx in the context is supposed to be the first argument to syscall, right?).

Yes, KiFastSystemCall seems to internally reverse the effect of "mov r10, rcx"
in the ntdll stub prologue (presumably for compatibility with INT 2Eh).

> But my patch doesn't touch the rcx value in the context (and your test confirms that it should not).

You're right -- it shouldn't.

My speculation is that there's a flag that determines whether it's OK to clobber
RCX/R11 on syscall exit.  If it's enabled, KiFastSystemCall will use SYSRET
instead of IRETQ.  Issuing NtSetContextThread with CONTEXT_INTEGER supposedly
turns this flag off, disabling the use of SYSRET.  From the observations so far,
this flag more or less corresponds to CONTEXT_CONTROL in
syscall_frame::restore_flags, but more testing is required...
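
To make the speculation concrete, here is a minimal C sketch of the two exit
paths.  It is a model only, not kernel or Wine code: struct user_regs,
exit_via_sysret, exit_via_iretq, syscall_exit and the restore_full_context
flag are all hypothetical names.  The only architectural fact it relies on is
that SYSRET loads RIP from RCX and RFLAGS from R11.

```c
#include <stdint.h>

/* Hypothetical model of the speculation above; not kernel or Wine code. */
struct user_regs {
    uint64_t rip, rflags, rcx, r11;
};

/* SYSRET loads RIP from RCX and RFLAGS from R11, so the return address and
 * flags must be placed there: user mode sees RCX == RIP and R11 == RFLAGS.
 * (The real instruction also masks some RFLAGS bits; ignored here.) */
static struct user_regs exit_via_sysret(struct user_regs frame)
{
    frame.rcx = frame.rip;
    frame.r11 = frame.rflags;
    return frame;
}

/* IRETQ takes RIP and RFLAGS from the stack frame, so RCX and R11 can be
 * restored to whatever the saved context holds. */
static struct user_regs exit_via_iretq(struct user_regs frame)
{
    return frame;
}

/* restore_full_context is the assumed flag: set once something like
 * NtSetContextThread has written the integer registers. */
static struct user_regs syscall_exit(struct user_regs frame,
                                     int restore_full_context)
{
    return restore_full_context ? exit_via_iretq(frame)
                                : exit_via_sysret(frame);
}
```

With the flag clear, the caller gets RCX/R11 clobbered with RIP/RFLAGS; with
it set, the values stored in the context survive the return to user mode.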

> 
> 
> On 11/30/21 20:22, Paul Gofman wrote:
>> Thanks for testing this. First of all, I did not try to simulate every bit of volatile registers clobbering performed by the kernel. Maybe my mistake here is rather touching r11 at all (while I mostly care about rcx now) or at least making those changes in a single patch. Yet your analysis might be quite helpful, although I can't say I could follow completely. My questions are inline.
>>
>> On 11/30/21 19:56, Jinoh Kang wrote:
>>> After some testing I found out that the patch's behaviour is inaccurate.
>>>
>>> The attached test program does the following:
>>>
>>> 1. Set R10 to 0xdeadbeef5a5a5a5a and R11 to 0x0123456789ABCDEF.
>>> 2. Generate a page fault.
>> By page fault do you mean exactly page fault per se (transparently handled by kernel) or access violation? If it's the former, I am not sure what exactly it is supposed to influence.
>>> 3. Set R10 to 0xcafebabea5a5a5a5 and R11 to 0xfedcba9876543210.
>> Set where? If p. 2 concerns an access violation, do you mean in a vectored handler? Or maybe I am not following completely.
>>> 4. Issue a system call that pauses the current thread.
>>
>> Which exactly do you mean, to be sure?
>>
>>> 5. Switch to another thread, and dump the previous thread's registers.
>>> 6. Set all bits in EFLAGS to 1.  (0xffffffffffffffff)
>>> 7. Dump the previous thread's registers again.
>>
>> Basically I think it would be much easier to follow if that was expressed in some sort of pseudocode naming specific functions / seh handlers etc.
>>
>>
>>> Its output on Windows 10 (20H2) is:
>>>
>>>> SharedUserData.SystemCall = 0000000000
>>>>
>>>> Before set context:
>>>> EFlags = 0x0000000000000246  R11 = 0x0123456789abcdef
>>>>     RIP = 0x00007ffa09e504d4  RCX = 0x0000000000000088
>>>>     RSP = 0x0000000000ccfef8  R10 = 0xdeadbeef5a5a5a5a
>>>>
>>>> After set context:
>>>> EFlags = 0x0000000000210fd5  R11 = 0x0123456789abcdef
>>>>     RIP = 0x00007ffa09e504d4  RCX = 0x0000000000000088
>>>>     RSP = 0x0000000000ccfef8  R10 = 0xdeadbeef5a5a5a5a
>>>  From this we can observe the following:
>>>
>>> A. KiFastSystemCall doesn't clear bit 1 in R11 by itself.
>>>     Rather, it's the job of NtSetContextThread.
>>
>> Yeah, regardless of how this is concluded from a test, it looks like a cleaner way to do it to me.
>>
>>
>>>
>>> B. KiFastSystemCall ignores registers clobbered by the SYSCALL instruction.
>>>     It does try to pretend that the 1st argument is being passed to RCX,
>>>     which leaves the actual 1st argument register (R10) unmodified in CONTEXT.
>>>     (Also note that this implies the presence of a flag in the real kernel
>>>      that records whether R10/R11 are set to valid values or not.  Otherwise,
>>>      the kernel would be unable to use SYSRET since R11 != RFLAGS, etc.)
>>
>>> C. Other entrances to kernel (e.g. a page fault) do record all registers.
>>>     These values are preserved until the next time the thread switches to
>>>     kernel mode.
>>>
>> Again, it would be better to clarify what is meant by page fault but from this context I suspect you mean access violation (or other exception)? In Wine this is handled separately; the syscall_dispatcher() is involved neither in entering the Unix part (that is done from a signal) nor in continuing to user mode (that is done by setting registers from the signal handler and returning right to the user mode ntdll entry point). Do you mean your test suggests some modifications to this part? That might be interesting to know but I didn't have an intention to cover all of that at once.
>>
> 
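
As a reading aid for the two EFlags values in the quoted dump (0x246 before,
0x210fd5 after writing all ones), the following C sketch shows which
architecturally defined flag bits did not survive the round trip.  The helper
names eflags_dropped and eflags_bit are made up for this illustration; the
mask follows the EFLAGS layout in the Intel manuals.

```c
#include <stdint.h>

/* Architecturally defined EFLAGS bits: CF, PF, AF, ZF, SF, TF, IF, DF, OF,
 * IOPL (2 bits), NT, RF, VM, AC, VIF, VIP, ID. */
#define EFLAGS_DEFINED_MASK 0x3f7fd5ULL

/* Defined bits that were written as 1 but read back as 0. */
static uint64_t eflags_dropped(uint64_t readback)
{
    return EFLAGS_DEFINED_MASK & ~readback;
}

/* Value (0 or 1) of a single bit. */
static int eflags_bit(uint64_t value, unsigned bit)
{
    return (int)((value >> bit) & 1);
}
```

For 0x210fd5, eflags_dropped() yields 0x1e7000, i.e. IOPL, NT, VM, AC, VIF and
VIP were rejected.  eflags_bit(0x246, 1) is 1 while eflags_bit(0x210fd5, 1) is
0, which is the bit 1 difference that observation A refers to.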

-- 
Sincerely,
Jinoh Kang


