[PATCH v4 1/8] server: Actually set initial status in set_async_direct_result handler.

Jinoh Kang jinoh.kang.kr at gmail.com
Sat Mar 5 01:52:24 CST 2022


On 3/5/22 09:13, Zebediah Figura wrote:
> On 3/4/22 13:04, Jinoh Kang wrote:
>> On 3/5/22 02:47, Zebediah Figura wrote:
>>> Right, but I think it's possible to trigger this assert now. For a rather pathological case, consider a client which calls set_async_direct_result on an async which is currently being serviced by a device driver and has not yet reached get_next_device_request.
>>
>> For that to be possible, something has to terminate the async so that set_async_direct_result does not reject it.
>> The easiest way to do so is to cancel it, or make it timeout somehow.
> 
> Okay, I don't think that's possible for a device async. I'm still more comfortable removing the assert in this patch, though; I'm not sure that there isn't a way to wrangle a server crash...

Understood.

> 
>>
>> In fact, I think we have discovered an _existing_ bug: the client can use set_async_direct_result to interfere with cancelled asyncs (device-backed or not).
> 
> Sure, but the client is basically considered free to sabotage itself.

My concern was that it might interfere with the driver's assumptions; looks like it shouldn't happen, though.

> We just can't let it crash the server, or mess with an object that it doesn't have the right access bits to.
> 
>>
>> Further investigation revealed some other bugs, which is tangentially related:
>>
>> 1. An async may time out before the client calls set_async_direct_result to report back the status of initial I/O.  In this case, the timeout is ignored.
> 
> I don't think this is a problem /a priori/, though.

The problem is that a timeout ignored *once* is ignored *forever* throughout the async's lifetime.
We don't restore the timeout at all once the async enters the pending state.

An example scenario is as follows:

1. Client calls recv_socket.
2. Server sets async timeout and returns.
3. The timeout is triggered, but the async is already in terminated state; thus, it is ignored.
4. Client calls set_async_direct_result with status = STATUS_PENDING.
5. Now, the async will potentially linger forever now that the timeout has already been expired.

> In fact, I'm somewhat concerned that if we let synchronous I/O time out before it even had a chance to start, we would break applications that expect it to work.
> 
>> 2. There is a time window where alerted asyncs ignore IoCancel/IoCancelEx requests.  If async_terminate( async, STATUS_ALERTED ) is called, the async will ignore any cancellation requests until the APC_ASYNC_IO routine returns.
>>     Note that this bug has existed *before* set_async_direct_result, and I think this one may need to be addressed first.
> 
> Yes, this has been a bug for a while. I don't remember if this was a problem before the async rework in 7.0. We probably need to "buffer" the cancellation request somehow, and process it once the async is restarted. (As above, I don't think we want to cancel the async if the client is going to report success to us.)

Sounds like a great idea.  Not sure which one is the best way to implement it, though.  A few options come to my mind:

- Allow async_terminate() to override status if async->alerted is set, even if it is already terminated.
- Add a separate status field for buffering cancellation requests.

> 
>>
>> So what I think we need here is a mechanism to tell alerted asyncs apart from irreversibly-terminated asyncs.
>> This will solve all of the problems above:
>>
>> - It's no longer possible to call set_async_direct_result on a device async in the first place, since it will now reject irreversibly-terminated asyncs while still accepting alerted asyncs (which is not possible for device asyncs).
>> - Timeout and cancellation works as expected: set_async_direct_result will treat timed out asyncs as irreversibly-terminated.  (the client needs to be modified to handle this case explicitly, though.)
>>
>> Any thoughts?
> 
> Maybe I'm missing something, but isn't that what the "alerted" field is for?

You're right. Sorry for missing that.  Also, the cancellation buffering approach you suggested makes this unnecessary anyway.

-- 
Sincerely,
Jinoh Kang



More information about the wine-devel mailing list