[PATCH v2 1/1] ntdll: Fix reading stale Information from IOSB.
Zebediah Figura
zfigura at codeweavers.com
Mon May 30 21:10:04 CDT 2022
On 5/30/22 20:23, Jin-oh Kang wrote:
> On Tue, May 31, 2022, 7:52 AM Zebediah Figura wrote:
>
>> On 5/30/22 14:20, Jinoh Kang wrote:
>>> +
>>> + /* GetOverlappedResultEx() skips waiting for the event/file handle
>> if the
>>> + * Status field indicates completion, and returns the result from
>> the
>>> + * Information field.
>>> + *
>>> + * Hence, we have to ensure that writes to the Information field
>> are seen
>>> + * from the other threads *before* writes to the Status field.
>>> + *
>>> + * Otherwise, a race condition results: if writing the Status field
>> has
>>> + * been completed but the subsequent update to the Information
>> field has
>>> + * not, the application may end up reading stale value from it.
>>> + *
>>> + * The race condition can be especially problematic with
>> applications that
>>> + * use the HasOverlappedIoCompleted() macro in a tight loop to
>> busy-wait
>>> + * for completion of overlapped I/O.
>>> + */
>>> + MemoryBarrier(); /* TODO: consider replacing with a store-release.
>> */
>>> + *(volatile NTSTATUS *)status_ptr = status;
>>> }
>>>
>>> static inline client_ptr_t iosb_client_ptr( IO_STATUS_BLOCK *io )
>>
>> Unpaired MemoryBarrier() is pretty much always wrong. (In fact, I think
>> unpaired anything would be, here.)
>>
>
> By pairing, do you mean that another MemoryBarrier() should follow the
> status write?
No. We need another memory barrier in GetOverlappedResult().
The point of a memory barrier is that if you have two pairs of memory
operations (read or write) which must be perceived as happening "in
order" across threads, you place a memory barrier in the middle of each
pair. That is, if thread 1 executes ops A and B, and thread 2 executes
ops C and D, you need a barrier between A and B, but also a barrier
between C and D.
The following image (stolen and adapted from the Linux kernel
documentation) may be illustrative:
CPU 1 CPU 2
=============== ===============
WRITE_ONCE(a, 1); x = READ_ONCE(b);
<write barrier> <--------> <read barrier>
WRITE_ONCE(b, 2); y = READ_ONCE(a);
You need to do this because the compiler—and, more importantly, the
CPU—is free to reorder the instructions on *both* sides. If, as in the
above example, you need "x == 2" to imply "y == 1", the write barrier
alone is not enough to achieve this, because the CPU can *also* reorder
the reads.
The same applies to atomic operations that aren't just RMW. E.g. a
store-release (to Status) would be correct here, and more efficient than
a barrier, but it has to be paired with a read-acquire (also of Status)
in GetOverlappedResult().
More information about the wine-devel
mailing list