[PATCH v2 1/1] ntdll: Fix reading stale Information from IOSB.

Zebediah Figura zfigura at codeweavers.com
Mon May 30 21:28:39 CDT 2022


On 5/30/22 21:17, Jin-oh Kang wrote:
>> The point of a memory barrier is that if you have two pairs of memory
>> operations (read or write) which must be perceived as happening "in
>> order" across threads, you place a memory barrier in the middle of each
>> pair. That is, if thread 1 executes ops A and B, and thread 2 executes
>> ops C and D, you need a barrier between A and B, but also a barrier
>> between C and D.
>>
>> The following image (stolen and adapted from the Linux kernel
>> documentation) may be illustrative:
>>
>>           CPU 1                       CPU 2
>>           ===============             ===============
>>           WRITE_ONCE(a, 1);           x = READ_ONCE(b);
>>           <write barrier>  <--------> <read barrier>
>>           WRITE_ONCE(b, 2);           y = READ_ONCE(a);
>>
> 
> That said, the *_ONCE macros would be quite helpful for Wine too. Ideally,
> however, we should just use the stdatomic.h primitives whenever possible
> (old GCC has bugs in them though), shimming in our own implementation on
> older compilers.

We've been preferring to use win32 atomic definitions instead.

I believe win32 guarantees that all access to suitably aligned types is 
atomic (not targeting the Alpha has its advantages), so there's no need 
for READ_ONCE or WRITE_ONCE. On the other hand, ReadNoFence() and 
WriteNoFence() also exist in recent winnt.h. They lack documentation, of 
course, and so do ReadAcquire() and WriteRelease()...

>> You need to do this because the compiler—and, more importantly, the
>> CPU—is free to reorder the instructions on *both* sides. If, as in the
>> above example, you need "x == 2" to imply "y == 1", the write barrier
>> alone is not enough to achieve this, because the CPU can *also* reorder
>> the reads.
>>
>> The same applies to atomic operations that aren't just RMW. E.g. a
>> store-release (to Status) would be correct here, and more efficient than
>> a barrier, but it has to be paired with a read-acquire (also of Status)
>> in GetOverlappedResult().
>>
> 
> The problem is that the Wine codebase doesn't have one (yet).  Otherwise an
> acquire/release barrier pair would be much more efficient than a full
> barrier (at least on weak-memory-order architectures).

We don't have any way to encode store-release and write-acquire, but 
they would be easy enough to add that I think it wouldn't delay this 
patch too much.



More information about the wine-devel mailing list