[PATCH] msvcrt: Avoid disallowed unaligned writes in memset on ARM

Martin Storsjö martin at martin.st
Thu Sep 16 05:01:52 CDT 2021


On Thu, 16 Sep 2021, Martin Storsjö wrote:

> On Thu, 16 Sep 2021, Piotr Caban wrote:
>
>> Hi Martin,
>> 
>> On 9/15/21 10:27 PM, Martin Storsjo wrote:
>>> ARM can do 64-bit writes with the STRD instruction, but that
>>> instruction requires a 32-bit aligned address, while these stores
>>> are unaligned.
>>> 
>>> Two consecutive stores to uint32_t* pointers can also be fused
>>> into one single STRD, as a uint32_t* is supposed to be properly
>>> aligned - therefore, do these stores as stores to volatile uint32_t*
>>> to avoid fusing them.
>> How about letting the compiler know that the pointers are unaligned
>> instead? Does the attached patch work for you?
>
> Thanks, that's even better!
>
> This way the compiler has more freedom to reason about the stores and can
> choose another instruction with fewer alignment requirements. Both GCC and
> Clang seem to compile it to a 16-byte VST, an unaligned SIMD store, which
> is probably much better than forcing the compiler to emit a long sequence
> of 32-bit stores.
>
> Clang doesn't seem to know (or exploit) that the regular 32-bit store
> instructions work on unaligned addresses though, so the smaller stores get
> expanded into a long series of single-byte writes. But I guess that's just
> a missed optimization opportunity in Clang; I'll see if I can report it.

FWIW this seems to be a target-specific issue; Clang does optimize it
as expected for an armv7-linux-gnueabihf target, but not for armv7-windows.
I'll see about getting that fixed.
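
For anyone following along without the attachment, the general idea of
"telling the compiler the pointers are unaligned" could look roughly like
the sketch below. This is not the attached patch: the type and function
names are made up for illustration, and it assumes the GCC/Clang
attribute syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedefs (assumes GCC/Clang attributes).
 * aligned(1) says the pointee may sit at any address, so the compiler
 * must not pick, or fuse stores into, an instruction such as STRD that
 * needs an aligned address. may_alias keeps stores through these types
 * into a byte buffer valid under strict aliasing. */
typedef uint64_t __attribute__((aligned(1), may_alias)) unaligned_u64;
typedef uint32_t __attribute__((aligned(1), may_alias)) unaligned_u32;

/* Hypothetical helper: fill n bytes (4 <= n <= 16) at d with byte c,
 * using two possibly overlapping stores (head and tail) and no loop. */
void fill_small(unsigned char *d, unsigned char c, size_t n)
{
    uint64_t v = 0x0101010101010101ULL * c; /* c replicated into 8 bytes */

    assert(n >= 4 && n <= 16);
    if (n >= 8) {
        *(unaligned_u64 *)d = v;           /* first 8 bytes */
        *(unaligned_u64 *)(d + n - 8) = v; /* last 8 bytes */
    } else {
        *(unaligned_u32 *)d = (uint32_t)v;           /* first 4 bytes */
        *(unaligned_u32 *)(d + n - 4) = (uint32_t)v; /* last 4 bytes */
    }
}
```

With plain uint64_t*/uint32_t* casts instead, the compiler may assume
natural alignment, which is what allows the STRD mentioned above.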

// Martin

