[PATCH] msvcrt: Avoid disallowed unaligned writes in memset on ARM

Martin Storsjö martin at martin.st
Thu Sep 16 04:36:53 CDT 2021


On Thu, 16 Sep 2021, Piotr Caban wrote:

> Hi Martin,
>
> On 9/15/21 10:27 PM, Martin Storsjo wrote:
>> ARM can do 64 bit writes with the STRD instruction, but that
>> instruction requires a 32 bit aligned address - while these stores
>> are unaligned.
>> 
>> Two consecutive stores to uint32_t* pointers can also be fused
>> into one single STRD, as a uint32_t* is supposed to be properly
>> aligned - therefore, do these stores as stores to volatile uint32_t*
>> to avoid fusing them.
> How about letting the compiler know that the pointers are unaligned instead? 
> Is attached patch working for you?

Thanks, that's even better!

This way the compiler has more freedom to reason about the stores and can 
choose another instruction with fewer alignment requirements. Both GCC and 
Clang seem to compile it to a 16 byte VST, an unaligned SIMD store, which 
is probably much better than forcing the compiler to emit a long sequence 
of 32 bit stores.
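
For readers without the attachment at hand, the general idea of telling 
the compiler that a pointer is unaligned can be sketched roughly as below. 
This is only an illustration, not the attached patch: the typedef and 
function names are made up, and the attributes assume GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedefs (GCC/Clang extensions assumed):
 * aligned(1) says the pointee may live at any byte address, so the
 * compiler must not pick an instruction that needs natural alignment;
 * may_alias allows storing through it into arbitrary byte buffers. */
typedef uint64_t __attribute__((aligned(1), may_alias)) unaligned_u64;

/* Fill exactly 16 bytes at dst with the byte c; dst may have any
 * alignment. Hypothetical helper, for illustration only. */
static void fill16_unaligned(unsigned char *dst, unsigned char c)
{
    /* Replicate the byte into all eight lanes of a 64 bit value. */
    uint64_t v = 0x0101010101010101ULL * c;

    assert(dst != NULL);

    /* Two 64 bit stores through an alignment-1 type. The compiler is
     * free to merge them, but only into something that tolerates an
     * unaligned address. */
    *(unaligned_u64 *)(dst + 0) = v;
    *(unaligned_u64 *)(dst + 8) = v;
}
```

Compared to volatile stores, this does not pin down the exact sequence of 
stores; it only removes the alignment assumption, and the instruction 
selection is left to the compiler.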

Clang doesn't seem to know/exploit that the regular 32 bit store 
instructions work unaligned though, so the smaller stores get exploded 
into a long series of single byte writes. But I guess that's just a missed 
optimization opportunity in Clang; I'll see if I can report it.

// Martin



