[PATCH] msvcrt: Avoid disallowed unaligned writes in memset on ARM
Martin Storsjö
martin at martin.st
Thu Sep 16 05:01:52 CDT 2021
On Thu, 16 Sep 2021, Martin Storsjö wrote:
> On Thu, 16 Sep 2021, Piotr Caban wrote:
>
>> Hi Martin,
>>
>> On 9/15/21 10:27 PM, Martin Storsjo wrote:
>>> ARM can do 64 bit writes with the STRD instruction, but that
>>> instruction requires a 32 bit aligned address - while these stores
>>> are unaligned.
>>>
>>> Two consecutive stores to uint32_t* pointers can also be fused
>>> into one single STRD, as a uint32_t* is supposed to be properly
>>> aligned - therefore, do these stores as stores to volatile uint32_t*
>>> to avoid fusing them.
>> How about letting the compiler know that the pointers are unaligned
>> instead? Is attached patch working for you?
>
> Thanks, that's even better!
>
> This way the compiler has more freedom to reason about it and can choose
> another instruction with looser alignment requirements. Both GCC and Clang
> seem to compile it to a 16 byte VST, an unaligned SIMD store, which is
> probably much better than forcing the compiler to do a long sequence of
> 32 bit stores.
>
> Clang doesn't seem to know/exploit that the regular 32 bit store instructions
> work unaligned though, so the smaller stores get exploded into a long series
> of single byte writes. But I guess that's just a missed optimization
> opportunity in Clang, I'll see if I can report it.
FWIW this seems to be a target specific issue; Clang does optimize it
correctly for an armv7-linux-gnueabihf target, but not for armv7-windows.
I'll see about getting that fixed.
// Martin
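
For illustration, here is a minimal sketch of what "letting the compiler know
that the pointers are unaligned" can look like in C. It is not the patch
attached in this thread. The type and function names are hypothetical, and
the attributes are GCC/Clang extensions. The memcpy variant is a portable way
to express the same store.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical typedef: a 64 bit integer type declared with 1 byte
 * alignment, so the compiler must not assume a pointer to it is aligned.
 * __may_alias__ additionally allows it to access bytes of any object.
 * Both attributes are GCC/Clang extensions. */
typedef uint64_t __attribute__((__aligned__(1), __may_alias__)) unaligned_u64;

/* Store 8 bytes at a possibly unaligned address. The compiler is free to
 * pick any instruction sequence that is valid for unaligned addresses. */
static void store_u64_typedef(unsigned char *dst, uint64_t v)
{
    *(unaligned_u64 *)dst = v;
}

/* Portable alternative: memcpy makes no alignment assumption about dst,
 * and compilers typically lower a fixed-size memcpy to plain stores. */
static void store_u64_memcpy(unsigned char *dst, uint64_t v)
{
    memcpy(dst, &v, sizeof(v));
}
```

Either way the alignment information is part of the access itself, so the
compiler chooses the store instruction instead of being forced into a fixed
sequence of 32 bit stores.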