[PATCH v2 0/4] MR375: ntdll: Fixes for runtime 64-bit shift functions.
Zebediah Figura (she/her)
zfigura at codeweavers.com
Wed Jul 6 22:31:10 CDT 2022
On 7/6/22 10:50, Jinoh Kang (@iamahuman) wrote:
> Jinoh Kang (@iamahuman) commented about dlls/ntdll/large_int.c:
>> return udivmod(a, b, NULL);
>> }
>>
>> +
>> +LONGLONG __regs__allshl( LONGLONG a, unsigned char b )
>> +{
>> + const LARGE_INTEGER x = { .QuadPart = a };
>> + LARGE_INTEGER ret;
>> +
>> + if (b >= 64)
>> + return 0;
> It appears that GCC's optimizer is having a hard time dealing with mixing full (64-bit) and partial (32-bit) writes.
>
> Compare:
>
> - https://godbolt.org/z/KvdGfr4bY (original)
>
> With:
>
> - https://godbolt.org/z/vG5TGrhaY (64-bit return elided)
>
> I'd suggest folding the special case into the if statement below.
> We do already lose performance by thunking, but it's still a good idea not to slow down (or, rather, amplify the I-cache usage of) a builtin too much.
>
> Meanwhile, clang does not appear to suffer from this problem.
>
It doesn't seem to be about mixing writes (I get the "bad" pattern even
if I write the high and low parts independently), but rather GCC doesn't
seem to be able to CSE the zero write to LowPart.
Since it's a simple enough tweak I'll submit a new version that's
friendlier to gcc codegen.
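
For illustration only, here is a rough sketch of what folding the b >= 64
special case into the b >= 32 branch could look like, so that the zero
write to the low half appears exactly once. This is not the submitted
patch: the name shl64_sketch and the use of plain stdint types instead of
LARGE_INTEGER are assumptions made to keep the example self-contained, and
the b == 0 branch exists only because shifting a 32-bit value by 32 is
undefined in portable C.

```c
#include <stdint.h>

/* Hypothetical stand-in for the function under review; not the Wine code. */
uint64_t shl64_sketch(uint64_t a, unsigned char b)
{
    const uint32_t lo = (uint32_t)a;
    const uint32_t hi = (uint32_t)(a >> 32);
    uint32_t ret_lo, ret_hi;

    if (b >= 32)
    {
        /* One zero write to the low half covers both b >= 64 and 32 <= b < 64. */
        ret_hi = (b >= 64) ? 0 : lo << (b & 31);
        ret_lo = 0;
    }
    else if (b == 0)
    {
        /* Handled separately: lo >> 32 would be undefined behaviour in C. */
        ret_hi = hi;
        ret_lo = lo;
    }
    else
    {
        /* 1 <= b <= 31: the top bits of the low half carry into the high half. */
        ret_hi = (hi << b) | (lo >> (32 - b));
        ret_lo = lo << b;
    }
    return ((uint64_t)ret_hi << 32) | ret_lo;
}
```

Whether a given GCC version actually emits tighter code for this shape
would need to be checked on godbolt, as with the links quoted above.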