[PATCH v2 0/4] MR375: ntdll: Fixes for runtime 64-bit shift functions.

Zebediah Figura (she/her) zfigura at codeweavers.com
Wed Jul 6 22:31:10 CDT 2022


On 7/6/22 10:50, Jinoh Kang (@iamahuman) wrote:
> Jinoh Kang (@iamahuman) commented about dlls/ntdll/large_int.c:
>>       return udivmod(a, b, NULL);
>>   }
>>   
>> +
>> +LONGLONG __regs__allshl( LONGLONG a, unsigned char b )
>> +{
>> +    const LARGE_INTEGER x = { .QuadPart = a };
>> +    LARGE_INTEGER ret;
>> +
>> +    if (b >= 64)
>> +        return 0;
> It appears that GCC's optimizer is having a hard time dealing with mixing full (64-bit) and partial (32-bit) writes.
> 
> Compare:
> 
> - https://godbolt.org/z/KvdGfr4bY (original)
> 
> With:
> 
> - https://godbolt.org/z/vG5TGrhaY (64-bit return elided)
> 
> I'd suggest folding the special case into the if statement below.
> We do already lose performance by thunking, but it's still a good idea not to slow down (or, rather, amplify the I-cache usage of) a builtin too much.
> 
> Meanwhile, clang does not appear to suffer from this problem.
> 

It doesn't seem to be about mixing writes: I get the "bad" pattern even
if I write the high and low parts independently. Rather, GCC doesn't
seem to be able to CSE (common subexpression elimination) the zero write
to LowPart.

Since it's a simple enough tweak, I'll submit a new version that's
friendlier to GCC codegen.
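
For readers following along, here is a rough sketch of the two shapes
under discussion. It is not the code from the patch: large_int_sketch is
a hypothetical stand-in for LARGE_INTEGER (little-endian layout assumed,
both halves unsigned to keep the shifts well defined), and the function
names shl_early_return, shl_folded and check_equivalence are made up for
illustration. The first variant has the separate early "return 0"; the
second folds the b >= 64 case into the b >= 32 branch, as suggested in
the review, so that LowPart is zeroed in a single place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for LARGE_INTEGER; little-endian layout assumed. */
typedef union
{
    struct { uint32_t LowPart; uint32_t HighPart; } u;
    int64_t QuadPart;
} large_int_sketch;

/* Shape from the quoted hunk: a separate full 64-bit return of zero. */
static int64_t shl_early_return( int64_t a, unsigned char b )
{
    const large_int_sketch x = { .QuadPart = a };
    large_int_sketch ret;

    if (b >= 64)
        return 0;

    if (b >= 32)
    {
        ret.u.HighPart = x.u.LowPart << (b & 31);
        ret.u.LowPart = 0;
    }
    else
    {
        /* (low >> 1) >> (31 - b) avoids an undefined 32-bit shift when b == 0 */
        ret.u.HighPart = (x.u.HighPart << b) | ((x.u.LowPart >> 1) >> (31 - b));
        ret.u.LowPart = x.u.LowPart << b;
    }
    return ret.QuadPart;
}

/* Folded shape: b >= 64 is handled inside the b >= 32 branch. */
static int64_t shl_folded( int64_t a, unsigned char b )
{
    const large_int_sketch x = { .QuadPart = a };
    large_int_sketch ret;

    if (b >= 32)
    {
        ret.u.HighPart = (b >= 64) ? 0 : x.u.LowPart << (b & 31);
        ret.u.LowPart = 0;
    }
    else
    {
        ret.u.HighPart = (x.u.HighPart << b) | ((x.u.LowPart >> 1) >> (31 - b));
        ret.u.LowPart = x.u.LowPart << b;
    }
    return ret.QuadPart;
}

/* Both variants should agree with a plain 64-bit shift for every count. */
static void check_equivalence( void )
{
    static const int64_t samples[] = { 0, 1, -1, 0x123456789LL, INT64_MIN, INT64_MAX };
    size_t i;
    unsigned int b;

    for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
    {
        for (b = 0; b < 256; b++)
        {
            const int64_t expect = b < 64 ? (int64_t)((uint64_t)samples[i] << b) : 0;
            assert( shl_early_return( samples[i], (unsigned char)b ) == expect );
            assert( shl_folded( samples[i], (unsigned char)b ) == expect );
        }
    }
}
```

The two functions are meant to be behaviourally identical; the only
point of interest is how each one compiles, which can be checked by
building the sketch with optimization enabled and comparing the
generated assembly for the two.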


