[PATCH v2 3/3] msvcrt: Add an SSE2 memset_aligned_32 implementation.

Piotr Caban piotr.caban at gmail.com
Tue Sep 14 06:26:58 CDT 2021


On 9/14/21 1:01 PM, Rémi Bernon wrote:
> And what about using intel intrinsics? Like for instance:
> 
>> #ifdef __SSE2__
>> #ifdef __i386__
>>     if (sse2_supported)
>> #endif
>>     {
>>         __m128i x = _mm_set1_epi64x(v);
>>         while (n >= 64)
>>         {
>>             _mm_store_si128((__m128i *)(d + n - 64), x);
>>             _mm_store_si128((__m128i *)(d + n - 48), x);
>>             _mm_store_si128((__m128i *)(d + n - 32), x);
>>             _mm_store_si128((__m128i *)(d + n - 16), x);
>>             n -= 64;
>>         }
>>         if (n >= 32)
>>         {
>>             _mm_store_si128((__m128i *)(d + n - 32), x);
>>             _mm_store_si128((__m128i *)(d + n - 16), x);
>>         }
>>         return;
>>     }
>> #endif
> 
> In all cases, if SSE is disabled at compile time, it will not be able to 
> use the SSE2 path at runtime, even if supported, which was possible with 
> the assembly function.
> 
> Is this something we would like to have?
I don't think this is portable. A quick test shows that it doesn't 
compile with x86_64-w64-mingw on my machine.
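
For reference, here is a self-contained sketch of the intrinsics approach quoted above, so it can be tried against different toolchains. The function name fill_aligned_32, its signature, the asserts and the non-SSE2 fallback are assumptions made for illustration, not the code from the patch. It assumes d is 16-byte aligned and n is a multiple of 32, and it keeps the limitation mentioned above: the SSE2 path only exists when SSE2 is enabled at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical helper (name and signature are illustrative only).
 * Fills n bytes at d with the 64-bit pattern v.
 * Assumes d is 16-byte aligned and n is a multiple of 32. */
static void fill_aligned_32(unsigned char *d, uint64_t v, size_t n)
{
    assert(((uintptr_t)d & 15) == 0);
    assert(n % 32 == 0);

#ifdef __SSE2__
    {
        /* Broadcast the 64-bit pattern into both halves of an XMM register. */
        __m128i x = _mm_set1_epi64x((long long)v);

        /* Store 64 bytes per iteration, working from the end of the buffer. */
        while (n >= 64)
        {
            _mm_store_si128((__m128i *)(d + n - 64), x);
            _mm_store_si128((__m128i *)(d + n - 48), x);
            _mm_store_si128((__m128i *)(d + n - 32), x);
            _mm_store_si128((__m128i *)(d + n - 16), x);
            n -= 64;
        }
        /* At most one 32-byte block remains, at the start of the buffer. */
        if (n >= 32)
        {
            _mm_store_si128((__m128i *)(d + n - 32), x);
            _mm_store_si128((__m128i *)(d + n - 16), x);
        }
    }
#else
    {
        /* Fallback when SSE2 is not enabled at compile time: 8 bytes at a time. */
        size_t i;
        for (i = 0; i < n; i += 8)
            memcpy(d + i, &v, sizeof(v));
    }
#endif
}
```

Whether this builds depends on the compiler providing emmintrin.h and defining __SSE2__, which is exactly the portability question here.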

Thanks,
Piotr


