[PATCH v2 3/3] msvcrt: Add an SSE2 memset_aligned_32 implementation.

Rémi Bernon rbernon at codeweavers.com
Tue Sep 14 06:01:19 CDT 2021


On 9/14/21 12:49 PM, Piotr Caban wrote:
> On 9/14/21 11:05 AM, Rémi Bernon wrote:
>> +#ifdef __i386__
> #if defined(__i386__) && defined(__SSE2__)
>> +    if (sse2_supported)
>> +#endif
>> +    {
>> +        unsigned int c = v;
>> +        __asm__ __volatile__ (
> 
> Thanks,
> Piotr

That doesn't seem to be enough: the inline assembly statement itself
also needs to be guarded when SSE is disabled.

And what about using Intel intrinsics? For instance:

> #ifdef __SSE2__
> #ifdef __i386__
>     if (sse2_supported)
> #endif
>     {
>         __m128i x = _mm_set1_epi64x(v);
>         while (n >= 64)
>         {
>             _mm_store_si128((__m128i *)(d + n - 64), x);
>             _mm_store_si128((__m128i *)(d + n - 48), x);
>             _mm_store_si128((__m128i *)(d + n - 32), x);
>             _mm_store_si128((__m128i *)(d + n - 16), x);
>             n -= 64;
>         }
>         if (n >= 32)
>         {
>             _mm_store_si128((__m128i *)(d + n - 32), x);
>             _mm_store_si128((__m128i *)(d + n - 16), x);
>         }
>         return;
>     }
> #endif
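
For reference, here is roughly how the snippet above could look as a
standalone function that compiles on its own. This is only a sketch: it
assumes d is 32-byte aligned and n is a multiple of 32 (as the function
name suggests), that v already holds the fill byte replicated over 64
bits, and it leaves out the sse2_supported runtime check. The scalar
fallback is purely illustrative and is not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Assumption: d is 32-byte aligned and n is a multiple of 32. */
static void memset_aligned_32(unsigned char *d, uint64_t v, size_t n)
{
#ifdef __SSE2__
    /* Replicate the 64-bit pattern into both halves of an XMM register. */
    __m128i x = _mm_set1_epi64x(v);

    /* Fill from the end of the buffer, 64 bytes per iteration. */
    while (n >= 64)
    {
        _mm_store_si128((__m128i *)(d + n - 64), x);
        _mm_store_si128((__m128i *)(d + n - 48), x);
        _mm_store_si128((__m128i *)(d + n - 32), x);
        _mm_store_si128((__m128i *)(d + n - 16), x);
        n -= 64;
    }
    /* At most one 32-byte chunk is left at the start of the buffer. */
    if (n >= 32)
    {
        _mm_store_si128((__m128i *)(d + n - 32), x);
        _mm_store_si128((__m128i *)(d + n - 16), x);
    }
#else
    /* Illustrative fallback when SSE2 is disabled at compile time:
     * copy the 64-bit pattern 8 bytes at a time. */
    size_t i;
    for (i = 0; i < n; i += 8) memcpy(d + i, &v, sizeof(v));
#endif
}
```

Note that _mm_store_si128 requires a 16-byte aligned address, so the
alignment assumption matters for every store above.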

Either way, if SSE is disabled at compile time, the SSE2 path cannot be
used at runtime even when the CPU supports it. That was possible with
the assembly function.
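
One possible way to keep a runtime-selected SSE2 path with intrinsics,
sketched here only as an idea: GCC and Clang accept a per-function
target attribute, which enables SSE2 code generation for that function
even when the rest of the file is built without it. The names
fill_sse2, fill_scalar, fill_aligned_32 and HAVE_FILL_SSE2 are
hypothetical, and __builtin_cpu_supports stands in for the
sse2_supported check from the patch. It assumes a compiler recent
enough to allow including emmintrin.h without -msse2, and the same
alignment and size preconditions as above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <emmintrin.h>
#define HAVE_FILL_SSE2 1

/* SSE2 is enabled for this function only, whatever the global flags are.
 * Assumption: d is 16-byte aligned and n is a multiple of 16. */
static void __attribute__((target("sse2")))
fill_sse2(unsigned char *d, uint64_t v, size_t n)
{
    __m128i x = _mm_set1_epi64x(v);
    size_t i;
    for (i = 0; i < n; i += 16) _mm_store_si128((__m128i *)(d + i), x);
}
#endif

/* Portable path: copy the 64-bit pattern 8 bytes at a time.
 * Assumption: n is a multiple of 8. */
static void fill_scalar(unsigned char *d, uint64_t v, size_t n)
{
    size_t i;
    for (i = 0; i < n; i += 8) memcpy(d + i, &v, sizeof(v));
}

/* Hypothetical dispatcher: pick the SSE2 path at runtime when the CPU
 * reports support, otherwise fall back to the scalar loop. */
static void fill_aligned_32(unsigned char *d, uint64_t v, size_t n)
{
#ifdef HAVE_FILL_SSE2
    if (__builtin_cpu_supports("sse2"))
    {
        fill_sse2(d, v, n);
        return;
    }
#endif
    fill_scalar(d, v, n);
}
```

The trade-off is that this relies on compiler-specific extensions,
where the assembly function did not depend on how the file was built.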

Is this something we would like to have?
-- 
Rémi Bernon <rbernon at codeweavers.com>


