[PATCH v2 3/3] msvcrt: Add an SSE2 memset_aligned_32 implementation.
Piotr Caban
piotr.caban at gmail.com
Tue Sep 14 06:26:58 CDT 2021
On 9/14/21 1:01 PM, Rémi Bernon wrote:
> And what about using intel intrinsics? Like for instance:
>
>> #ifdef __SSE2__
>> #ifdef __i386__
>> if (sse2_supported)
>> #endif
>> {
>> __m128i x = _mm_set1_epi64x(v);
>> while (n >= 64)
>> {
>> _mm_store_si128((__m128i *)(d + n - 64), x);
>> _mm_store_si128((__m128i *)(d + n - 48), x);
>> _mm_store_si128((__m128i *)(d + n - 32), x);
>> _mm_store_si128((__m128i *)(d + n - 16), x);
>> n -= 64;
>> }
>> if (n >= 32)
>> {
>> _mm_store_si128((__m128i *)(d + n - 32), x);
>> _mm_store_si128((__m128i *)(d + n - 16), x);
>> }
>> return;
>> }
>> #endif
>
> In any case, if SSE is disabled at compile time, this code cannot take
> the SSE2 path at runtime, even on a CPU that supports it, whereas the
> assembly function could.
>
> Is this something we would like to have?
I don't think this is portable. A quick test shows that it doesn't
compile with x86_64-w64-mingw on my machine.
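
For reference, here is a minimal, self-contained sketch of the kind of
guarded intrinsics path being discussed. It is not the code from the
patch: fill_aligned_32 is a hypothetical name, the plain C fallback is an
assumption added so the example still builds when __SSE2__ is not defined,
and the i386 runtime check (sse2_supported in the quoted snippet) is left
out. It also assumes the toolchain ships <emmintrin.h> with
_mm_set1_epi64x, which is exactly the portability question here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef __SSE2__
#include <emmintrin.h> /* SSE2 intrinsics: __m128i, _mm_store_si128, ... */
#endif

/* Hypothetical helper (not the Wine function): fill n bytes at d with the
 * 64-bit pattern v. Assumes d is at least 16-byte aligned and n is a
 * multiple of 32, mirroring the preconditions of the quoted snippet. */
void fill_aligned_32(unsigned char *d, uint64_t v, size_t n)
{
    assert(((uintptr_t)d & 15) == 0);
    assert((n & 31) == 0);

#ifdef __SSE2__
    /* Broadcast the 64-bit pattern into both halves of a 128-bit register. */
    __m128i x = _mm_set1_epi64x((long long)v);

    /* Fill from the end of the buffer, 64 bytes per iteration. */
    while (n >= 64)
    {
        _mm_store_si128((__m128i *)(d + n - 64), x);
        _mm_store_si128((__m128i *)(d + n - 48), x);
        _mm_store_si128((__m128i *)(d + n - 32), x);
        _mm_store_si128((__m128i *)(d + n - 16), x);
        n -= 64;
    }
    /* At most one 32-byte block remains because n is a multiple of 32. */
    if (n >= 32)
    {
        _mm_store_si128((__m128i *)(d + n - 32), x);
        _mm_store_si128((__m128i *)(d + n - 16), x);
    }
#else
    /* Assumed fallback when SSE2 is disabled at compile time: copy the
     * pattern 8 bytes at a time. */
    while (n >= 8)
    {
        memcpy(d + n - 8, &v, sizeof(v));
        n -= 8;
    }
#endif
}
```

Because the choice is made by the preprocessor, a build without SSE2
enabled only ever contains the fallback branch, regardless of what the CPU
supports at runtime.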
Thanks,
Piotr