[PATCH v2 3/3] msvcrt: Add an SSE2 memset_aligned_32 implementation.
Rémi Bernon
rbernon at codeweavers.com
Tue Sep 14 06:01:19 CDT 2021
On 9/14/21 12:49 PM, Piotr Caban wrote:
> On 9/14/21 11:05 AM, Rémi Bernon wrote:
>> +#ifdef __i386__
> #if defined(__i386__) && defined(__SSE2__)
>> + if (sse2_supported)
>> +#endif
>> + {
>> + unsigned int c = v;
>> + __asm__ __volatile__ (
>
> Thanks,
> Piotr
That doesn't seem to be enough: the inline assembly statement itself
also needs to be guarded for builds where SSE is disabled.
What about using Intel intrinsics instead? For instance:
> #ifdef __SSE2__
> #ifdef __i386__
> if (sse2_supported)
> #endif
> {
> __m128i x = _mm_set1_epi64x(v);
> while (n >= 64)
> {
> _mm_store_si128((__m128i *)(d + n - 64), x);
> _mm_store_si128((__m128i *)(d + n - 48), x);
> _mm_store_si128((__m128i *)(d + n - 32), x);
> _mm_store_si128((__m128i *)(d + n - 16), x);
> n -= 64;
> }
> if (n >= 32)
> {
> _mm_store_si128((__m128i *)(d + n - 32), x);
> _mm_store_si128((__m128i *)(d + n - 16), x);
> }
> return;
> }
> #endif
Either way, if SSE is disabled at compile time, the SSE2 path cannot be
used at runtime, even when the CPU supports it. That was possible with
the standalone assembly function.
Is this something we would like to have?
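
For illustration, one possible way to keep a runtime-selected SSE2 path
even when the file is built without -msse2 is a per-function target
attribute. This is only a rough sketch under assumptions: it relies on
the GCC/Clang-specific __attribute__((target("sse2"))) and uses
__builtin_cpu_supports() as a stand-in for the sse2_supported check. The
helper names memset_aligned_32_sse2 and memset_aligned_32_c are
hypothetical, and I have not checked whether this fits all the compilers
msvcrt is built with.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
#include <emmintrin.h>

/* Hypothetical helper: the target attribute asks GCC/Clang to allow SSE2
 * code generation in this one function, regardless of -mno-sse2. */
__attribute__((target("sse2")))
static void memset_aligned_32_sse2(unsigned char *d, uint64_t v, size_t n)
{
    __m128i x = _mm_set1_epi64x(v);

    /* d must be 16-byte aligned and n a multiple of 32. */
    while (n >= 32)
    {
        _mm_store_si128((__m128i *)(d + n - 32), x);
        _mm_store_si128((__m128i *)(d + n - 16), x);
        n -= 32;
    }
}
#endif

/* Hypothetical portable fallback: writes the 64-bit pattern 8 bytes at a time. */
static void memset_aligned_32_c(unsigned char *d, uint64_t v, size_t n)
{
    size_t i;

    for (i = 0; i + 8 <= n; i += 8)
        memcpy(d + i, &v, sizeof(v));
}

/* Picks the SSE2 path at runtime when the CPU reports support for it. */
static void memset_aligned_32(unsigned char *d, uint64_t v, size_t n)
{
#if defined(__i386__) || defined(__x86_64__)
    /* Assumption: stands in for the sse2_supported flag from the patch. */
    if (__builtin_cpu_supports("sse2"))
    {
        memset_aligned_32_sse2(d, v, n);
        return;
    }
#endif
    memset_aligned_32_c(d, v, n);
}
```

The trade-off is that this ties the code to GCC-compatible compilers,
whereas the separate assembly function has no such dependency.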
--
Rémi Bernon <rbernon at codeweavers.com>