[PATCH] msvcrt: SSE2 implementation of memcmp for x86_64.

Jin-oh Kang jinoh.kang.kr at gmail.com
Sat Apr 2 05:51:11 CDT 2022


Wouldn't it make more sense to simply import optimized memory routines
from other libc implementations? They have specialised implementations
for various architectures and microarchitectures (tuned for, e.g., cache
line size), not to mention the performance enhancements that have
accumulated over time.

Also worth noting is that Wine is licensed under the LGPL, which makes it
compatible with most open-source libcs out there. Basically, all we would
need are some ABI adaptations, such as calling convention adjustments and
SEH support.
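
To illustrate the calling convention part only, here is a minimal sketch,
assuming GCC or Clang on x86_64. The names SKETCH_MS_ABI and
sketch_memcmp_entry are hypothetical, and the host memcmp merely stands in
for whatever optimized routine would be imported. SEH is not covered.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: request the Microsoft x64 calling convention where
 * the compiler supports it (GCC/Clang on x86_64); otherwise a no-op. */
#if defined(__GNUC__) && defined(__x86_64__)
#define SKETCH_MS_ABI __attribute__((ms_abi))
#else
#define SKETCH_MS_ABI
#endif

/* Hypothetical entry point: callers use the Microsoft ABI, while the body
 * calls a routine compiled with the platform's native ABI. The compiler
 * generates the argument shuffling between the two conventions. */
int SKETCH_MS_ABI sketch_memcmp_entry(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n); /* stand-in for an imported optimized routine */
}
```

With this shape the imported routine itself would not need to be rewritten
for the other calling convention; only the thin entry point would.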

Another option is to just call system libc routines directly, although
that might interfere with stack unwinding, the clean PE/Unix separation,
and msvcrt hotpatching.

On Sat, Apr 2, 2022, 1:45 PM Elaine Lefler <elaineclefler at gmail.com> wrote:

> On Fri, Apr 1, 2022 at 7:13 AM Jan Sikorski <jsikorski at codeweavers.com>
> wrote:
> >
> > Signed-off-by: Jan Sikorski <jsikorski at codeweavers.com>
> > ---
> > It's about 13x faster on my machine than the byte version.
> > memcmp performance is important to wined3d, where it's used to find
> > pipelines in the cache, and the keys are pretty big.
>
> Should be noted that SSE2 also exists on 32-bit processors, and in
> this same file you can find usage of "sse2_supported", which would
> enable you to use this code path on i386. You can put
> __attribute__((target("sse2"))) on the declaration of sse2_memcmp to
> allow GCC to emit SSE2 instructions even when the file's architecture
> forbids it.
>
> I think this could be even faster if you forced ptr1 to be aligned by
> byte-comparing up to ((p1 + 15) & ~15) at the beginning. Can't
> reasonably force-align both pointers, but aligning at least one should
> give measurably better performance.
>
> I have a similar patch (labelled 230501 on
> https://source.winehq.org/patches/ - not sure how to link the whole
> discussion, sorry) which triggered a discussion about duplication
> between ntdll and msvcrt. memcmp is also a function that appears in
> both dlls. Do you have any input on that? (sorry if I'm out of line
> for butting in here. I just noticed we're working on the same basic
> thing)
>
> - Elaine
>
>
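
For reference, the two suggestions quoted above (the target("sse2")
attribute and byte-comparing until ptr1 is 16-byte aligned) could be
sketched roughly as follows. This is not the code from the patch: the
function name sketch_sse2_memcmp is hypothetical, and the sketch assumes GCC
or Clang on x86 with <emmintrin.h> available.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function. The target attribute lets the compiler emit SSE2
 * instructions even if the rest of the file is built without them. */
#if defined(__GNUC__)
__attribute__((target("sse2")))
#endif
static int sketch_sse2_memcmp(const void *ptr1, const void *ptr2, size_t n)
{
    const unsigned char *p1 = ptr1, *p2 = ptr2;

    /* Byte-compare until p1 reaches a 16-byte boundary. */
    while (n && ((uintptr_t)p1 & 15))
    {
        if (*p1 != *p2) return *p1 < *p2 ? -1 : 1;
        p1++; p2++; n--;
    }

    /* p1 is now aligned, so use an aligned load for it and an unaligned
     * load for p2, which may have any alignment. */
    while (n >= 16)
    {
        __m128i a = _mm_load_si128((const __m128i *)p1);
        __m128i b = _mm_loadu_si128((const __m128i *)p2);
        unsigned int mask = (unsigned int)_mm_movemask_epi8(_mm_cmpeq_epi8(a, b));
        if (mask != 0xffff) break; /* mismatch: the byte loop below locates it */
        p1 += 16; p2 += 16; n -= 16;
    }

    /* Tail bytes, or the 16-byte chunk that contained a mismatch. */
    while (n)
    {
        if (*p1 != *p2) return *p1 < *p2 ? -1 : 1;
        p1++; p2++; n--;
    }
    return 0;
}
```

On i386 the caller would still have to check at run time that the CPU
supports SSE2 (the quoted mail mentions "sse2_supported" for that) before
taking this path.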