[PATCH] msvcrt: Import memmove from musl

Gabriel Ivăncescu gabrielopcode at gmail.com
Sat Aug 22 10:10:53 CDT 2020


On 21/08/2020 20:21, Piotr Caban wrote:
> Hi Gabriel,
> 
> I was experimenting with various attempts of implementing memmove. I'm 
> attaching a modified version of Paul's test application. It compares 
> memmove performance from ucrtbase, msvcr100 and msvcrt dlls. It also 
> contains assembler (i386) implementation of the function.
> 
> Thanks,
> Piotr

Hi Piotr,

Here are the results on a Haswell Xeon E3-1241 v3 CPU (all 32-bit to 
compare with your assembly implementation). I've also added an extra 
test (attached function) that simply uses `rep movsb`.

Quick Summary: Your assembler implementation is very good overall 
compared to the one from Windows 10 (ucrtbase). The only cases where it 
is noticeably slower are "aligned non-overlap" (the first test, about 
10%: 2940ms vs. 2659ms) and src==dst (2813ms vs. 2345ms). In the other 
large-copy cases it performs just as well as ucrtbase.

The simple "rep movsb" function I added as a quick test is also faster 
than your assembly implementation for this case only (aligned, 
non-overlapped).

However, it is extremely slow in the overlap cases, where we copy 
backwards: about 5910ms versus roughly 2800-2870ms for every other 
implementation. I guess the CPU does not optimize `rep movsb` when 
copying backwards. Also, do I understand correctly that on your CPU 
`rep movsl` is faster than `rep movsb` even in the first test?
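
For readers without the attachment, here is a minimal sketch of what a 
`rep movsb` based memmove can look like. It is not the attached file: 
the function name, the overlap check and the non-x86 fallback are my 
own assumptions, and it relies on GCC-style inline assembly. The 
backward path sets the direction flag with `std`, which is the path 
that was slow in the overlap tests.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the code from the attachment. */
void *rep_movsb_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Unsigned distance trick: a forward copy is safe unless dst lies
     * inside [src, src + n). For n == 0 this is always true. */
    int forward = (uintptr_t)d - (uintptr_t)s >= n;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    if (forward) {
        /* DF is clear by ABI convention, so movsb counts upwards. */
        __asm__ volatile ("rep movsb"
                          : "+D"(d), "+S"(s), "+c"(n)
                          :
                          : "memory");
    } else {
        /* Start at the last byte and copy downwards; restore DF after. */
        d += n - 1;
        s += n - 1;
        __asm__ volatile ("std\n\trep movsb\n\tcld"
                          : "+D"(d), "+S"(s), "+c"(n)
                          :
                          : "memory", "cc");
    }
#else
    /* Plain byte loops so the sketch still builds on other targets. */
    if (forward) {
        while (n--) *d++ = *s++;
    } else {
        while (n--) d[n] = s[n];
    }
#endif
    return dst;
}
```

Note that with this particular check src==dst also takes the backward 
path, since the distance is 0 and therefore smaller than n.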

One last thing worth mentioning is the "small moves" case: the older 
runtimes (msvcr100 and msvcrt) do much better here in the first two of 
the three "Small moves" tests. I think we can handle those separately, 
without using movsb/movsl, which as I understand it need some startup 
time for the CPU to do alignment checks and so on before copying at 
full bandwidth.
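
One possible way to handle small sizes separately is sketched below. 
This is only an illustration of a common technique, not something 
taken from any of the tested runtimes: the function name and the 
16-byte cutoff are assumptions. The idea is to load the head and the 
tail of the source into locals first and only then store them, so 
overlapping buffers are handled without choosing a copy direction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMALL_MOVE_MAX 16  /* hypothetical cutoff, would need tuning */

/* Hypothetical helper: returns 1 if the move was handled here, 0 if the
 * caller should fall back to the generic (e.g. rep movs) path. */
int small_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (n > SMALL_MOVE_MAX)
        return 0;

    if (n >= 8) {
        /* Two possibly overlapping 8-byte chunks cover 8..16 bytes.
         * Both loads complete before either store. */
        uint64_t head, tail;
        memcpy(&head, s, 8);
        memcpy(&tail, s + n - 8, 8);
        memcpy(d, &head, 8);
        memcpy(d + n - 8, &tail, 8);
    } else if (n >= 4) {
        uint32_t head, tail;
        memcpy(&head, s, 4);
        memcpy(&tail, s + n - 4, 4);
        memcpy(d, &head, 4);
        memcpy(d + n - 4, &tail, 4);
    } else if (n >= 2) {
        uint16_t head, tail;
        memcpy(&head, s, 2);
        memcpy(&tail, s + n - 2, 2);
        memcpy(d, &head, 2);
        memcpy(d + n - 2, &tail, 2);
    } else if (n == 1) {
        *d = *s;
    }
    return 1;
}
```

The fixed-size memcpy calls into locals are normally compiled to single 
unaligned loads and stores, so there is no loop and no string 
instruction on this path. Whether it actually beats the older runtimes 
would have to be measured with the test application.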

Here's the entire log:

Test ucrtbase implementation
Aligned Elapsed time 2659ms.
Non-aligned Elapsed time 3004ms.
Aligned overlap Elapsed time 2817ms.
Non-aligned overlap Elapsed time 2871ms.
src==dst Elapsed time 2345ms.
Small moves Elapsed time 310ms.
Small moves Elapsed time 313ms.
Small moves Elapsed time 308ms.
correctness test Elapsed time 2163ms.
Test msvcr100 implementation
Aligned Elapsed time 3674ms.
Non-aligned Elapsed time 2998ms.
Aligned overlap Elapsed time 2808ms.
Non-aligned overlap Elapsed time 2853ms.
src==dst Elapsed time 2397ms.
Small moves Elapsed time 115ms.
Small moves Elapsed time 196ms.
Small moves Elapsed time 328ms.
correctness test Elapsed time 2142ms.
Test msvcrt implementation
Aligned Elapsed time 3669ms.
Non-aligned Elapsed time 2967ms.
Aligned overlap Elapsed time 2829ms.
Non-aligned overlap Elapsed time 2872ms.
src==dst Elapsed time 2410ms.
Small moves Elapsed time 129ms.
Small moves Elapsed time 197ms.
Small moves Elapsed time 332ms.
correctness test Elapsed time 2168ms.
Test assembler implementation
Aligned Elapsed time 2940ms.
Non-aligned Elapsed time 2985ms.
Aligned overlap Elapsed time 2809ms.
Non-aligned overlap Elapsed time 2848ms.
src==dst Elapsed time 2813ms.
Small moves Elapsed time 271ms.
Small moves Elapsed time 491ms.
Small moves Elapsed time 292ms.
correctness test Elapsed time 2156ms.
Test rep movsb implementation
Aligned Elapsed time 2731ms.
Non-aligned Elapsed time 3042ms.
Aligned overlap Elapsed time 5910ms.
Non-aligned overlap Elapsed time 5910ms.
src==dst Elapsed time 5912ms.
Small moves Elapsed time 289ms.
Small moves Elapsed time 287ms.
Small moves Elapsed time 288ms.
correctness test Elapsed time 2181ms.
done
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rep_movsb_memmove.c
Type: text/x-csrc
Size: 375 bytes
Desc: not available
URL: <http://www.winehq.org/pipermail/wine-devel/attachments/20200822/f9990cac/attachment.c>

