[PATCH] msvcrt: Import memmove from musl

Gabriel Ivăncescu gabrielopcode at gmail.com
Wed Aug 26 09:01:04 CDT 2020


On 25/08/2020 20:15, Piotr Caban wrote:
> On 8/22/20 5:10 PM, Gabriel Ivăncescu wrote:
>> I understand `rep movsl` is faster even in the first test than `rep 
>> movsb`?
> No, it was faster in "Non-aligned", "Aligned overlap" and "Non-aligned 
> overlap" tests. In the "Aligned" case the performance was identical no 
> matter if movsb or movsl was used.
> 
> I'm also attaching simple sse2 implementation for comparison. It's 
> faster than the previous one on my machine. I'm also attaching results 
> from running the test on Windows (in VM).
> 
> Thanks,
> Piotr

In most cases the SSE2 version performs very well, in fact slightly 
better than the Windows implementation, and it does very well for small 
moves.
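
For readers without the attachment, here is a rough sketch of the general
shape an SSE2-based memmove can take. It is NOT the attached implementation:
the name sketch_memmove_sse2 and all of its details are hypothetical. It
assumes an x86 target with SSE2 enabled, and it only shows the two ideas the
tests exercise: 16-byte unaligned loads/stores, and picking the copy
direction so that overlapping regions stay correct.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the attached patch. */
static void *sketch_memmove_sse2(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    if (d == s || n == 0)
        return dst;

    if ((uintptr_t)d - (uintptr_t)s >= n) {
        /* dst is below src, or the regions are disjoint: copy forward. */
        while (n >= 16) {
            __m128i v = _mm_loadu_si128((const __m128i *)s);
            _mm_storeu_si128((__m128i *)d, v);
            s += 16; d += 16; n -= 16;
        }
        while (n--)            /* byte-wise tail */
            *d++ = *s++;
    } else {
        /* dst starts inside [src, src + n): copy backward from the end. */
        d += n; s += n;
        while (n >= 16) {
            __m128i v;
            s -= 16; d -= 16; n -= 16;
            v = _mm_loadu_si128((const __m128i *)s);
            _mm_storeu_si128((__m128i *)d, v);
        }
        while (n--)            /* byte-wise tail */
            *--d = *--s;
    }
    return dst;
}
```

A real implementation would probably also align the destination and handle
small sizes separately; the sketch leaves that out on purpose.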

Unfortunately, for some reason it is significantly slower (20% or more) 
only in the non-overlapping cases, i.e. the "Aligned" and "Non-aligned" 
tests. Results attached.
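
To make the comparison concrete, the slowdown can be computed from the
elapsed times in the attached log. The helper below is only an illustration
(slowdown_percent is a made-up name, not part of the test program):

```c
#include <assert.h>

/* Relative slowdown of candidate_ms versus baseline_ms, in percent.
 * A positive result means the candidate is slower than the baseline. */
static double slowdown_percent(double candidate_ms, double baseline_ms)
{
    assert(baseline_ms > 0.0);
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With the attached numbers, the sse2 "Aligned" run (3653ms) against the
assembler version (3030ms) comes out at roughly 20.6%, and against ucrtbase
(2549ms) at roughly 43%. For "Non-aligned" (3633ms) it is roughly 19%
against the assembler version (3051ms) and roughly 24% against ucrtbase
(2934ms).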

Thanks,
Gabriel
-------------- next part --------------
Test ucrtbase implementation
Aligned Elapsed time 2549ms.
Non-aligned Elapsed time 2934ms.
Aligned overlap Elapsed time 2725ms.
Non-aligned overlap Elapsed time 2780ms.
src==dst Elapsed time 2305ms.
Small moves Elapsed time 317ms.
Small moves Elapsed time 315ms.
Small moves Elapsed time 307ms.
correctness test Elapsed time 2169ms.
Test vcruntime140 implementation
Aligned Elapsed time 2511ms.
Non-aligned Elapsed time 2904ms.
Aligned overlap Elapsed time 2697ms.
Non-aligned overlap Elapsed time 2737ms.
src==dst Elapsed time 2304ms.
Small moves Elapsed time 319ms.
Small moves Elapsed time 316ms.
Small moves Elapsed time 302ms.
correctness test Elapsed time 2144ms.
Test msvcr100 implementation
Aligned Elapsed time 3615ms.
Non-aligned Elapsed time 2959ms.
Aligned overlap Elapsed time 3054ms.
Non-aligned overlap Elapsed time 2827ms.
src==dst Elapsed time 2386ms.
Small moves Elapsed time 114ms.
Small moves Elapsed time 195ms.
Small moves Elapsed time 329ms.
correctness test Elapsed time 2148ms.
Test msvcrt implementation
Aligned Elapsed time 3641ms.
Non-aligned Elapsed time 2998ms.
Aligned overlap Elapsed time 2822ms.
Non-aligned overlap Elapsed time 2937ms.
src==dst Elapsed time 2407ms.
Small moves Elapsed time 128ms.
Small moves Elapsed time 195ms.
Small moves Elapsed time 327ms.
correctness test Elapsed time 2149ms.
Test assembler implementation
Aligned Elapsed time 3030ms.
Non-aligned Elapsed time 3051ms.
Aligned overlap Elapsed time 2826ms.
Non-aligned overlap Elapsed time 2870ms.
src==dst Elapsed time 2839ms.
Small moves Elapsed time 147ms.
Small moves Elapsed time 221ms.
Small moves Elapsed time 298ms.
correctness test Elapsed time 2152ms.
Test sse2 implementation
Aligned Elapsed time 3653ms.
Non-aligned Elapsed time 3633ms.
Aligned overlap Elapsed time 2558ms.
Non-aligned overlap Elapsed time 2538ms.
src==dst Elapsed time 2536ms.
Small moves Elapsed time 145ms.
Small moves Elapsed time 173ms.
Small moves Elapsed time 121ms.
correctness test Elapsed time 2155ms.
done

