<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">Regarding memcpy performance, I also
recently came across suboptimal memcpy / memmove performance
while doing perf analysis of the game Shadow of the Tomb Raider. While
in that case I did not find memcpy to be responsible for any
significant slowdown (maybe ~2-3 fps at most, together with the math
function implementations), it drew attention by consistently
appearing in perf top and taking a measurable share of CPU
time.</div>
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">I am attaching a very short test
program. Under Wine it runs in ~7.4s with the builtin vcruntime140
here and in ~2s with the native vcruntime140 (compiled with
x86_64-w64-mingw32-gcc ./memcpyperf.c -o memcpyperf).</div>
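<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">For readers without the attachment, a
benchmark of this kind can be as small as the sketch below. It is
not the attached memcpyperf.c: the function name time_memcpy and
the way the loop is kept alive are assumptions made for
illustration, and the timings above were not measured with
it.</div>
<pre>
```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not the attached memcpyperf.c): copies `size`
 * bytes `iterations` times and returns the elapsed CPU time in seconds.
 * Returns a negative value if `size` is 0 or an allocation fails. */
double time_memcpy(size_t size, size_t iterations)
{
    unsigned char *src, *dst;
    volatile unsigned char sink = 0;
    clock_t start, end;
    size_t i;

    if (size == 0)
        return -1.0;

    src = malloc(size);
    dst = malloc(size);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size);
    memset(dst, 0, size);

    start = clock();
    for (i = 0; i < iterations; i++) {
        /* Change the source each round so the copies differ. */
        src[0] = (unsigned char)i;
        memcpy(dst, src, size);
        /* Read the result through a volatile so it counts as used. */
        sink ^= dst[size - 1];
    }
    end = clock();

    /* Sanity check: the last copy must have reproduced the source. */
    assert(memcmp(dst, src, size) == 0);

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```
</pre>
<div class="moz-cite-prefix">Calling time_memcpy(4096, 1000000) from
a small main and printing the result gives one data point per
runtime. Depending on compiler flags, GCC may expand memcpy inline
instead of calling the C runtime; building with -fno-builtin-memcpy
is one way to keep the call going to the library.</div>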
<div class="moz-cite-prefix"><br>
</div>
<div class="moz-cite-prefix">On 8/14/20 11:27, <a class="moz-txt-link-abbreviated" href="mailto:piotr@codeweavers.com">piotr@codeweavers.com</a>
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:2ac10925-2605-47d1-af97-95325b7b4e89@email.android.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="auto">
<div dir="auto">Hi Fabian,
<div dir="auto"><br>
</div>
<div dir="auto">I'll be back from vacation on Monday
(currently I have very limited internet access). I'll look
into it then.</div>
<div dir="auto"><br>
</div>
<div dir="auto">I'm not sure how complicated the assembly
implementation is, but I expect that a separate
assembly file will not be needed. Also, AFAIK, we can't take
the implementation from glibc. It would also be useful to
know how efficient the Microsoft implementation is.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Musl also has platform-specific
implementations of memmove (for i386 and x64) written in
assembly. I bet it should be good enough for Wine.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Thanks,</div>
<div dir="auto">Piotr</div>
</div>
<div><br>
<div class="elided-text">On Aug 12, 2020 23:33, Fabian Maurer
<a class="moz-txt-link-rfc2396E" href="mailto:dark.shadow4@web.de">&lt;dark.shadow4@web.de&gt;</a> wrote:<br type="attribution">
<blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<p dir="ltr">Hello,
<br>
<br>
since msvcrt isn't relying on the standard library
memmove/memcpy anymore,
<br>
there's been a pretty bad performance regression. See
<a class="moz-txt-link-freetext" href="https://bugs.winehq.org/show_bug.cgi?id=49663">https://bugs.winehq.org/show_bug.cgi?id=49663</a>.
<br>
<br>
For the best performance, and since those memory
operations are pretty common,
<br>
we'd presumably like to optimize them as much as
possible. You might have seen
<br>
my patch for an implementation from musl, although
Zebediah rightfully pointed
<br>
out we might want to opt for the best performance we can
get...
<br>
glibc currently offers the best performance, thanks to
SSE/AVX implementations
<br>
and runtime selection of the best supported path.
<br>
<br>
First, would you have any objections to adding specialized
paths written in
<br>
assembly for x86?
<br>
And if we were to add them, would we link against
assembly files, or somehow
<br>
transform them into inline assembly? AFAIK, Wine doesn't
come with pure
<br>
assembly files yet...
<br>
<br>
If you want, I could set up a few crude benchmarks to
see how different
<br>
versions compare.
<br>
<br>
Regards,
<br>
Fabian Maurer
<br>
<br>
<br>
</p>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>