<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">Regarding memcpy performance, I also
      recently came across suboptimal memcpy / memmove performance
      while doing perf analysis of the game Shadow of the Tomb Raider. While
      in that case I did not find memcpy to be responsible for any
      significant slowdown (maybe ~2-3 fps at most, together with the math
      function implementations), it drew attention by consistently
      appearing in perf top and taking a measurable share of CPU
      time.</div>
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">I am attaching a very short test
      program. It runs in ~7.4s using builtin vcruntime140 here and in ~2s
      using native vcruntime140 under Wine (compiled with
      x86_64-w64-mingw32-gcc ./memcpyperf.c -o memcpyperf).</div>
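    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">For readers without the attachment, a
      minimal sketch of this kind of benchmark might look like the
      following. It is not the attached memcpyperf.c: the helper name
      time_memcpy, the fill pattern and the use of clock() are
      assumptions made for illustration. The call goes through a
      volatile function pointer so that the compiler is less likely to
      replace it with an inlined builtin instead of calling the
      runtime's memcpy.</div>

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not the attached memcpyperf.c): copies `size` bytes
 * `iterations` times and returns the elapsed CPU time in seconds.
 * Returns a negative value if size is 0 or an allocation fails. */
double time_memcpy(size_t size, unsigned long iterations)
{
    /* Volatile pointer: forces a real call into the C runtime's memcpy
     * rather than letting the compiler expand the copy inline. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;
    volatile unsigned char sink = 0;
    unsigned char *src, *dst;
    clock_t start, end;
    unsigned long i;

    if (size == 0) return -1.0;

    src = malloc(size);
    dst = malloc(size);
    if (!src || !dst)
    {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0x5a, size);

    start = clock();
    for (i = 0; i < iterations; i++)
    {
        src[0] = (unsigned char)i;   /* change the input on every pass */
        copy(dst, src, size);
        sink ^= dst[size - 1];       /* consume the result of the copy */
    }
    end = clock();

    free(src);
    free(dst);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

    <div class="moz-cite-prefix">Calling e.g. time_memcpy(64, 100000000)
      from main() and printing the result, once with builtin and once
      with native vcruntime140, should give comparable numbers for
      small copies. The sizes and iteration counts worth testing depend
      on the workload.</div>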
    <div class="moz-cite-prefix"><br>
    </div>
    <div class="moz-cite-prefix">On 8/14/20 11:27, <a class="moz-txt-link-abbreviated" href="mailto:piotr@codeweavers.com">piotr@codeweavers.com</a>
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:2ac10925-2605-47d1-af97-95325b7b4e89@email.android.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="auto">
        <div dir="auto">Hi Fabian,
          <div dir="auto"><br>
          </div>
          <div dir="auto">I'll be back from vacation on Monday
            (currently I have very limited internet access). I'll look
            into it then.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">I'm not sure how complicated the assembly
            implementation is, but I expect that a separate
            assembly file will not be needed. Also, AFAIK, we can't take
            the implementation from glibc. It would also be useful to
            know how efficient the Microsoft implementation is.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Musl also has platform-specific
            implementations of memmove (for i386 and x64) written in
            assembly. I bet they should be good enough for Wine.</div>
          <div dir="auto"><br>
          </div>
          <div dir="auto">Thanks,</div>
          <div dir="auto">Piotr</div>
        </div>
        <div><br>
          <div class="elided-text">On Aug 12, 2020 23:33, Fabian Maurer
            <a class="moz-txt-link-rfc2396E" href="mailto:dark.shadow4@web.de">&lt;dark.shadow4@web.de&gt;</a> wrote:<br type="attribution">
            <blockquote style="margin:0 0 0 0.8ex;border-left:1px #ccc
              solid;padding-left:1ex">
              <p dir="ltr">Hello,
                <br>
                <br>
                since msvcrt isn't relying on the standard library
                memmove/memcpy anymore,
                <br>
                there's been a pretty bad performance regression. See
                <a class="moz-txt-link-freetext" href="https://bugs.winehq.org/show_bug.cgi?id=49663">https://bugs.winehq.org/show_bug.cgi?id=49663</a>.
                <br>
                <br>
                For the best performance, and since those memory
                operations are pretty common,
                <br>
                we'd presumably like to optimize them as much as
                possible. You might have seen
                <br>
                my patch for an implementation from musl, although
                Zebediah rightfully pointed
                <br>
                out we might want to opt for the best performance we can
                get...
                <br>
                glibc currently offers the best performance, thanks to
                SSE/AVX implementations
                <br>
                and runtime selection of the best supported path.
                <br>
                <br>
                First, would you have any objections adding specialized
                paths written in
                <br>
                assembly for x86?
                <br>
                And if we were to add them, would we link against
                assembly files, or somehow
                <br>
                transform them into inline assembly? AFAIK, Wine doesn't
                come with pure
                <br>
                assembly files yet...
                <br>
                <br>
                If you want, I could set up a few crude benchmarks to
                see how different
                <br>
                versions compare.
                <br>
                <br>
                Regards,
                <br>
                Fabian Maurer
                <br>
                <br>
                <br>
              </p>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>