[PATCH] ntdll: Optimize memcpy for x86-64.
Rémi Bernon
rbernon at codeweavers.com
Wed Mar 30 12:58:08 CDT 2022
On 3/30/22 19:46, Gabriel Ivăncescu wrote:
> On 30/03/2022 19:33, Rémi Bernon wrote:
>> On 3/30/22 17:27, Jinoh Kang wrote:
>>> On 3/23/22 10:33, Elaine Lefler wrote:
>>>> Signed-off-by: Elaine Lefler <elaineclefler at gmail.com>
>>>> ---
>>>>
>>>> New vectorized implementation improves performance by up to 65%.
>>>
>>> MSVCRT has one. Maybe deduplicate?
>>>
>> IIUC upstream isn't very interested in assembly-optimized routines,
>> unless really necessary.
>>
>> The msvcrt implementation was probably necessary because it's often
>> called by apps, and needs to be as optimal as possible, but I'm not
>> sure ntdll memcpy is used so much. Maybe for realloc though, in which
>> case it might be useful indeed.
>>
>> I think an unrolled version, like the one done for memset, should already
>> give good results and should work portably (though I got bitten with
>> memset already, and I wasn't very keen on trying again with memcpy so
>> soon).
>>
>
> Why not just copy-paste it from msvcrt, since it's already done?
Copying the C version? Sure, why not, though looking at it, it feels
unnecessarily complex.
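
For illustration only, here is a rough sketch of what a portable unrolled
copy in plain C could look like. The name unrolled_memcpy is hypothetical,
and this is neither the msvcrt code nor the memset change mentioned above.
It also assumes the compiler inlines the fixed-size memcpy calls as plain
loads and stores, which needs care inside a library that itself provides
memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only. Non-overlapping buffers
 * are assumed, as with the standard memcpy. */
static void *unrolled_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Main loop: four 8-byte loads followed by four 8-byte stores per
     * iteration. Fixed-size memcpy is the portable way to express an
     * unaligned access; compilers typically lower it to a single move. */
    while (n >= 32)
    {
        uint64_t w0, w1, w2, w3;
        memcpy(&w0, s, 8);
        memcpy(&w1, s + 8, 8);
        memcpy(&w2, s + 16, 8);
        memcpy(&w3, s + 24, 8);
        memcpy(d, &w0, 8);
        memcpy(d + 8, &w1, 8);
        memcpy(d + 16, &w2, 8);
        memcpy(d + 24, &w3, 8);
        s += 32;
        d += 32;
        n -= 32;
    }

    /* Remaining 8-byte words (at most three). */
    while (n >= 8)
    {
        uint64_t w;
        memcpy(&w, s, 8);
        memcpy(d, &w, 8);
        s += 8;
        d += 8;
        n -= 8;
    }

    /* Byte tail (at most seven bytes). */
    while (n--) *d++ = *s++;

    return dst;
}
```

Whether something like this gets close to the assembly version would have
to be measured; no numbers are claimed here.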
--
Rémi Bernon <rbernon at codeweavers.com>