[PATCH 3/4] ntdll: Force virtual memory allocation order.
Paul Gofman
pgofman at codeweavers.com
Wed Aug 19 11:21:24 CDT 2020
On 8/19/20 18:59, Ken Thomases wrote:
> On Aug 19, 2020, at 10:06 AM, Paul Gofman <pgofman at codeweavers.com> wrote:
>> On 8/19/20 17:56, Gabriel Ivăncescu wrote:
>>> Sorry for the late reply, but I was a bit interested in this patch (which doesn't seem to have gone anywhere yet?).
>>>
>>> I think this should also fix bug 38668.
>> I am frankly not quite sure how to move forward with this. I've got a comment from Rémi who rightfully considered the change to big for a one time change. While I do agree with that in principle, I don't see the way how to simplify the change without introducing a more complicated and ugly state in between (I have wrote some more details in the previous e-mail).
>>
>> I also received a comment from Alexandre in the chat that he is sceptical about the idea because it is getting too complicated.
>>
>> For now I don't see any other substantially different and good way to solve the bunch of related bugs. Besides the referenced bugs, it is also known to affect Mortal Kombat 11 and Star Wars: Battlefron II for now.
>>
>> So I am not sure how to proceed from here. Should we just leave these bugs, or is there some direction of how to do it simpler which I am missing, or should I maybe try harder to structure the code in some more clear way?
> I haven't paid really close attention. Is this just about doing bottom-up allocation?
This is the motivation, yeah, to force allocation order (whether
TOP_DOWN is requested or bottom up if not), as a bunch of games assumes
the memory pointers they get from VirtualAlloc() (without _TOP_DOWN flag
of course) to be in some limited range. Some (like MK11) actually want
the pointer to be within pre-Win8 user space area limit (44 bits), some
other (like AION) do assume they get the pointers from lower space, e.
g., AION was fine with pointers in ~16GB range. But it is not clear how
to force even 44 bit limit in a simple way (which won't completely solve
the issue anyway), as simply reducing the user_space_limit leads to
allocating of TBs of virtual memory in reserved space before mmap()
starts returning pointers from the range we need.
> For the non-MAP_FIXED case, if you supply a non-zero address to mmap(), don't most systems take that as a hint and try to allocate there?
Yes, they do. We never try to allocate the space where Wine has
allocated it previously, but mmap's called from host libraries are out
of our control.
> If they can't, do they then search from there for the closest available free region?
Linux doesn't, just allocates the same way as no hint was given (bottom up).
There is /proc/sys/vm/legacy_va_layout which, when turned on, will
trigger the use of legacy VM allocator in kernel which was bottom up.
That could potentially solve the problem with the games (while _TOP_DOWN
allocations won't be de-facto supported that is probably less of an
issue). But I would beware switching the system to some many years old
unused allocator, and also that requires root privileges.
>
> macOS behaves like that. Not sure about other platforms. (Then again, macOS has the Mach APIs which allow enumerating the free/allocated regions pretty directly. See the Mac-specific implementation of reserve_area() in libs/wine/mmap.c.)
>
> -Ken
>
There is already the code in Wine which allocates the memory within
non-reserved areas when zero_bits parameter is given to
NtAllocateVirtualMemory() (and some apps are known to depend on this
parameter to work correctly). This code gets an idea which areas are
supposed to be free by surfing through the views, and then does search
the found free area with the same function used in my patches. That has
a performance impact as there is typically much more views than gaps
between then (now when allocations are 64k aligned) and no quick way to
find the free areas (FWIW, Linux kernel uses just the view tree when
searches for a free VM block, but it maintains the info in view which
holds the max block size to the side of the view which allows for binary
search). But this has limited performance impact because zero_bits is
used rarely, switching all the allocations to work this way has
prohibitive performance impact.
There is also the code in Wine which allocates space within reserved
areas by using free block list (maintained for reserved areas only).
My patch combines these two type of allocations by extending free block
tracking for all views allocation, and handling whether the space should
be allocated in reserved area or checked for conflicts with native
mappings in alloc_area_in_reserved_or_between_callback(). The last patch
also "lazily" marks natively mapped area as free which eliminates
performance impact previously noticed without this.
More information about the wine-devel
mailing list