ntdll: Fixed some heap allocation stalls

Steaphan Greene steaphan at gmail.com
Sat Nov 3 11:55:54 CDT 2012


On 11/03/2012 09:04 AM, Matteo Bruni wrote:
> 2012/11/2 Steaphan Greene<steaphan at gmail.com>:
>> Running a game in wine showed it performing terribly.  I traced this to the
>> fact that it allocates and deallocates tiny memory chunks over and over (I
>> suspect it's in C++ and passing things by value everywhere).  This led to
>> huge stalls because the heap bins weren't fine-grained enough (these
>> differed in size in steps of 8 bytes, but the bins differed by 16+, so it
>> spent a lot of time searching each bin for a bigger block).  I added more
>> fine-grained sizes to the smaller end of this, and now it runs faster in
>> wine than it does natively. :)
>>
>> This was run on Debian squeeze, amd64.
>>
>> Note, this is my first submission to wine in nearly 15 years.  So, of
>> course, everything has changed with how this works now.  Hope I have this
>> all right.
>>
>> ---
>>   dlls/ntdll/heap.c |    4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/dlls/ntdll/heap.c b/dlls/ntdll/heap.c
>> index a9044714..eb7406b 100644
>> --- a/dlls/ntdll/heap.c
>> +++ b/dlls/ntdll/heap.c
>> @@ -116,7 +116,9 @@ C_ASSERT( sizeof(ARENA_LARGE) % LARGE_ALIGNMENT == 0 );
>>   /* Max size of the blocks on the free lists */
>>   static const SIZE_T HEAP_freeListSizes[] =
>>   {
>> -    0x10, 0x20, 0x30, 0x40, 0x60, 0x80, 0x100, 0x200, 0x400, 0x1000, ~0UL
>> +    0x10, 0x18, 0x20, 0x28, 0x30, 0x38, 0x40, 0x48, 0x50, 0x58, 0x60, 0x68,
>> +    0x70, 0x78, 0x80, 0x88, 0x90, 0x98, 0xA0, 0xA8, 0xB0, 0xB8, 0xC0, 0xC8,
>> +    0xD0, 0xD8, 0xE0, 0E88, 0xF0, 0xF8, 0x100, 0x200, 0x400, 0x1000, ~0UL
>>   };
>>   #define HEAP_NB_FREE_LISTS  (sizeof(HEAP_freeListSizes)/sizeof(HEAP_freeListSizes[0]))
> That 0E88 looks quite wrong ;)
>
> Apart from that, although I'm not an expert for this code, this patch
> looks reasonable to me. Maybe we don't want so many free lists, but I
> don't see big downsides for that (e.g. the linear search can be
> replaced by a binary search, if need be). Maybe adding a smaller list
> at the start (e.g. 0x8) makes sense too?

Yep, that's a typo (it should be 0xE8).  Don't know how that got past me.
Sorry.  Should I resend a corrected version?

0x08 was left off because these values include the overhead of the arena
info, so even the smallest request size of 8 bytes actually searches for a
block larger than 8 (12, I think).
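
To illustrate the point about 0x08, here is a minimal sketch of the
arithmetic.  The overhead constant and both helper names are hypothetical
and are not taken from heap.c; the value 4 is chosen only to match the
"12, I think" guess above.

```c
#include <stddef.h>

/* ASSUMPTION: per-block arena overhead in bytes.  The thread only says the
 * searched size for an 8-byte request is "12, I think"; 4 matches that
 * guess and is not taken from heap.c. */
#define ASSUMED_ARENA_OVERHEAD 4

/* Size actually compared against the free list bounds (hypothetical helper). */
static size_t searched_size(size_t request)
{
    return request + ASSUMED_ARENA_OVERHEAD;
}

/* Returns 1 if a list whose upper bound is bin_max could hold a block that
 * satisfies the smallest request, 0 otherwise. */
static int bin_reachable(size_t bin_max, size_t smallest_request)
{
    return searched_size(smallest_request) <= bin_max;
}
```

Under that assumption, bin_reachable(0x08, 8) is 0 while
bin_reachable(0x10, 8) is 1, so a 0x08 list would simply never be used.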

I did try with fewer lists (a step of 16 instead of 8) and, though that
was still a dramatic improvement over the original, it was still slow.
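
A rough sketch of why the step size matters, and of the binary search
mentioned in the review.  This is not the heap.c code: the function names
are made up for illustration, the "every 16" table is only my
reconstruction of that experiment (the exact table isn't in this thread),
and the last table is the patch with the 0E88 typo corrected to 0xE8.

```c
#include <stddef.h>

/* Original table from the diff: upper bound of each free list. */
static const size_t old_sizes[] = {
    0x10, 0x20, 0x30, 0x40, 0x60, 0x80, 0x100, 0x200, 0x400, 0x1000, ~(size_t)0
};

/* ASSUMPTION: shape of the "every 16" experiment, reconstructed. */
static const size_t step16_sizes[] = {
    0x10, 0x20, 0x30, 0x40, 0x50, 0x60, 0x70, 0x80, 0x90, 0xA0, 0xB0, 0xC0,
    0xD0, 0xE0, 0xF0, 0x100, 0x200, 0x400, 0x1000, ~(size_t)0
};

/* Proposed table from the patch, with 0E88 corrected to 0xE8. */
static const size_t step8_sizes[] = {
    0x10, 0x18, 0x20, 0x28, 0x30, 0x38, 0x40, 0x48, 0x50, 0x58, 0x60, 0x68,
    0x70, 0x78, 0x80, 0x88, 0x90, 0x98, 0xA0, 0xA8, 0xB0, 0xB8, 0xC0, 0xC8,
    0xD0, 0xD8, 0xE0, 0xE8, 0xF0, 0xF8, 0x100, 0x200, 0x400, 0x1000, ~(size_t)0
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Linear search: index of the first list whose bound is >= size.
 * The last entry is a catch-all, so it is never compared. */
static size_t bin_linear(const size_t *sizes, size_t count, size_t size)
{
    size_t i;
    for (i = 0; i < count - 1; i++)
        if (size <= sizes[i]) break;
    return i;
}

/* Binary search for the same index (lower bound over a sorted table). */
static size_t bin_binary(const size_t *sizes, size_t count, size_t size)
{
    size_t lo = 0, hi = count - 1;   /* hi starts at the catch-all entry */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (size <= sizes[mid]) hi = mid;   /* answer is mid or earlier */
        else lo = mid + 1;                  /* answer is after mid */
    }
    return lo;
}
```

With old_sizes, blocks of 0x48 and 0x58 bytes land on the same list, and
with the step-16 table 0x48 and 0x50 still share one, so a search for the
larger size has to walk past the smaller blocks.  With step8_sizes each of
those sizes gets its own list.  bin_binary returns the same index as
bin_linear, in case the longer table makes the linear scan a concern.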

-- 
Steaphan Greene<steaphan at gmail.com>



