[PATCH v3 2/2] gdi32: Use a lazy-init lookup cache when looking up RGB values for a color table.

Gabriel Ivăncescu gabrielopcode at gmail.com
Fri Apr 16 07:36:52 CDT 2021


On 16/04/2021 13:17, Huw Davies wrote:
> On Thu, Apr 08, 2021 at 04:11:04PM +0300, Gabriel Ivăncescu wrote:
>> Signed-off-by: Gabriel Ivăncescu <gabrielopcode at gmail.com>
>> ---
>>   dlls/gdi32/dibdrv/primitives.c | 65 +++++++++++++++++++++++++---------
>>   1 file changed, 49 insertions(+), 16 deletions(-)
>>
>> diff --git a/dlls/gdi32/dibdrv/primitives.c b/dlls/gdi32/dibdrv/primitives.c
>> index 0a5f7e5..be68058 100644
>> --- a/dlls/gdi32/dibdrv/primitives.c
>> +++ b/dlls/gdi32/dibdrv/primitives.c
>> @@ -3497,22 +3497,48 @@ static void convert_to_16(dib_info *dst, const dib_info *src, const RECT *src_re
>>       }
>>   }
>>   
>> -static inline BOOL color_tables_match(const dib_info *d1, const dib_info *d2)
>> +/*
>> + * To lookup RGB values into nearest color in the color table, Windows uses 5-bits of the RGB
>> + * at the "center" of the RGB cube, presumably to do a similar lookup cache. The lowest 3 bits
>> + * of the color are thus set to halfway (0x04) and then it's used in the distance calculation
>> + * to the exact color in the color table. We exploit this as well to create a lookup cache.
>> +*/
>> +struct rgb_lookup_colortable_ctx
>> +{
>> +    const dib_info *dib;
>> +    BYTE map[32768];
>> +    BYTE valid[32768 / 8];
>> +};
>> +
>> +static void rgb_lookup_colortable_init(const dib_info *dib, struct rgb_lookup_colortable_ctx *ctx)
>>   {
>> -    if (!d1->color_table || !d2->color_table) return (!d1->color_table && !d2->color_table);
>> -    return !memcmp(d1->color_table, d2->color_table, (1 << d1->bit_count) * sizeof(d1->color_table[0]));
>> +    ctx->dib = dib;
>> +    memset(ctx->valid, 0, sizeof(ctx->valid));
>>   }
>>   
>> -static inline DWORD rgb_lookup_colortable(const dib_info *dst, BYTE r, BYTE g, BYTE b)
>> +static inline BYTE rgb_lookup_colortable(struct rgb_lookup_colortable_ctx *ctx, BYTE r, BYTE g, BYTE b)
>>   {
>> -    /* Windows reduces precision to 5 bits, probably in order to build some sort of lookup cache */
>> -    return rgb_to_pixel_colortable( dst, (r & ~7) + 4, (g & ~7) + 4, (b & ~7) + 4 );
>> +    unsigned pos = (r >> 3) | (g & ~7) << 2 | (b & ~7) << 7;
>> +
>> +    if (!(ctx->valid[pos / 8] & (1 << pos % 8)))
>> +    {
>> +        ctx->valid[pos / 8] |= 1 << pos % 8;
>> +        ctx->map[pos] = rgb_to_pixel_colortable(ctx->dib, (r & ~7) + 4, (g & ~7) + 4, (b & ~7) + 4);
>> +    }
>> +    return ctx->map[pos];
>> +}
> 
> I've sent in v4 of this series in which I've tweaked this a bit
> to give the compiler more of a chance to optimize things as well
> as using a lookup for the pixel masks.
> 
> With a 300x300 32-bpp -> 8-bpp BitBlt I'm getting performance slightly
> better than Windows if there aren't many distinct RGB values and
> a little worse if the entire map needs filling.
> 
> Huw.
> 

Looks good, thanks.
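
For anyone reading this thread later, the lazy-init cache from the quoted
v3 patch can be sketched as a standalone C fragment. This is only a minimal
illustration, not the code that went into Wine: struct rgb, struct
lookup_ctx, nearest_index(), lookup_init(), lookup() and the misses counter
are all hypothetical names, and nearest_index() is an assumed stand-in
(plain squared-distance search) for rgb_to_pixel_colortable(). Only the
index packing, the valid bitmap, and the (x & ~7) + 4 cell centre are taken
from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

typedef unsigned char BYTE;          /* stand-in for the Windows type */

struct rgb { BYTE r, g, b; };        /* hypothetical palette entry */

struct lookup_ctx                    /* hypothetical; mirrors the patch's ctx layout */
{
    const struct rgb *palette;
    unsigned count;                  /* number of palette entries, 1..256 */
    BYTE map[32768];                 /* one palette index per 5:5:5 cell */
    BYTE valid[32768 / 8];           /* one bit per cell: has map[] been filled? */
    unsigned misses;                 /* demo only: how often the slow path ran */
};

/* Assumed stand-in for rgb_to_pixel_colortable(): linear nearest-colour search. */
static BYTE nearest_index(const struct rgb *pal, unsigned count, BYTE r, BYTE g, BYTE b)
{
    unsigned i, best = 0;
    int best_dist = INT_MAX;

    assert(count >= 1 && count <= 256);
    for (i = 0; i < count; i++)
    {
        int dr = r - pal[i].r, dg = g - pal[i].g, db = b - pal[i].b;
        int dist = dr * dr + dg * dg + db * db;
        if (dist < best_dist) { best_dist = dist; best = i; }
    }
    return (BYTE)best;
}

/* Only the bitmap is cleared; map[] entries are filled on first use. */
static void lookup_init(struct lookup_ctx *ctx, const struct rgb *pal, unsigned count)
{
    ctx->palette = pal;
    ctx->count = count;
    ctx->misses = 0;
    memset(ctx->valid, 0, sizeof(ctx->valid));
}

static BYTE lookup(struct lookup_ctx *ctx, BYTE r, BYTE g, BYTE b)
{
    /* Pack the top 5 bits of each channel into a 15-bit index:
     * r in bits 0-4, g in bits 5-9, b in bits 10-14. */
    unsigned pos = (r >> 3) | (g & ~7u) << 2 | (b & ~7u) << 7;

    if (!(ctx->valid[pos / 8] & (1u << (pos % 8))))
    {
        ctx->valid[pos / 8] |= (BYTE)(1u << (pos % 8));
        /* Search once per cell, using the cell centre (low 3 bits set to 4). */
        ctx->map[pos] = nearest_index(ctx->palette, ctx->count,
                                      (BYTE)((r & ~7) + 4), (BYTE)((g & ~7) + 4),
                                      (BYTE)((b & ~7) + 4));
        ctx->misses++;
    }
    return ctx->map[pos];
}
```

A caller would run lookup_init() once per conversion and lookup() once per
source pixel. Clearing only the 4 KiB bitmap keeps setup cheap, and the
full palette search runs at most once per distinct 5:5:5 cell, which fits
the observation above that the cost depends on how many distinct RGB values
the source contains.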


