[PATCH v2 4/5] wined3d: Optimize scanning changed shader constants in wined3d_device_apply_stateblock().

Henri Verbeet hverbeet at gmail.com
Fri Feb 21 11:18:17 CST 2020


On Fri, 21 Feb 2020 at 01:11, Matteo Bruni <mbruni at codeweavers.com> wrote:
> +/* Count is the total number of bits in the bitmap (i.e. it doesn't depend on start). */
> +static unsigned int wined3d_bitmap_ffs(const uint32_t *bitmap, unsigned int start, unsigned int count)
> +{
One way to make that more obvious would be to move the "count"
parameter after "bitmap" instead of "start", and call it something
like "bit_count".

> +    mask = start % word_bit_count ? ~((1u << (start - 1) % word_bit_count) - 1) : 0xffffffffu;
"mask = ~0u << (start % word_bit_count);", right?
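
To spell out what the simplified expression computes, here is a minimal sketch. The helper name sketch_first_word_mask() is hypothetical and not part of the patch. It assumes "start" is the first bit to be considered, so the mask keeps bits at or above start % word_bit_count and clears the ones below. Note that the expression in the patch evaluates to 0xffffffff for start == 1, which would still let bit 0 through.

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): mask for the word containing
 * "start", keeping only the bits at or above "start" within that word. */
static uint32_t sketch_first_word_mask(unsigned int start)
{
    const unsigned int word_bit_count = sizeof(uint32_t) * 8;

    /* The shift amount is always in [0, 31], so the shift is well defined. */
    return (uint32_t)(~0u << (start % word_bit_count));
}
```

For a start on a word boundary the shift is 0 and the mask is all ones, so no special case is needed.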

> +    while (!(map = *ptr & mask))
> +    {
> +        if (++ptr == end)
> +            return ~0u;
> +        mask = ~0u;
> +    }
Since the mask only does something on the first iteration, how about
the following:

    map = *ptr & (~0u << (start % word_bit_count));
    while (!map)
    {
        if (++ptr == end)
            return ~0u;
        map = *ptr;
    }

> +    return (ptr - bitmap) * word_bit_count + wined3d_bit_scan(&map);
> +}
This may not be a problem in practice, but note that this can
potentially return a value >= "count" if "count" is not a multiple of
"word_bit_count".
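
As a rough illustration of one way to guard against that, the sketch below puts the pieces together and clamps the result against the bit count. It is not the patch's code: the sketch_ names are hypothetical, sketch_bit_scan() is only a stand-in for wined3d_bit_scan() (assumed to return the index of the lowest set bit and clear it), and the parameter order follows the "bit_count" suggestion above.

```c
#include <stdint.h>

/* Stand-in for wined3d_bit_scan(): returns the index of the lowest set
 * bit in *x and clears that bit. Assumes *x is non-zero. */
static unsigned int sketch_bit_scan(uint32_t *x)
{
    unsigned int i = 0;

    while (!(*x & ((uint32_t)1 << i)))
        ++i;
    *x &= *x - 1;
    return i;
}

/* Find the first set bit at or after "start"; ~0u if there is none. */
static unsigned int sketch_bitmap_ffs(const uint32_t *bitmap,
        unsigned int bit_count, unsigned int start)
{
    const unsigned int word_bit_count = sizeof(*bitmap) * 8;
    const uint32_t *ptr, *end;
    unsigned int ret;
    uint32_t map;

    if (start >= bit_count)
        return ~0u;

    ptr = bitmap + start / word_bit_count;
    end = bitmap + (bit_count + word_bit_count - 1) / word_bit_count;

    map = *ptr & (~0u << (start % word_bit_count));
    while (!map)
    {
        if (++ptr == end)
            return ~0u;
        map = *ptr;
    }

    ret = (unsigned int)(ptr - bitmap) * word_bit_count + sketch_bit_scan(&map);
    /* Bits in the last word beyond bit_count are not part of the bitmap. */
    return ret < bit_count ? ret : ~0u;
}
```

Whether the final comparison is worth its cost depends on whether callers ever pass a count that is not a multiple of the word size.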

> +static unsigned int wined3d_bitmap_ffz(const uint32_t *bitmap, unsigned int start, unsigned int count)
> +{
...
> +    while (!(map = ~*ptr & mask))
So this line is the main difference with wined3d_bitmap_ffs().
Assuming it wouldn't have any adverse performance effects, that could
be unified by replacing "~*ptr" with "*ptr ^ xor_mask", with
"xor_mask" being 0 for wined3d_bitmap_ffs() and ~0u for
wined3d_bitmap_ffz().
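
A minimal sketch of that unification, assuming a shared helper that takes the xor mask as a parameter. All sketch_ names are hypothetical, and the lowest-bit loop merely stands in for wined3d_bit_scan().

```c
#include <stdint.h>

/* Shared scan: xor_mask is 0 to look for set bits, ~0u to look for
 * clear bits. Returns ~0u if nothing is found before the last word ends. */
static unsigned int sketch_scan_xor(const uint32_t *bitmap,
        unsigned int bit_count, unsigned int start, uint32_t xor_mask)
{
    const unsigned int word_bit_count = sizeof(*bitmap) * 8;
    const uint32_t *ptr = bitmap + start / word_bit_count;
    const uint32_t *end = bitmap + (bit_count + word_bit_count - 1) / word_bit_count;
    unsigned int bit = 0;
    uint32_t map;

    if (start >= bit_count)
        return ~0u;

    map = (*ptr ^ xor_mask) & (~0u << (start % word_bit_count));
    while (!map)
    {
        if (++ptr == end)
            return ~0u;
        map = *ptr ^ xor_mask;
    }

    /* Index of the lowest set bit in map. */
    while (!(map & ((uint32_t)1 << bit)))
        ++bit;
    return (unsigned int)(ptr - bitmap) * word_bit_count + bit;
}

static unsigned int sketch_ffs_xor(const uint32_t *bitmap,
        unsigned int bit_count, unsigned int start)
{
    return sketch_scan_xor(bitmap, bit_count, start, 0);
}

static unsigned int sketch_ffz_xor(const uint32_t *bitmap,
        unsigned int bit_count, unsigned int start)
{
    return sketch_scan_xor(bitmap, bit_count, start, ~0u);
}
```

With the mask passed as a constant from the two thin wrappers, a compiler may well fold the xor away after inlining, but that would need to be checked rather than assumed.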

> +    wined3d_apply_shader_constants(device, NULL, changed->vs_consts_f, WINED3D_MAX_VS_CONSTS_F,
> +            (void *)state->vs_consts_f, sizeof(*state->vs_consts_f),
> +            (wined3d_state_shader_constant_setter)wined3d_device_set_vs_consts_f);
>
This works, but is a little messy. How do you feel about the following:

    struct wined3d_map_range range;
...
    for (start = 0; ; start = range.offset)
    {
        if (!wined3d_bitmap_get_range(state->vs_consts_f,
                WINED3D_MAX_VS_CONSTS_F, start, &range))
            break;
        wined3d_device_set_vs_consts_f(device, range.offset,
                range.size, &state->vs_consts_f[range.offset]);
    }

We could conceivably also introduce some kind of
WINED3D_BITMAP_FOR_EACH_RANGE macro, although I suspect it may not be
worth it.
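
For illustration only, here is one possible shape for such a range helper and the loop around it. Everything named sketch_ is hypothetical. The semantics are an assumption: the helper reports the next run of consecutive set bits at or after "start", and the caller resumes at range.offset + range.size. It walks the bitmap bit by bit for clarity; a real version would presumably be built on the ffs/ffz helpers.

```c
#include <stdint.h>

/* Hypothetical counterpart of struct wined3d_map_range. */
struct sketch_map_range
{
    unsigned int offset;
    unsigned int size;
};

static int sketch_test_bit(const uint32_t *bitmap, unsigned int idx)
{
    return (bitmap[idx / 32] >> (idx % 32)) & 1u;
}

/* Find the next run of set bits at or after "start".
 * Returns 0 if there is none, 1 otherwise with *range filled in. */
static int sketch_bitmap_get_range(const uint32_t *bitmap, unsigned int bit_count,
        unsigned int start, struct sketch_map_range *range)
{
    unsigned int first = start, last;

    while (first < bit_count && !sketch_test_bit(bitmap, first))
        ++first;
    if (first >= bit_count)
        return 0;

    for (last = first; last < bit_count && sketch_test_bit(bitmap, last); ++last)
        ;

    range->offset = first;
    range->size = last - first;
    return 1;
}

/* Caller-side loop: collect up to "max" ranges into out[], return the count.
 * In the patch this is where the constant setter would be called instead. */
static unsigned int sketch_collect_ranges(const uint32_t *bitmap,
        unsigned int bit_count, struct sketch_map_range *out, unsigned int max)
{
    struct sketch_map_range range;
    unsigned int start, n = 0;

    for (start = 0; n < max; start = range.offset + range.size)
    {
        if (!sketch_bitmap_get_range(bitmap, bit_count, start, &range))
            break;
        out[n++] = range;
    }
    return n;
}
```

A FOR_EACH_RANGE style macro would essentially wrap the for statement above, which is part of why it may not buy much.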


