[PATCH 1/5] d3dx9: Avoid casting each value separately in set_constants().

Fri Jun 9 14:14:49 CDT 2017

2017-06-09 20:55 GMT+02:00 Paul Gofman <gofmanp at gmail.com>:
> On 06/09/2017 08:49 PM, Matteo Bruni wrote:
>>
>>
>>> +{
>>> +    unsigned int i;
>>> +    const float *in_bool = in;
>>> +    int *out_int = out;
>>> +
>>> +    for (i = 0; i < count; ++i)
>>> +        out_int[i] = !!in_bool[i];
>>
>> Is the double negation necessary here (or above)?
>
> In the current implementation such a conversion ended up in get_bool() from
> utils.c, which did "*(DWORD *)data != 0" for BOOLs also. I did not want to
> introduce side changes in this purely optimization patch.

Sounds fair.

>>> +    {
>>> +        {NULL,               NULL, pres_float_to_int,
>>> pres_value_to_bool},
>>> +        {NULL,               NULL, NULL,              NULL},
>>> +        {pres_int_to_float,  NULL, NULL,
>>> pres_value_to_bool},
>>> +        {pres_bool_to_float, NULL, pres_bool_to_int,  NULL}
>>> +    };
>>> +    enum pres_value_type table_type = table_info[table].type;
>>> +
>>> +    if (table_type == param_type)
>>> +    {
>>> +        regstore_set_values(rs, table, in, offset, count);
>>> +        return;
>>> +    }
>>
>> IIUC in this case regstore_set_values() will have in == out. That's a
>> problem since it means memcpy()ing the data onto itself, which is
>> undefined. It's also unnecessary, regstore_set_modified() would be
>> enough, but only if in == out which is something the current
>> regstore_set_data() interface doesn't enforce (and indeed patch 2/5
>> doesn't respect).
>
> Oh yes, I actually want regstore_set_modified() here. In the next patches
> (which I did not send yet) where I get rid of modified bitmasks entirely
> this just goes away completely.

That would break in patch 2/5 (if it isn't updated accordingly)
though, as I mentioned.

>>>           for (element = 0; element < const_set->element_count;
>>> ++element)
>>>           {
>>> +            unsigned int *out = (unsigned int *)rs->tables[table] +
>>> start_offset;
>>
>> Aside from the issue I mentioned above, this works only because all
>> the formats we're interesting in converting from / to happen be 4
>> bytes wide. It's somewhat assumed in a bunch of places already but
>> adding more isn't too exciting. I'd like at least an assert checking
>> for the precondition, if possible.
>
> It was assumed exactly the same way before this patch, there was 'unsigned
> int out' variable which was holding conversion result for any type, so this
> patch does not change nothing.

No big difference in practice but it's quite a bit more awkward in
that you're temporarily casting e.g. the float table to unsigned int,
then finally converting data to float, in place.

> I can add an assertion of course, but this is
> a sort of static assert. Can I do it as #if ... #error, or place it in
> initialization? Don't you think it will look a bit ugly to check the size of
> the list of predefined types in the code? Or am I misunderstanding entirely
> how do you mean to assert this?

The assertion I envision would check that both the source and the
destination data type have size == sizeof(unsigned int), which is what
you're depending on. It could be a static assert (see e.g.
C_ASSERT()), in principle.

>>> @@ -1083,23 +1155,15 @@ static void set_constants(struct d3dx_regstore
>>> *rs, struct d3dx_const_tab *const
>>>                           WARN("Parameter data is too short, name %s,
>>> component %u.\n", debugstr_a(param->name), i);
>>>                           break;
>>>                       }
>>> -
>>> -                    in = (unsigned int *)data + param_offset;
>>> -                    switch (table_type)
>>> -                    {
>>> -                        case PRES_VT_FLOAT: set_number(&out,
>>> D3DXPT_FLOAT, in, param->type); break;
>>> -                        case PRES_VT_INT: set_number(&out, D3DXPT_INT,
>>> in, param->type); break;
>>> -                        case PRES_VT_BOOL: set_number(&out, D3DXPT_BOOL,
>>> in, param->type); break;
>>> -                        default:
>>> -                            FIXME("Unexpected type %#x.\n",
>>> table_info[table].type);
>>> -                            break;
>>> -                    }
>>> -                    regstore_set_values(rs, table, &out, offset, 1);
>>> +                    out[offset] = data[param_offset];
>>
>> I think separating the matrix transposing / "reshaping" from the data
>> conversion is okay, if that allows further changes and optimizations
>> like patch 2/5. You might want to add a small comment around here to
>> clarify the split.
>
> Well, it is already a huge optimization even without further patches, as it
> removes switch and function call for each value being set, and does this
> just once for a set of values. Should I add some comment like /* Do the
> conversion once for the array of values. */?

In that point of the code the "interesting" bit to note is that you're
storing the "reshaped", but not converted, data temporarily in the
final storage. Which isn't nice either way, but at least a comment
reduces the surprise factor.

>>>                   }
>>>               }
>>>               start_offset += get_offset_reg(table,
>>> const_set->register_count);
>>>               data += param->rows * param->columns;
>>>           }
>>> +        start_offset = get_offset_reg(table, const_set->register_index);
>>> +        regstore_set_data(rs, table, start_offset, (unsigned int
>>> *)rs->tables[table] + start_offset,
>>> +                get_offset_reg(table, const_set->register_count) *
>>> const_set->element_count, param_type);
>>
>> I understand that you want to avoid to use some separate storage for
>> the "to be converted" data but, like above, this regstore table
>> pointer cast makes me a bit uncomfortable.
>>
> Or yeah, I really want to avoid extra storage for conversion, it adds extra
> memory access and affects performance a bit (I had such variant as a first
> time implementation). Same as above, it is essentially the same as it was
> before, just previously it was a single "unsigned int out;" while now it is
> an array. Maybe then I will add runtime check (or preprocessor check) in
> initialization that all the supported types really have the same size as
> 'unsigned int'?

Yeah, that's the "assert" I want. Not sure what's the best form for
that, it doesn't necessarily have to be an assert().