[PATCH vkd3d v5 4/6] vkd3d-shader/hlsl: Perform a copy propagation pass.

Zebediah Figura zfigura at codeweavers.com
Tue Nov 16 11:01:06 CST 2021


On 11/16/21 3:04 AM, Giovanni Mascellani wrote:
> Hi,
> 
> On 15/11/21 15:27, Giovanni Mascellani wrote:
>> +static void copy_propagation_set_value(struct copy_propagation_variable *variable, unsigned int offset,
>> +        unsigned char writemask, struct hlsl_ir_node *node)
>> +{
>> +    static const char swizzle_letters[] = {'x', 'y', 'z', 'w'};
>> +
>> +    unsigned int i;
>> +
>> +    for (i = 0; i < 4; ++i)
>> +    {
>> +        if (writemask & (1u << i))
>> +        {
>> +            TRACE("Variable %s[%d] is written by instruction %p.%c.\n",
>> +                    variable->var->name, offset + i, node, swizzle_letters[i]);
>> +            variable->values[offset + i].node = node;
>> +            variable->values[offset + i].component = i;
>> +        }
>> +    }
>> +}
> 
> After playing with the code a little more, I don't think that's correct
> any more. What we call the "writemask" in struct hlsl_ir_store seems
> more like a "write-swizzle", right? The correct code would be:
> 
> j = 0;
> for (i = 0; i < 4; ++i)
> {
>       if (writemask & (1u << i))
>       {
>           ...
>           variable->values[offset + i].component = j++;
>       }
> }
> 
> In other words, if we are storing with a writemask .y, we mean that the
> first register of the source goes in the second register of the
> destination, not that the second register of the source goes in the
> second register of the destination. Otherwise the code generated by the
> constructor float4(1.0, 2.0, 3.0, 4.0) would be wrong:
> 
> trace:hlsl_dump_function:    2:      float | 1.00000000e+00
> trace:hlsl_dump_function:    3:      float | 2.00000000e+00
> trace:hlsl_dump_function:    4:      float | 3.00000000e+00
> trace:hlsl_dump_function:    5:      float | 4.00000000e+00
> trace:hlsl_dump_function:    6:            | = (<constructor-0>.x @2)
> trace:hlsl_dump_function:    7:            | = (<constructor-0>.y @3)
> trace:hlsl_dump_function:    8:            | = (<constructor-0>.z @4)
> trace:hlsl_dump_function:    9:            | = (<constructor-0>.w @5)
> trace:hlsl_dump_function:   10:     float4 | <constructor-0>
> 
> Does any of this make sense?
> 
> Giovanni.
> 

Yes, I believe that is correct. I'm surprised it didn't end up breaking 
any tests...
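
For reference, the corrected mapping can be sketched as a small standalone
program fragment. This is only an illustration, not the actual vkd3d code:
struct tracked_value, set_value(), MAX_COMPONENTS and the integer node_id are
hypothetical stand-ins for the structures in the patch. The point it shows is
that the source component is counted separately (j) from the destination
component (i), so a store with writemask .y reads component 0 of the source.

```c
#include <assert.h>

#define MAX_COMPONENTS 16 /* hypothetical size of the tracked array */

/* Hypothetical stand-in for one tracked component of a variable. */
struct tracked_value
{
    int node_id;            /* stand-in for a struct hlsl_ir_node pointer */
    unsigned int component; /* component of the source node that is read */
};

/* Record which source component ends up in each written destination
 * component. The destination index follows the writemask bit (i), while
 * the source index simply counts the bits set so far (j). */
static void set_value(struct tracked_value *values, unsigned int offset,
        unsigned char writemask, int node_id)
{
    unsigned int i, j = 0;

    assert(offset + 4 <= MAX_COMPONENTS);

    for (i = 0; i < 4; ++i)
    {
        if (writemask & (1u << i))
        {
            values[offset + i].node_id = node_id;
            values[offset + i].component = j++;
        }
    }
}
```

With this, the store "<constructor-0>.y @3" from the trace records component
0 of node 3 for the second component of the variable, which is what a scalar
source requires.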


