HLSL offsetting

Zebediah Figura zfigura at codeweavers.com
Fri Jun 10 16:13:13 CDT 2022

On 6/10/22 14:13, Francisco Casas wrote:
> So, as Matteo summarized, we are between 2 main options:
> a) Multiple register offsets.
> b) Component offsets with structured dereference info.
> How I see it, (a) changes hlsl_deref to:
> ---
> enum register_set {
>      /* ... add more as needed, to cover for all SMs. */
> };
> struct hlsl_deref
> {
>      struct hlsl_ir_var *var;
>      struct hlsl_src offset[HLSL_REGSET_COUNT];
> };
> ---
> Also, the types' reg_size becomes reg_size[HLSL_REGSET_COUNT], and so do 
> field offsets. Many functions have to receive an additional register_set 
> argument or an array of offsets instead of a single offset.
> On the other hand, the version of (b) I imagine changes hlsl_deref to:
> ---
> struct hlsl_deref
> {
>      struct hlsl_ir_var *var;
>      unsigned int route_len;
>      struct hlsl_src *route;
> };
> ---
> Where route is intended to be a variable size array of component 
> offsets. 

I was envisioning something more explicit, but this is simpler, so my 
guess is that this is what we want.
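
To make the "route" idea concrete, here is a minimal sketch of how a
route of component offsets could be followed through a type. Everything
prefixed with sketch_ is a hypothetical stand-in, not the compiler's
actual API, and the route entries are plain constant indices rather
than struct hlsl_src, so that the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the compiler's types. */
enum sketch_type_kind { SKETCH_SCALAR, SKETCH_VECTOR, SKETCH_ARRAY, SKETCH_STRUCT };

struct sketch_type
{
    enum sketch_type_kind kind;
    unsigned int count;                      /* vector width, array length or field count */
    const struct sketch_type *elem;          /* element type of a vector or array */
    const struct sketch_type *const *fields; /* field types of a struct */
};

/* A dereference in the style of option (b); constant indices only (assumption). */
struct sketch_deref
{
    const struct sketch_type *var_type;
    unsigned int route_len;
    const unsigned int *route;
};

/* Follow the route one step at a time and return the type that is reached.
 * Returns NULL if an index is out of range or the route continues past a scalar. */
static const struct sketch_type *sketch_deref_type(const struct sketch_deref *deref)
{
    const struct sketch_type *type;
    unsigned int i;

    assert(deref && deref->var_type);
    type = deref->var_type;

    for (i = 0; i < deref->route_len; ++i)
    {
        unsigned int idx = deref->route[i];

        if (type->kind == SKETCH_SCALAR || idx >= type->count)
            return NULL;
        type = (type->kind == SKETCH_STRUCT) ? type->fields[idx] : type->elem;
    }
    return type;
}
```

For a variable of type struct { float a; float4 b[3]; }, the route
{1, 2, 0} would name field 1, array element 2, vector component 0. Note
that no register size or alignment is involved at this level.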

> It would make sense to remove reg_size from the types and also 
> field offsets. 

Yes, absolutely.

> Functions that cannot receive structs or arrays may 
> receive a "flattened" component offset that can then be translated into 
> a route, other functions would require the route as an array.

Not immediately sure what functions you're thinking of, but I imagine 
things like hlsl_compute_component_offset() would now have to translate 
the offset into an array.
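
As a rough illustration of that translation, the sketch below turns a
"flattened" component index into a route. The sketch_ types and
function names are hypothetical and simplified; this is not how
hlsl_compute_component_offset() is written, only one way the
translation could look.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type description (not the compiler's own). */
enum sketch_type_kind { SKETCH_SCALAR, SKETCH_VECTOR, SKETCH_ARRAY, SKETCH_STRUCT };

struct sketch_type
{
    enum sketch_type_kind kind;
    unsigned int count;                      /* vector width, array length or field count */
    const struct sketch_type *elem;          /* element type of a vector or array */
    const struct sketch_type *const *fields; /* field types of a struct */
};

/* Number of scalar components contained in a type. */
static unsigned int sketch_component_count(const struct sketch_type *type)
{
    unsigned int i, n = 0;

    switch (type->kind)
    {
        case SKETCH_SCALAR:
            return 1;
        case SKETCH_VECTOR:
            return type->count;
        case SKETCH_ARRAY:
            return type->count * sketch_component_count(type->elem);
        case SKETCH_STRUCT:
            for (i = 0; i < type->count; ++i)
                n += sketch_component_count(type->fields[i]);
            return n;
    }
    return 0;
}

/* Translate a flat component index into a route of per-level indices.
 * Returns the route length, or -1 if the index is out of range or the
 * caller's buffer is too small. */
static int sketch_route_from_component(const struct sketch_type *type,
        unsigned int index, unsigned int *route, unsigned int max_len)
{
    unsigned int len = 0;

    assert(type && route);
    if (index >= sketch_component_count(type))
        return -1;

    while (type->kind != SKETCH_SCALAR)
    {
        unsigned int step = 0, n;

        if (len == max_len)
            return -1;

        if (type->kind == SKETCH_STRUCT)
        {
            /* Skip whole fields that lie before the requested component. */
            while (index >= (n = sketch_component_count(type->fields[step])))
            {
                index -= n;
                ++step;
            }
            type = type->fields[step];
        }
        else
        {
            /* Vector or array: every element has the same component count. */
            n = sketch_component_count(type->elem);
            step = index / n;
            index %= n;
            type = type->elem;
        }
        route[len++] = step;
    }
    return (int)len;
}
```

With struct { float a; float4 b[3]; }, flat component 12 is the last
component of b[2], so the resulting route would be {1, 2, 3}.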

> At this point I can see the benefits of (b) over (a), but also several 
> complications that may arise (you have pointed out most of them):
> - We will have to translate many things that are already in terms of 
> register offsets into component offsets.

How many things, though? As far as I can see it's:

- copy-prop

- copy splitting

- hlsl_offset_from_deref() [which is part of the point of the whole 

That's pretty much it.

> - We will have to move all optimization passes (like vectorization) that 
> require register offsets to their specific SMxIR, RA too.

I think we want to do RA per backend anyway. (It kind of already is 
per-backend, in a very awkward way.)

Vectorization is the main downside but it's possible that we wanted that 
to be per-backend too.

> - We will have to do the proper translation to register offsets, on each 
> SMxIR level, and probably some sort of constant folding for them.

Sure, but in a sense this is the point. The translation is 
backend-specific (what with alignment and register sets and all) and we 
should structure things accordingly.

hlsl_offset_from_deref() will need constant folding in a sense, but 
it'll be a pretty restricted form thereof.
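
To show what such a restricted form of constant folding might look
like, here is a sketch of a backend-side translation from a route to a
register offset. The layout rule used (every array element and struct
field starts on a fresh four-component register) is an assumption
chosen for illustration, not the actual rule of any shader model, and
all sketch_ names are hypothetical. Constant indices are folded into a
single constant; each non-constant index is kept as an (expression,
stride) term.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TERMS 4

enum sketch_type_kind { SKETCH_SCALAR, SKETCH_VECTOR, SKETCH_ARRAY, SKETCH_STRUCT };

struct sketch_type
{
    enum sketch_type_kind kind;
    unsigned int count;                      /* vector width, array length or field count */
    const struct sketch_type *elem;
    const struct sketch_type *const *fields;
};

/* One route entry: a constant index, or an id naming a non-constant expression. */
struct sketch_index { int is_const; unsigned int value; };

/* Result: folded constant part plus one term per non-constant index. */
struct sketch_offset
{
    unsigned int constant; /* in components */
    unsigned int term_count;
    struct { unsigned int expr_id, stride; } terms[SKETCH_MAX_TERMS];
};

static unsigned int sketch_align4(unsigned int x) { return (x + 3) & ~3u; }

/* Size in components under the assumed layout rule. */
static unsigned int sketch_reg_size(const struct sketch_type *type)
{
    unsigned int i, size = 0;

    switch (type->kind)
    {
        case SKETCH_SCALAR: return 1;
        case SKETCH_VECTOR: return type->count;
        case SKETCH_ARRAY:  return type->count * sketch_align4(sketch_reg_size(type->elem));
        case SKETCH_STRUCT:
            for (i = 0; i < type->count; ++i)
                size += sketch_align4(sketch_reg_size(type->fields[i]));
            return size;
    }
    return 0;
}

/* Returns 0 on success, -1 on an invalid route. */
static int sketch_offset_from_route(const struct sketch_type *type,
        const struct sketch_index *route, unsigned int route_len, struct sketch_offset *out)
{
    unsigned int i, j, stride;

    assert(type && out);
    out->constant = 0;
    out->term_count = 0;

    for (i = 0; i < route_len; ++i)
    {
        if (type->kind == SKETCH_SCALAR)
            return -1;
        if (type->kind == SKETCH_STRUCT)
        {
            /* Field indices must be constant; fold the field offset. */
            if (!route[i].is_const || route[i].value >= type->count)
                return -1;
            for (j = 0; j < route[i].value; ++j)
                out->constant += sketch_align4(sketch_reg_size(type->fields[j]));
            type = type->fields[route[i].value];
            continue;
        }
        stride = (type->kind == SKETCH_VECTOR) ? 1 : sketch_align4(sketch_reg_size(type->elem));
        if (route[i].is_const)
        {
            if (route[i].value >= type->count)
                return -1;
            out->constant += route[i].value * stride;
        }
        else
        {
            if (out->term_count == SKETCH_MAX_TERMS)
                return -1;
            out->terms[out->term_count].expr_id = route[i].value;
            out->terms[out->term_count].stride = stride;
            ++out->term_count;
        }
        type = type->elem;
    }
    return 0;
}
```

A different backend would only need to swap sketch_reg_size() and the
stride computation; the route itself stays the same.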

> - Once we start supporting non-constant offsets, we may also want to 
> introduce a common subexpression elimination pass for the register 
> offset expressions that will arise (currently, the creation of common 
> expressions is mainly avoided by the recursive structure of the split 
> passes).

What cases are you thinking of that would want CSE?

> Solely because I have spent a considerable amount of time implementing 
> option (a) (and some time giving up on implementing component offsets as 
> a single hlsl_src, instead of a path) I am rushing (a) to see how the 
> patch turns out in the end, before trying (b).
> I do think that (a) can be cleaned up one step at a time. Even if the 
> register sizes and offsets depend on the SM, we can write an interface 
> that keeps the rest of the code from using them directly, and gradually 
> migrate the code that does so to use this interface instead.
> But so far, yeah, I am being convinced that (b) is better, albeit more 
> difficult.
> If we do (b), I suggest we try to do it in separate steps, with some 
> "scaffolding code":
> - Add the component offset route to hlsl_deref without deleting the 
> register offset.
> - Create a simple pass that initializes the register offset field using 
> the route in all the hlsl_derefs.
> - Translate all the parse code first to work with the routes, and apply 
> the pass just after parsing.
> - Translate some more compilation passes and move the translation pass 
> forward.
> - Repeat until all the passes that can be written in terms of component 
> offsets are.
> - Write the SMxIRs and the SMxIR translations.
> - Only then, remove the register offset from hlsl_deref and the 
> translation pass from the codebase.
> Otherwise we may end up writing a very big patch that may take too long 
> to complete (!).

Ech, I think that trying to do two things at once is going to be more 
confusing than the alternative. I also don't think there's *that* much 
code that we'd have to change, i.e. a monolithic patch wouldn't be that bad.

For that matter, I'd be happy to try writing those patches myself :-)

