HLSL offsetting
Zebediah Figura
zfigura at codeweavers.com
Fri Jun 10 16:13:13 CDT 2022
On 6/10/22 14:13, Francisco Casas wrote:
> So, as Matteo summarized, we are between 2 main options:
>
> a) Multiple register offsets.
> b) Component offsets with structured dereference info.
>
> How I see it, (a) changes hlsl_deref to:
>
> ---
> enum register_set {
>     HLSL_REGSET_OBJ,
>     HLSL_REGSET_NUM,
>     /* ... add more as needed, to cover for all SMs. */
>
>     HLSL_REGSET_COUNT,
> };
>
> struct hlsl_deref
> {
>     struct hlsl_ir_var *var;
>     struct hlsl_src offset[HLSL_REGSET_COUNT];
> };
> ---
>
> Also, the types' reg_size becomes reg_size[HLSL_REGSET_COUNT], and so do
> field offsets. Many functions have to receive an additional register_set
> argument or an array of offsets instead of a single offset.
>
>
> On the other hand, the version of (b) I imagine changes hlsl_deref to:
> ---
> struct hlsl_deref
> {
>     struct hlsl_ir_var *var;
>     unsigned int route_len;
>     struct hlsl_src *route;
> };
> ---
> Where route is intended to be a variable size array of component
> offsets.
I was envisioning something more explicit, but this is simpler, so my
guess is that this is what we want.
> It would make sense to remove reg_size from the types and also
> field offsets.
Yes, absolutely.
> Functions that cannot receive structs or arrays may
> receive a "flattened" component offset that can then be translated into
> a route, other functions would require the route as an array.
Not immediately sure what functions you're thinking of, but I imagine
things like hlsl_compute_component_offset() would now have to translate
the offset into an array.
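
As an illustration only, here is a minimal sketch of that translation from a flattened component offset into a route. It assumes a heavily simplified type model: struct sketch_type, sketch_component_count() and sketch_route_from_index() are hypothetical names, not the actual vkd3d-shader definitions, and matrices are treated like any other numeric type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type model; not the real struct hlsl_type. */
enum sketch_kind { SKETCH_NUMERIC, SKETCH_ARRAY, SKETCH_STRUCT };

struct sketch_type
{
    enum sketch_kind kind;
    unsigned int count;                      /* Components, elements or fields. */
    const struct sketch_type *elem;          /* SKETCH_ARRAY only. */
    const struct sketch_type *const *fields; /* SKETCH_STRUCT only. */
};

/* Total number of scalar components contained in a type. */
static unsigned int sketch_component_count(const struct sketch_type *type)
{
    unsigned int i, total = 0;

    switch (type->kind)
    {
        case SKETCH_NUMERIC:
            return type->count;
        case SKETCH_ARRAY:
            return type->count * sketch_component_count(type->elem);
        case SKETCH_STRUCT:
            for (i = 0; i < type->count; ++i)
                total += sketch_component_count(type->fields[i]);
            return total;
    }
    return 0;
}

/* Translate a flattened component index into a route, one index per nesting
 * level. Returns the route length, or 0 if the index is out of range or the
 * route does not fit into "capacity" entries. */
static size_t sketch_route_from_index(const struct sketch_type *type,
        unsigned int index, unsigned int *route, size_t capacity)
{
    unsigned int i, size;
    size_t len = 0;

    if (index >= sketch_component_count(type))
        return 0;

    for (;;)
    {
        if (len == capacity)
            return 0;

        switch (type->kind)
        {
            case SKETCH_NUMERIC:
                /* Innermost level: the remaining index is the component. */
                route[len++] = index;
                return len;

            case SKETCH_ARRAY:
                size = sketch_component_count(type->elem);
                route[len++] = index / size;
                index %= size;
                type = type->elem;
                break;

            case SKETCH_STRUCT:
                /* Skip whole fields until the index falls inside one. */
                for (i = 0; i < type->count; ++i)
                {
                    size = sketch_component_count(type->fields[i]);
                    if (index < size)
                        break;
                    index -= size;
                }
                route[len++] = i;
                type = type->fields[i];
                break;
        }
    }
}
```

With this model, for a type like struct { float4 a; float3 b[2]; }, the flattened index 9 would come out as the three-entry route {1, 1, 2}: field 1, element 1, component 2.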
> At this point I can see the benefits of (b) over (a), but also several
> complications that may arise (you have pointed out most of them):
> - We will have to translate many things that are already in terms of
> register offsets into component offsets.
How many things, though? As far as I can see it's:
- copy-prop
- copy splitting
- hlsl_offset_from_deref() [which is part of the point of the whole
exercise]
That's pretty much it.
> - We will have to move all optimization passes (like vectorization) that
> require register offsets to their specific SMxIR, RA too.
I think we want to do RA per backend anyway. (It kind of already is
per-backend, in a very awkward way.)
Vectorization is the main downside but it's possible that we wanted that
to be per-backend too.
> - We will have to do the proper translation to register offsets, on each
> SMxIR level, and probably some sort of constant folding for them.
Sure, but in a sense this is the point. The translation is
backend-specific (what with alignment and register sets and all) and we
should structure things accordingly.
hlsl_offset_from_deref() will need constant folding in a sense, but
it'll be a pretty restricted form thereof.
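
To make the "restricted constant folding" concrete, here is a rough sketch, not a proposal for the real hlsl_offset_from_deref(). It walks a route, folds every constant index into a single base register offset, and records each non-constant array index together with its stride. All names are hypothetical, and the size rule (one register per numeric type, aggregates as the sum of their parts) is an assumption made for the example; real rules differ per shader model and register set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type model; not the real struct hlsl_type. */
enum sketch_kind { SKETCH_NUMERIC, SKETCH_ARRAY, SKETCH_STRUCT };

struct sketch_type
{
    enum sketch_kind kind;
    unsigned int count;                      /* Components, elements or fields. */
    const struct sketch_type *elem;          /* SKETCH_ARRAY only. */
    const struct sketch_type *const *fields; /* SKETCH_STRUCT only. */
};

/* One route step: a constant index, or an index only known at run time. */
struct sketch_step
{
    bool is_const;
    unsigned int value; /* Only meaningful if is_const. */
};

#define SKETCH_MAX_DYNAMIC 8

struct sketch_offset
{
    unsigned int base; /* Folded constant part, in registers. */
    unsigned int dynamic_count;
    struct
    {
        size_t step;         /* Position in the route of the dynamic index. */
        unsigned int stride; /* Registers per unit of that index. */
    } dynamic[SKETCH_MAX_DYNAMIC];
};

/* Assumed backend rule: a numeric type takes one register, and aggregates
 * take the sum of their parts. */
static unsigned int sketch_reg_size(const struct sketch_type *type)
{
    unsigned int i, size = 0;

    switch (type->kind)
    {
        case SKETCH_NUMERIC:
            return 1;
        case SKETCH_ARRAY:
            return type->count * sketch_reg_size(type->elem);
        case SKETCH_STRUCT:
            for (i = 0; i < type->count; ++i)
                size += sketch_reg_size(type->fields[i]);
            return size;
    }
    return 0;
}

/* Returns false if the route does not match the type or has too many
 * dynamic indices. Component indices are not range-checked here. */
static bool sketch_offset_from_route(const struct sketch_type *type,
        const struct sketch_step *route, size_t len, struct sketch_offset *out)
{
    unsigned int j;
    size_t i;

    out->base = 0;
    out->dynamic_count = 0;

    for (i = 0; i < len; ++i)
    {
        switch (type->kind)
        {
            case SKETCH_NUMERIC:
                /* A component stays within one register; must be the last step. */
                return i + 1 == len;

            case SKETCH_ARRAY:
                if (route[i].is_const)
                {
                    if (route[i].value >= type->count)
                        return false;
                    out->base += route[i].value * sketch_reg_size(type->elem);
                }
                else
                {
                    if (out->dynamic_count == SKETCH_MAX_DYNAMIC)
                        return false;
                    out->dynamic[out->dynamic_count].step = i;
                    out->dynamic[out->dynamic_count].stride = sketch_reg_size(type->elem);
                    ++out->dynamic_count;
                }
                type = type->elem;
                break;

            case SKETCH_STRUCT:
                /* Field selection is always a constant. */
                if (!route[i].is_const || route[i].value >= type->count)
                    return false;
                for (j = 0; j < route[i].value; ++j)
                    out->base += sketch_reg_size(type->fields[j]);
                type = type->fields[route[i].value];
                break;
        }
    }
    return true;
}
```

Under these assumptions a backend would emit the base as an immediate and only generate arithmetic for the recorded dynamic terms, which is why nothing like a general constant folding pass is needed at this point.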
> - Once we start supporting non-constant offsets, we may also want to
> introduce a common subexpression elimination pass for the registers
> offsets expressions that will arise (currently, the creation of common
> expressions is mainly avoided by the recursive structure of the split
> passes).
What cases are you thinking of that would want CSE?
> Solely because I have spent a considerable amount of time implementing
> option (a) (and some time giving up on implementing component offsets as
> a single hlsl_src, instead of a path) I am rushing (a) to see how the
> patch turns out in the end, before trying (b).
>
> I do think that (a) can be cleansed one step at a time. Even if the
> register sizes and offsets depend on the SM, we can write an interface
> to abstract the rest of the code from using them directly, and gradually
> migrate the code that does to use this interface instead.
>
>
> But so far, yeah, I am being convinced that (b) is better, albeit more
> difficult.
> If we do (b), I suggest we try to do it in separate steps, with some
> "scaffolding code":
>
> - Add the component offset route to hlsl_deref without deleting the
> register offset.
> - Create a simple pass that initializes the register offset field using
> the route in all the hlsl_derefs.
> - Translate all the parse code first to work with the routes, and apply
> the pass just after parsing.
> - Translate some more compilation passes and move the translation pass
> forward.
> - Repeat until all the passes that can be written in terms of component
> offsets are.
> - Write the SMxIRs and the SMxIR translations.
> - Only then, remove the register offset from hlsl_deref and the
> translation pass from the codebase.
>
>
> Otherwise we may end up writing a very big patch that may take too long
> to complete (!).
Ech, I think that trying to do two things at once is going to be more
confusing than the alternative. I also don't think there's *that* much
code that we'd have to change, i.e. a monolithic patch wouldn't be that bad.
For that matter, I'd be happy to try writing those patches myself :-)
ἔρρωσθε,
Zeb