[PATCH vkd3d 2/5] vkd3d-shader/hlsl: Perform a copy propagation pass.

Thu Nov 11 04:26:08 CST 2021

On Wed, Nov 10, 2021 at 5:33 PM Zebediah Figura (she/her)
<zfigura at codeweavers.com> wrote:
>
> On 11/10/21 03:57, Giovanni Mascellani wrote:
> > Hi,
> >
> > On 09/11/21 23:21, Zebediah Figura wrote:
> >> Overall this patch looks a lot better than I was afraid of. It's a lot
> >> of code that's intimidating to review, but once you ignore the rbtree
> >> boilerplate it's simple enough and seems about in line with what I
> >> expect. There's quite a few things that I think can be simplified, but
> >> the basic structure seems sound, so nice work.
> >
> > Great, that's good! Thanks for your review, I think most if not all of
> > the points you wrote should be easy to fix.
> >
> >> The CF part is probably going to be a lot trickier, but fortunately I
> >> think that this patch alone is the one that really matters. Frankly,
> >> if we were to take this patch, and then a second patch that does
> >> copy-prop on interior CF blocks with a fresh copy_propagation_state(),
> >> we might even cover enough that it's not even worrying about CF...
> >
> > Mmh, note that you have to handle control flow in some way or another,
> > you cannot just ignore it. This patch doesn't know about control flow,
> > but the only way it can do it is by bailing out at the first control
> > flow instruction in the program. It's not enough to ignore the control
> > flow block and resume processing after it, because the inner block might
> > invalidate some of your knowledge about the program state (in 3/5 patch
> > this is done by copy_propagation_invalidate_from_block).
>
> Right. I should have said to invalidate the whole copyprop state both
> when entering a CF block and when leaving one. The point is that we can
> do copyprop within every BB, not just the first one, without having to
> worry about things like partial invalidation.

Yeah, that would probably be good enough for our purposes for a long time.

> >>> +/* struct copy_propagation_state represents the accumulated knowledge
> >>> + * of the copy propagation pass while it scans through the code. Field
> >>> + * "variables" is a tree whose elements have type struct
> >>> + * copy_propagation_varible, and represent each of the variables the
> >>> + * pass has already encountered (except those with special semantics,
> >>> + * which are ignored). For each variable, the array "values" (whose
> >>> + * length is the register size of the variable) represent which node
> >>> + * and which index inside that node (i.e., which of the at most four
> >>> + * entries of a vector) provided that value last time. Field "node"
> >>> + * can be NULL, meaning that the pass was not able to statically
> >>> + * determine the node.
> >>> + */
> >>
> >> This comment feels a bit too low-level? I dunno, comments like this
> >> are hard to review, but my inclination is to describe what you're
> >> doing at a high level, along the lines of "we track the last known
> >> value of each component of each variable", and let the code itself
> >> explain the details.
> >
> > Personally I would have loved to meet comments like mine in other areas
> > of Wine: knowing what data read or written by an algorithm are meant to
> > represent makes it much easier to understand why the algorithm does what
> > it does. I agree that this is not the most complicated case around, but
> > I thought this would be helpful for someone not already knowing my code.
> > OTOH I understand that also a high-level description is useful: would
> > you consider it satisfying to have both?
>
> Eh, maybe. I would at least start with the high-level explanation. When
> I read the above comment my eyes kind of glaze over, and it doesn't tell
> me anything that the code already doesn't (even the struct definitions
> alone tell me as much without checking how they're used.)

Same, really.

> >>> +            if (TRACE_ON())
> >>> +            {
> >>> +                char buf[32];
> >>> +                if (!node)
> >>> +                    sprintf(buf, "(nil)");
> >>
> >> Can that happen?
> >
> > Not yet, but it will as soon as loops and conditionals enter the scene,
> > because they can cause the last store to a variable to become unknown
> > (at compilation time).
>
> Ah, of course, that makes sense.
>
> >
> > Now, I know that usually we don't like to have dead code around. I think
> > this case could be accepted, since this code is just a line long and
> > becomes alive at 3/5. Also, I find it easier to review this piece of
> > code at once, instead of first reviewing a part, checking that the
> > missing case will never happen and then review the missing case and
> > checking that it fits well in the already-reviewed part.
> >
> > But if you really don't like it, I can defer the NULL case to 3/5.
>
> Yeah, even here I find it easier to review if this code is moved to 3/5.

Agreed, if it's not a big hassle I'd prefer the same.

> >
> >>> +                else if (node->index)
> >>> +                    sprintf(buf, "@%u", node->index);
> >>> +                else
> >>> +                    sprintf(buf, "%p", node);
> >>
> >> Can that happen?
> >
> > Definitely. Arguably, it might be the common case, given that the
> > earlier optimization passes (including this pass itself, given that it
> > might be executed more than once) might synthesize a lot of nodes, which
> > won't have an index until compute_liveness is executed.
> >
> Ah. I assumed that you would actually be indexing right before copyprop,
> and didn't check whether that was actually the case.
>
> Although on reflection now I'm not sure if we *should* index. We only
> dump the IR once, at the end, and I feel like we risk confusion by
> printing instruction indices that won't match. Hmm, maybe the code makes
> sense as it is...

FWIW I have a local patch dumping the initial IR and sometimes I add
more for intermediate steps while reviewing.
I'd be in favor of forcing reindexing in hlsl_dump_function() itself,
except that would be a bit of an unexpected side effect from calling a
debug function.

IIRC NIR does have some kind of "metadata lifetime management" thingie
where each transformation pass flags if some kind of metadata is not
valid anymore after it's done its job. The next pass will
automatically get the relevant metadata regenerated if it's going to
use it. Not saying that we necessarily need something like that but
let's keep the idea in mind.

> > (speaking of constant expressions in the code, I am a bit bothered that
> > evaluate_array_dimension is not able to figure out the value of constant
> > expressions that are not immediate constants, and unfortunately fixing
> > that is not as easy as running a fold_constants pass; I don't in mind a
> > clean solution for that that doesn't require re-implementing constant
> > folding there; that's another matter, of course)
>
> Yeah, that one is worrying...
>
> I think we might have to reimplement constant folding. Probably we can
> at least use a common helper, though.

Is it that hard? I guess I should try to do it, on the off-chance that
it isn't :D