[PATCH 4/5] d3d10/effect: Support stream output declaration when creating geometry shaders.
Matteo Bruni
matteo.mystral at gmail.com
Tue Aug 31 17:35:50 CDT 2021
On Tue, Aug 31, 2021 at 3:57 PM Nikolay Sivov <nsivov at codeweavers.com> wrote:
>
>
>
> On 8/31/21 4:32 PM, Matteo Bruni wrote:
> > On Mon, Aug 30, 2021 at 7:07 AM Nikolay Sivov <nsivov at codeweavers.com> wrote:
> >> Signed-off-by: Nikolay Sivov <nsivov at codeweavers.com>
> >> ---
> >> dlls/d3d10/effect.c | 172 ++++++++++++++++++++++++++++++++++++++++++--
> >> 1 file changed, 166 insertions(+), 6 deletions(-)
> > Patch is okay, a few notes below.
> >
> >> diff --git a/dlls/d3d10/effect.c b/dlls/d3d10/effect.c
> >> index 113125e9cbd..c3c98a0da31 100644
> >> --- a/dlls/d3d10/effect.c
> >> +++ b/dlls/d3d10/effect.c
> >> @@ -566,6 +592,127 @@ static HRESULT get_fx10_shader_resources(struct d3d10_effect_variable *v, const
> >> return S_OK;
> >> }
> >>
> >> +struct d3d10_effect_so_decl
> >> +{
> >> + D3D10_SO_DECLARATION_ENTRY *entries;
> >> + SIZE_T capacity;
> >> + SIZE_T count;
> >> + unsigned int stride;
> >> + char *decl;
> >> +};
> >> +
> >> +static void d3d10_effect_cleanup_so_decl(struct d3d10_effect_so_decl *so_decl)
> >> +{
> >> + heap_free(so_decl->entries);
> >> + heap_free(so_decl->decl);
> >> + memset(so_decl, 0, sizeof(*so_decl));
> >> +}
> >> +
> >> +static HRESULT d3d10_effect_parse_stream_output_declaration(const char *decl,
> >> + struct d3d10_effect_so_decl *so_decl)
> >> +{
> >> + static const char * allmask = "xyzw";
> > Usually in HLSL you can use "rgba" in place of "xyzw" for swizzles /
> > writemasks, it might be worth checking if those also work (and are
> > passed through to the effect bytecode unchanged).
> Yes, "rgba" also works apparently. Compiler preserves string as is, does
> not convert to coordinate mask names. I'll add a test for that.
> >
> >> + char *p, *ptr, *end, *next, *mask, *m, *slot;
> >> + unsigned int len = strlen(decl);
> >> + D3D10_SO_DECLARATION_ENTRY e;
> >> +
> >> + memset(so_decl, 0, sizeof(*so_decl));
> >> +
> >> + if (!(so_decl->decl = heap_alloc(len + 1)))
> >> + return E_OUTOFMEMORY;
> >> + memcpy(so_decl->decl, decl, len + 1);
> >> +
> >> + p = so_decl->decl;
> > I think it would be nice to have a small comment showing the expected
> > syntax of a stream out declaration (entry) somewhere around here. Just
> > something to quickly glance at while reading the following code.
> Okay.
> >
> >> +
> >> + while (p && *p)
> >> + {
> >> + memset(&e, 0, sizeof(e));
> >> +
> >> + end = strchr(p, ';');
> >> + next = end ? end + 1 : p + strlen(p);
> >> +
> >> + len = next - p;
> >> + if (end) len--;
> >> +
> >> + /* Remove leading and trailing spaces. */
> >> + while (len && isspace(*p)) { len--; p++; }
> >> + while (len && isspace(p[len - 1])) len--;
> >> +
> >> + p[len] = 0;
> >> +
> >> + /* Output slot */
> >> + if ((slot = strchr(p, ':')))
> >> + {
> >> + *slot = 0;
> >> +
> >> + ptr = p;
> >> + while (*ptr)
> >> + {
> >> + if (!isdigit(*ptr))
> >> + {
> >> + WARN("Invalid output slot %s.\n", debugstr_a(p));
> >> + goto failed;
> >> + }
> >> + ptr++;
> >> + }
> >> +
> >> + e.OutputSlot = atoi(p);
> >> + p = slot + 1;
> >> + }
> >> +
> >> + /* Mask */
> >> + if ((mask = strchr(p, '.')))
> >> + {
> >> + *mask = 0; mask++;
> >> +
> >> + if (!(m = strstr(allmask, mask)))
> >> + {
> >> + WARN("Invalid component mask %s.\n", debugstr_a(mask));
> >> + goto failed;
> >> + }
> >> +
> >> + e.StartComponent = m - allmask;
> >> + e.ComponentCount = strlen(mask);
> > Did you verify that it's illegal to have a wrongly ordered mask (e.g.
> > something like .wzyx)?
> > I expect that to be invalid but better to be sure.
> > Likewise for other assumptions involved here, like repeating the same
> > component twice.
> Yes, it is disallowed to break the order. API manifestation is that you
> specify start and count, which assumes it's always forward looking.
> Regarding error, it won't pass compiler check first, complaining about
> "invalid mask declaration". As far as I can tell that's the only thing
> that compiler checks in this string, you can have non-existent semantics
> as much as you like. That will break effect creation later though,
> presumably because shader object can't be created with invalid declaration.
Excellent. That should already do the right thing then.
> >
> > In general, some more tests with interesting stream out declarations
> > would be nice. I guess you have more planned after the test in patch
> > 5/5.
> I can think of adding more variations with spaces around, and maybe with
> multiple entries. Do you have other ideas?
Yes, that sounds good. I guess generally trying to exercise all the
edge cases of the parser.
I just noticed that there is probably one case that needs special
handling. From the MS docs:
'There is one special Semantic, labeled "$SKIP" which indicates an
empty semantic, leaving the corresponding memory in the stream out
buffer untouched. The $SKIP semantic cannot have a SemanticIndex, but
can have a Mask.'
It should probably be translated into a declaration entry with NULL
semantic name. See test_stream_output() in d3d10core/tests/d3d10core.c
for a few examples (both valid and not) if you haven't yet.
> >> + }
> >> + else
> >> + {
> >> + e.StartComponent = 0;
> >> + e.ComponentCount = 4;
> >> + }
> >> +
> >> + /* Semantic index and name */
> >> + len = strlen(p);
> >> + while (isdigit(p[len - 1]))
> >> + len--;
> >> +
> >> + if (p[len])
> >> + {
> >> + e.SemanticIndex = atoi(&p[len]);
> >> + p[len] = 0;
> >> + }
> >> +
> >> + e.SemanticName = p;
> >> +
> >> + if (!d3d_array_reserve((void **)&so_decl->entries, &so_decl->capacity, so_decl->count + 1,
> >> + sizeof(*so_decl->entries)))
> >> + goto failed;
> >> +
> >> + so_decl->entries[so_decl->count++] = e;
> >> +
> >> + if (e.OutputSlot == 0)
> >> + so_decl->stride += e.ComponentCount * sizeof(float);
> > This looks weird, but that's probably because the API for
> > ID3D10Device_CreateGeometryShaderWithStreamOutput() is confusing (just
> > one stride for up to 4 output buffers? What does it even mean?) and
> > neither the documentation nor our tests seem to clarify that. Any clue
> > what's supposed to happen with multiple output buffers?
> >
> > FWIW that function's signature makes a lot more sense in d3d11.
> My understanding is that in d3d10 you can specify stride only when you
> use one SO buffer. And if you use multiple buffers runtime will use
> default strides for all. For d3d11 you can set each explicitly.
Somehow I missed the relevant tests in d3d10core.c. It is still weird
but I guess not as mysterious now.
After playing with this a bit, it looks like it's actually more strict
than that: if there is just one buffer you need to pass a non-zero
stride; OTOH when there are multiple buffers you need to pass 0 as
stride. Please double check that it's the case (as I still don't trust
myself a lot on this) and consider sending a fix if necessary.
More information about the wine-devel
mailing list