[PATCH v2 1/2] wined3d: Use 3 component norm for 'nrm' opcode in GLSL backend.

Wed Jun 12 11:23:07 CDT 2019

On Wed, 12 Jun 2019 at 15:33, Paul Gofman <gofmanp at gmail.com> wrote:
> -    shader_addline(buffer, "tmp0.x = dot(%s, %s);\n",
> -            src_param.param_str, src_param.param_str);
> +    if (mask_size > 3)
> +        shader_addline(buffer, "tmp0.x = dot(vec3(%s), vec3(%s));\n",
> +                src_param.param_str, src_param.param_str);
> +    else
> +        shader_addline(buffer, "tmp0.x = dot(%s, %s);\n",
> +                src_param.param_str, src_param.param_str);
This is fine.

> -    if (mask_size > 1)
> +    if (mask_size == 4)
> +    {
> +        static const float max_float = FLT_MAX;
> +
> +        shader_addline(buffer, "tmp0.x == 0.0 ? vec4(vec3(0.0), sign(%s[3]) * ",
> +                src_param.param_str);
> +        shader_glsl_append_imm_vec(buffer, &max_float, 1, ins->ctx->gl_info);
> +        shader_addline(buffer, ") : (%s * inversesqrt(tmp0.x)));\n", src_param.param_str);
> +    }
> +    else if (mask_size > 1)
This seems like a separate change. I'm also not sure about the FLT_MAX
literal. I'd expect that you could achieve the same test results by
simply multiplying the .w component by the rsq of tmp0.x. (Under
d3d9's "zero wins" rules at least; there would be a potential NaN
under IEEE rules.)
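
To illustrate the difference between the two rule sets, here is a minimal
sketch in plain C. It assumes the rsq of a zero tmp0.x comes out as
+infinity, which GLSL does not guarantee for inversesqrt(0.0). Both
mul_ieee() and mul_zero_wins() are hypothetical helpers written for this
illustration; neither is part of wined3d.

```c
#include <math.h>

/* Plain IEEE 754 multiply: 0 * inf yields NaN. */
static float mul_ieee(float a, float b)
{
    return a * b;
}

/* Hypothetical model of d3d9 "zero wins" multiplication:
 * a zero operand forces a zero result, even against infinity. */
static float mul_zero_wins(float a, float b)
{
    if (a == 0.0f || b == 0.0f)
        return 0.0f;
    return a * b;
}
```

With a zero .w and an infinite rsq, mul_ieee(0.0f, INFINITY) is NaN while
mul_zero_wins(0.0f, INFINITY) is 0.0. With a non-zero .w both give an
infinity carrying the sign of .w.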