[PATCH v2 1/2] wined3d: Use 3 component norm for 'nrm' opcode in GLSL backend.
Henri Verbeet
hverbeet at gmail.com
Wed Jun 12 11:23:07 CDT 2019
On Wed, 12 Jun 2019 at 15:33, Paul Gofman <gofmanp at gmail.com> wrote:
> -    shader_addline(buffer, "tmp0.x = dot(%s, %s);\n",
> -            src_param.param_str, src_param.param_str);
> +    if (mask_size > 3)
> +        shader_addline(buffer, "tmp0.x = dot(vec3(%s), vec3(%s));\n",
> +                src_param.param_str, src_param.param_str);
> +    else
> +        shader_addline(buffer, "tmp0.x = dot(%s, %s);\n",
> +                src_param.param_str, src_param.param_str);
This is fine.
> -    if (mask_size > 1)
> +    if (mask_size == 4)
> +    {
> +        static const float max_float = FLT_MAX;
> +
> +        shader_addline(buffer, "tmp0.x == 0.0 ? vec4(vec3(0.0), sign(%s[3]) * ",
> +                src_param.param_str);
> +        shader_glsl_append_imm_vec(buffer, &max_float, 1, ins->ctx->gl_info);
> +        shader_addline(buffer, ") : (%s * inversesqrt(tmp0.x)));\n", src_param.param_str);
> +    }
> +    else if (mask_size > 1)
This seems like a separate change. I'm also not sure about the FLT_MAX
literal. I'd expect that you could achieve the same test results by
simply multiplying the .w component by the reciprocal square root (rsq)
of tmp0.x. (That holds under d3d9's "zero wins" rules at least; under
IEEE rules, 0 * inf would potentially produce a NaN.)