[Bug 17715] New: Incorrect translation of D3D asm instruction "expp"

Fri Mar 13 18:14:40 CDT 2009

http://bugs.winehq.org/show_bug.cgi?id=17715

           Summary: Incorrect translation of D3D asm instruction "expp"
           Product: Wine
           Version: 1.1.17
          Platform: PC-x86-64
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: directx-d3d
        AssignedTo: wine-bugs at winehq.org
        ReportedBy: liquid.acid at gmx.net
                CC: stefandoesinger at gmx.at

Hi there,

currently in wined3d (I only tested this in ARB mode, but GLSL mode might be
affected as well) the D3D assembler instruction "expp" is incorrectly translated
to ARBvp assembler language.

wined3d takes this D3D asm line "expp r3.y, r3" and translates it into "EXP
R3.y, R3;". Well, this isn't working.

EXP is of scalarop type, the first parameter being of "masked destination
register" type (masking is used in this case above) and the second parameter
(here lies the problem) is of type "scalar source register".

Sadly R3 is a vector :(
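
To make the operand rule concrete, here is a minimal Python sketch that checks
whether an ARB-style source operand is a valid scalar operand, i.e. carries
exactly one component suffix. The helper name and the regular expression are my
own assumptions for illustration and are not taken from wined3d or any GL
tooling.

```python
import re

# Hypothetical pattern for an ARB-style operand such as "R3", "-R3.w" or "c[85].x".
_OPERAND = re.compile(r"^-?[A-Za-z_][\w\[\]]*(?:\.([xyzw]+))?$")

def is_scalar_operand(operand):
    """Return True if the operand selects exactly one component (.x/.y/.z/.w)."""
    m = _OPERAND.match(operand.strip())
    if m is None:
        raise ValueError("unrecognised operand: %r" % operand)
    swizzle = m.group(1)
    # A scalar source needs an explicit single-component suffix.
    return swizzle is not None and len(swizzle) == 1
```

With this check, "R3" (as emitted for the line above) is rejected, while "R3.w"
is accepted.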

Well, let's add the full shader source here for completeness:
--------------------------------------------
vs_1_0
//D3DX8 Shader Assembler Version 0.91
mov r0, v0
add r1, r0, -c85
dp3 r1, r1, r1
rsq r1.x, r1.y
mov r3, r0
dp3 r3.w, r3, r3
rsq r3.w, r3.w
mul r7, r3, r3.w
mul r1.x, r1.x, r1.y
mad r2.x, r1.x, c86.x, c86.y
slt r4, r2.x, c0
mul r4, r4, -c83.w
max r2.x, r2.x, -r2.x
add r2.x, r4.z, r2.x
mul r3, r2.x, c83.y
expp r3.y, r3
mad r3, r3.y, c83.z, c83.w
mul r2, r3.x, r3.x
mul r3.y, r2.x, r3.x
mul r3.z, r2.x, r3.y
mul r3.w, r2.x, r3.z
dp4 r2, c82, r3
mul r2, r2, c86.z
mul r1.x, r1.x, c86.w
add r1.x, c1, -r1.x
max r1.x, c0, r1.x
mad r0.y, r2, r1.x, r0.y
dp4 oPos.x, r0, c2
dp4 oPos.y, r0, c3
dp4 oPos.z, r0, c4
dp4 oPos.w, r0, c5
mov oT0, v3
mov oT1, v3
mov oT2, v3
mov oT3, v3
mov oD0, c1
mov oFog.x, c0
--------------------------------------------

First of all, according to the current MSDN the instruction used above in the
source is NOT valid.

See this: http://msdn.microsoft.com/en-us/library/bb173373(VS.85).aspx

They explicitly state "expp dst, src.{x|y|z|w}" as the syntax.
They furthermore mention:
------------QUOTE---------------------------
src is a source register. Source register requires explicit use of replicate
swizzle, that is, exactly one of the .x, .y, .z, .w swizzle components (or the
.r, .g, .b, .a equivalents) must be specified.
------------UNQUOTE--------------------------

Well, this requirement on the src reg doesn't seem to hold for the shader
above, which passes an unswizzled r3.

Let's just take a look at the original D3D8 documentation ("D3DX8 Shader
Assembler Version 0.91" <- !!!).

To fully quote this:
------------------------------------------------------
expp

Provides exponential 2^x partial support.

Syntax:
expp   vDest, vSrc0

Registers:
vDest: Destination register, holding the result of the operation.
vSrc0: Source register, specifying the input argument.

Operation:
The following code fragment shows the operations performed by the expp
instruction to write a result to the destination.
    SetDestReg();
    SetSrcReg(0);

    float w = m_Source[0].w;
    float v = (float)floor(m_Source[0].w);

    m_TmpReg.x = (float)pow(2, v);
    m_TmpReg.y = w - v;

    // Reduced precision exponent
    float tmp = (float)pow(2, w);
    DWORD tmpd = *(DWORD*)&tmp & 0xffffff00;

    m_TmpReg.z = *(float*)&tmpd;
    m_TmpReg.w = 1;

    WriteResult();

Remarks:
The expp instruction produces undefined results if fed a negative value for the
exponent.
This instruction provides exponential base 2 partial precision. It generates an
approximate answer in vDest.z and allows for a more accurate determination of
vDest.x*function(vDest.y), where function is a user approximation to 2^vDest.y
over the limited range (0.0 <= vDest.y < 1.0).
This instruction accepts a scalar source, and reduced precision arithmetic is
acceptable in evaluating vDest.z. However, the approximation error must be less
than 1/(2^11) the absolute error (10-bit precision) and over the range (0.0 <=
t.y < 1.0). Also, expp returns 1.0 in w.
The following example illustrates how the expp instruction might be used. 
expp r5, r0
------------------------------------------------------
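
For reference, the operation quoted above can be emulated in a few lines of
Python. This is only a sketch of the documented behaviour, assuming the source
register is given as an (x, y, z, w) tuple; the function name is made up. The
point is that only the .w component of the source is ever read.

```python
import math
import struct

def expp_d3d8(src):
    """Emulate expp as described in the quoted D3D8 text; src is (x, y, z, w)."""
    w = src[3]                 # only the .w component of the source is read
    v = math.floor(w)
    # Reduced-precision 2**w: clear the low 8 bits of the float32 bit pattern,
    # mirroring the "& 0xffffff00" in the quoted fragment.
    bits = struct.unpack("<I", struct.pack("<f", 2.0 ** w))[0] & 0xFFFFFF00
    z = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Result layout: x = 2**floor(w), y = fractional part of w, z = approx. 2**w, w = 1
    return (2.0 ** v, w - v, z, 1.0)
```

Calling expp_d3d8 with two sources that differ only in x, y and z gives the
same result, which matches the "EXP ..., R3.w" reading.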

So, the correct translation should be "EXP R3.y, R3.w;"
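
As a rough illustration of the adjustment I have in mind (not actual wined3d
code, which is C and works on parsed instructions rather than strings), a
string-level sketch in Python could look like this. The function name and the
simple "uppercase the register name" mapping are assumptions made for the
example only.

```python
def _arb_reg(operand):
    """Hypothetical mapping: uppercase the register name, keep mask/swizzle as is."""
    name, dot, rest = operand.partition(".")
    return name.upper() + dot + rest

def translate_expp_vs1(dst, src):
    """Sketch: map a vs_1_x 'expp dst, src' line to an ARBvp EXP line."""
    name, dot, swizzle = src.partition(".")
    if not dot:
        component = "w"            # unswizzled source: D3D8 expp reads .w
    elif len(swizzle) == 1:
        component = swizzle        # replicate swizzle: already a scalar
    elif len(swizzle) == 4:
        component = swizzle[3]     # full swizzle: .w of the swizzled source
    else:
        raise ValueError("unsupported swizzle: %r" % src)
    return "EXP %s, %s.%s;" % (_arb_reg(dst), name.upper(), component)
```

For the line from the shader above, translate_expp_vs1("r3.y", "r3") yields
"EXP R3.y, R3.w;".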

CCing Stefan Dösinger and Henri Verbeet.

I tried to patch this myself, but it looks like "expp" has to be moved out
of shader_hw_map2gl to be able to make such an adjustment (but maybe not?).
