[Bug 28422] New: scanf family of functions provides only 7 digits of precision for converting doubles and long doubles

Sat Sep 17 18:34:22 CDT 2011

http://bugs.winehq.org/show_bug.cgi?id=28422

           Summary: scanf family of functions provides only 7 digits of
                    precision for converting doubles and long doubles
           Product: Wine
           Version: 1.3.26
          Platform: x86
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: critical
          Priority: P2
         Component: msvcrt
        AssignedTo: wine-bugs at winehq.org
        ReportedBy: irwin at beluga.phys.uvic.ca

Created an attachment (id=36434)
 --> (http://bugs.winehq.org/attachment.cgi?id=36434)
Patch to greatly reduce numerical noise in scanf conversion of doubles and long
doubles

If I compile the test_scanf.c code attached below using "gcc test_scanf.c"
under MinGW/MSYS on wine-1.3.26 and run the resulting a.exe executable under
Wine, I get the following results:

bash.exe-3.1$ echo "1.1e-30" |./a.exe
1.1e-30 is input string
 1.1000004917384256455392e-030 = 3f9b64f8772f16505258 is long double value
 1.1000004917384256198806e-030 =  39b64f8772f16505 is double value
     1.1000004917384e-030: 13:  1.1000009834770454030908e-030 = 39b64f881a45deaf
    1.10000049173843e-030: 14:  1.1000009834770755310078e-030 = 39b64f881a45df5b
   1.100000491738426e-030: 15:  1.1000009834770715022747e-030 = 39b64f881a45df44
  1.1000004917384256e-030: 16:  1.1000009834770711519501e-030 = 39b64f881a45df42
 1.10000049173842562e-030: 17:  1.1000009834770711519501e-030 = 39b64f881a45df42

The numbered notes below annotate the output lines above to show what the
test application does.

1. Read a string from stdin (in this case 1.1e-30) and output the result.  

2. Transform that string to a long double value (80-bit floating point) using
sscanf with a format string of "%Le" and output that result in decimal and
hexadecimal. The decimal result (in this case 1.1000004917384256455392e-030)
immediately demonstrates the severe loss of significance in sscanf: the answer
has a relative error of ~5.e-7 rather than the expected relative error of
order ~1.e-20.  On output, the 64-bit mantissa in the hexadecimal
representation of the long double is shifted left by one bit to simulate the
hidden bit that occurs for double values.  This makes hex comparisons with the
corresponding double in output line 3 easier.

3. Transform the string to a double value (64-bit floating point) using sscanf
with a format string of "%le" and output the result in both decimal and
(unshifted) hexadecimal form.  The decimal form (1.1000004917384256198806e-030)
demonstrates a relative error of ~5.e-7 rather than the expected relative
error of order 1.e-16.

4-8.  Use sprintf to write the double-precision form of the number to a
character string in rounded form using a precision of "i" where i ranges from
13 to 17. Transform that string back to a double value (64-bit floating point)
using sscanf with a format string of "%le" and output the rounded string, the
value of i, and the double result in both decimal and (unshifted) hexadecimal
form.  This test indicates how much precision is required in the conversion
from double to a rounded string in order to read back a double that is the
same as the original double.  In this case sscanf provides consistent results
at a precision of 16 and beyond, but they are not the same as the original
double (39b64f881a45df42 versus 39b64f8772f16505) because of the large
numerical noise in the results from sscanf.

I then applied a patch (attached) to wine-1.3.26/dlls/msvcrt/scanf.h that fixes
this numerical precision issue for conversions of doubles and long doubles by
the scanf family of functions.  (Note, wine-1.3.26/dlls/msvcrt/scanf.h and
wine-1.3.28/dlls/msvcrt/scanf.h are identical so this patch should apply to
wine-1.3.28 as well.)  The patch makes sure all calculations during the
conversion are done in long double precision, with results scaled in such a
way that every calculation except a possible final multiplication or division
is done with integers stored in long double form.  These changes ensure exact
results for input numbers between 0 and 2^65 - 1 that can be represented
exactly in long double form, and results with minimal numerical noise in other
cases.

Here are the results of the same test with the patch applied:

bash.exe-3.1$ echo "1.1e-30" |./a.exe
1.1e-30 is input string
 1.0999999999999999999835e-030 = 3f9b64f86cb9cefaf7a0 is long double value
 1.0999999999999999165078e-030 =  39b64f86cb9cefaf is double value
     1.1000000000000e-030: 13:  1.0999999999999999165078e-030 = 39b64f86cb9cefaf
    1.10000000000000e-030: 14:  1.0999999999999999165078e-030 = 39b64f86cb9cefaf
   1.100000000000000e-030: 15:  1.0999999999999999165078e-030 = 39b64f86cb9cefaf
  1.0999999999999999e-030: 16:  1.0999999999999999165078e-030 = 39b64f86cb9cefaf
 1.09999999999999992e-030: 17:  1.0999999999999999165078e-030 = 39b64f86cb9cefaf

The results are vastly improved by the attached patch for the scanf family of
functions.  The relative numerical error for "%Le" conversion to long double
is reduced from ~5.e-7 to ~2.e-20, and the relative numerical error for "%le"
conversion to double is reduced from ~5.e-7 to ~1.e-16.  Furthermore, the
round-trip test of converting a double to rounded string form and then back to
double shows exact agreement. (For other tests with repeating decimal input
such as 1.111111111111111111111111111e-30, exact agreement was obtained for a
rounded precision of 16.)

Just as a matter of interest, here are the results for the same test
application (compiled with gcc for Linux) on Linux:

software at raven> echo "1.1e-30" |./a.out
1.1e-30 is input string
  1.0999999999999999999835e-30 = 3f9b64f86cb9cefaf7a0 is long double value
  1.0999999999999999165078e-30 =  39b64f86cb9cefaf is double value
      1.1000000000000e-30: 13:   1.0999999999999999165078e-30 = 39b64f86cb9cefaf
     1.10000000000000e-30: 14:   1.0999999999999999165078e-30 = 39b64f86cb9cefaf
    1.100000000000000e-30: 15:   1.0999999999999999165078e-30 = 39b64f86cb9cefaf
   1.0999999999999999e-30: 16:   1.0999999999999999165078e-30 = 39b64f86cb9cefaf
  1.09999999999999992e-30: 17:   1.0999999999999999165078e-30 = 39b64f86cb9cefaf

This is identical to the patched Wine result except for the 2-digit
exponents.

When I created JPL binary ephemerides (consisting of roughly 1.5GB of doubles)
from JPL ASCII ephemerides with my ephcom-2.0.2 software on Linux and on
Wine-1.3.26 built with the attached patch, I got exact agreement between the
double data for the Linux- and Wine-produced binary ephemerides in most cases.
However, 0.04 per cent of the time the double values on the two platforms
differed at the 1.e-16 relative difference level.  That is consistent with
the scanf errors in the two cases differing on average by one in the last bit
of the long double representation.  Because the double mantissa has 11 fewer
bits (counting the hidden bit), such a small difference should by chance
propagate to the double representation about once in 2^11 conversions (about
0.05 per cent of the time), which is close to the observed 0.04 per cent.  So
I feel this patched scanf family of functions does very well against the
Linux equivalent.

After a lot of thought I have classified this apparently long-standing bug as
"critical".  The reason for that classification is that the scanf family is a
fundamental building block for any platform, and the numerical precision of
the scanf family for double values is critical to a lot of applications (such
as ephcom-2.0.2 where I first discovered the issue).  One could argue that
this is not critical because a workaround is available (Dan Kegel noted this
on the wine-devel list): using Microsoft's version of msvcrt.dll rather than
the Wine version.  However, there is a chicken/egg problem here.  Presumably
some of the apps that use the Microsoft version of msvcrt.dll do so because
the scanf numerical precision bug in Wine's version is giving them trouble.
In any case, applying this patch is fundamental to Wine as an independent
platform, so I chose "critical" as a first approximation, subject of course
to any reclassification Wine developers want to make for this bug.
