Is the test.winehq.org front page too pessimistic?

Thu Feb 12 01:28:09 CST 2009

Juan Lang wrote:
> The front page of test.winehq.org shows statistics about failed tests,
> but it doesn't seem to take into account the number of individual
> tests that passed and failed, rather the number of files that had any
> failures.
> 
> So, for example, about a week ago I got a fix committed for some
> failing mapi32 tests.  Looking at the machines with test failures,
> before the fix was committed, 139 tests were run, with 134 of them
> failing, whereas after the fix was committed, the same number of tests
> were run, with only 6 of them failing.  Nonetheless, the 4th of
> February shows a higher failure rate (14.6%) than the 3rd of February
> (12.4%).
> 
> I know other tests could have started failing in the interim, but it
> seems like we've been putting a fair amount of effort into reducing
> test failures lately, while the percent of failed tests isn't going
> down, at least not on the main page.  If you look at a particular
> day's results, the numbers look a bit better over time.
> 
> I'm not sending a patch, because there may be different opinions on
> this.  That is, perhaps some people like to see a statistic on the
> number of files with failing tests on any machine, which the front
> page appears to show, while others may like to see the number of
> failures in a particular file, which a day's results show.  My own
> opinion is that it's hard to get motivated to fix something without
> some sort of positive feedback for it, so changing the front page
> would be better.
> 
> My own feeling is that there are far fewer failing tests now than
> there used to be, and I'd sure like to see that reflected somewhere at
> a quick glance.  Thoughts?
> --Juan
> 
> 
> 
I don't think that showing individual test counts (the actual counts inside
each dll:name test) will help, as the error rate would be marginal (as pointed
out by AJ).

If you look at the main page you will see a number for Win95, for example. This
number shows how many dll:name tests had a failure on one or more Win95 boxes.
This means that a single box can mess up the platform's stats quite badly. If I
have 10 Win95 boxes with no failures and one with all dll:name tests failing,
the failure count for that platform would equal the total number of tests
(dll:name again), i.e. a 100% failure rate.
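
To make the per-platform counting concrete, here is a minimal Python sketch. It
is not the actual test.winehq.org code; the function name, the box names and
the four dll:name tests are made up for illustration. A test counts as failed
for the platform as soon as it fails on any one box.

```python
def platform_failures(boxes):
    """Return the dll:name tests that failed on at least one box.

    boxes maps a (hypothetical) box name to the set of dll:name tests
    that had any failure on that box.
    """
    failed = set()
    for failed_tests in boxes.values():
        failed |= failed_tests
    return failed


# Hypothetical data: four dll:name tests, ten clean boxes, one bad box.
all_tests = {"advapi32:registry", "kernel32:file", "mapi32:prop", "user32:msg"}
win95 = {f"box{i}": set() for i in range(10)}
win95["bad_box"] = set(all_tests)

failed = platform_failures(win95)
rate = 100.0 * len(failed) / len(all_tests)
print(len(failed), rate)
```

With ten clean boxes and one box failing everything, every test still ends up
in the failed set, which is the effect described above.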

The cumulative 'Failures' number, however, is calculated differently. It is
simply the sum of the 'overall platform failures' for each platform, divided by
('different platforms on that line on the main page' x 'total number of unique
dll:name tests').
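
As a rough sketch of that calculation (again not the real site code; the
function name and the platform counts below are invented for the example):

```python
def cumulative_failure_rate(platform_failure_counts, total_unique_tests):
    """Sum of per-platform failure counts divided by
    (number of platforms x number of unique dll:name tests), as a percentage.

    platform_failure_counts maps a platform name to the number of
    dll:name tests that failed on one or more boxes of that platform.
    """
    n_platforms = len(platform_failure_counts)
    if n_platforms == 0 or total_unique_tests == 0:
        return 0.0
    total_failures = sum(platform_failure_counts.values())
    return 100.0 * total_failures / (n_platforms * total_unique_tests)


# Hypothetical numbers: three platforms, 400 unique dll:name tests.
counts = {"Win95": 30, "WinXP": 10, "Vista": 20}
cumulative = cumulative_failure_rate(counts, 400)
print(cumulative)
```

In effect this is the average of the per-platform failure rates, so one bad
platform is diluted by the others.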

So maybe (and it's been discussed in the past) the 'Failures' number should be
the number of unique tests (dll:name) that fail on one or more boxes (just like
the platform ones, but then overall).

That way we get an indication of how many dll:name tests need some fixing. It
won't do our figures any good, however. My 6 boxes, for example, show a 5.0%
failure rate on the main page, but with this other approach it would have been
14.7%.
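
The alternative can be sketched the same way. This is only an illustration
under assumed data (the platform, box and test names and the total of 10 tests
are hypothetical), comparing the union-based number with the averaged one:

```python
def unique_failure_rate(platforms, total_unique_tests):
    """Percentage of unique dll:name tests failing on one or more boxes.

    platforms maps platform name -> {box name -> set of failing tests}.
    """
    failing = set()
    for boxes in platforms.values():
        for failed_tests in boxes.values():
            failing |= failed_tests
    return 100.0 * len(failing) / total_unique_tests


def averaged_failure_rate(platforms, total_unique_tests):
    """Sum of per-platform failure counts over (platforms x tests)."""
    per_platform = [len(set().union(*boxes.values()))
                    for boxes in platforms.values()]
    return 100.0 * sum(per_platform) / (len(platforms) * total_unique_tests)


# Hypothetical results for 10 unique dll:name tests.
platforms = {
    "Win95": {"box1": {"kernel32:file"}, "box2": set()},
    "WinXP": {"box3": {"user32:msg"}},
    "Vista": {"box4": {"kernel32:file", "mapi32:prop"}},
}
unique_rate = unique_failure_rate(platforms, 10)
averaged_rate = averaged_failure_rate(platforms, 10)
print(unique_rate, averaged_rate)
```

Since each platform's failing set is a subset of the overall union, the
union-based number can never come out lower than the averaged one.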

So I don't think our numbers are too pessimistic. We are bitten by the fact that
more and more people run winetest, which of course raises the chance of failing
tests (non-administrator accounts, different locales, no C: drive, ...).

-- 
Cheers,

Paul.