RFC How to get rid of "always new" TestBot false positives?

Fri Mar 27 12:23:20 CDT 2020

On 3/27/20 5:13 PM, Francois Gouget wrote:
> On Fri, 27 Mar 2020, Rémi Bernon wrote:
> [...]
>> IIUC the fix also involves updating the failing tests message to mark some
>> information as irrelevant to the failure itself. In this case, why not just
>> remove that information from the message?
> 
> Sometimes it's needed to diagnose the failure.
> 
> 
>> Because the test failure messages are used to identify individual tests and
>> track their results in time they should probably only include meaningful
>> information and if some additional information is still useful for debugging,
>> add a trace right before the test to print it.
> 
> That makes the tests more complicated: lots of ok() calls will need an
> associate trace() which will either push the WineTest reports above the
> 1.5 MB limit, or will require only tracing the value if the test fails:
> 
>       else
>           ok(!strcmp(buff, time), "Expected %s, got %s\n", time, buff);
> 
> ->   else
>       {
>           ok(!strcmp(buff, time), "Unexpected LOCALE_USER_DEFAULT date / format\n");
>           trace("Expected %s, got %s\n", time, buff);
>       }
> 
> or   else if (strcmp(buff, time))
>       {
>           ok(0, "Unexpected LOCALE_USER_DEFAULT date / format\n");
>           trace("Expected %s, got %s\n", time, buff);
>       }
> 
> I guess one could cheat by putting a linefeed before the variable parts.
> 
>       else
>           ok(!strcmp(buff, time), "Unexpected date / format\nExpected %s, got %s\n", time, buff);
> 
> It's kind of ugly and requires formulating the failure message in a
> specific way but it would work I guess.
> 
> 
> [...]
>> Maybe using the surrounding function name and the test line number relative to
>> it would be a good unique identifier instead. I think it would reduce the
>> possible false positives to when tests within a function are added or
>> reordered, which would be a good occasion to fix the failing tests.
> 
> This has some drawbacks:
> * It makes it hard for developers to jump to the right line based on
>    the failure message.
> 
> * It would be even harder for test.winehq.org to link to the right line
>    in the source.

You could also include the file and absolute line numbers as well, this 
was just to use as a unique test identifier instead of its message.

> 
> * The relative line numbers would change precisely when the test
>    function is patched, which is when we would like to avoid false
>    positives.

I thought most of the time the test failures were unrelated to the 
changes (well the test file may be modified, but not necessarily the 
function where the failure happens). Adding new tests after the existing 
ones within a function is also a possible mitigation to prevent changing 
ids.

> 
>> That would require some wrapper around test functions as well to track
>> relative line numbers, but it's maybe not as bad as going over every test
>> message to check and remove irrelevant information.
>>
>> A quick and dirty PoC: https://gcc.godbolt.org/z/3TELVD
>>
>> (I haven't thought a lot about it and there's maybe some obvious traps I
>> missed.)
> 
> We have cases where test_other_function() calls test_some_function() so
> declare_winetest() would need to handle nesting. And some functions, but
> not all, take a line number as a parameter. See for instance
> check_unicode_string_() in advapi32:lsa. I suspect the combination of
> both cases could be tricky to handle.
> 

Yeah of course it's a bit more tricky, not unfeasible though.

Another idea would be to use the stringified condition -combined with 
the surrounding context, as it's currently the case- as a test 
identifier instead of its message.
-- 
Rémi Bernon <rbernon at codeweavers.com>