[PATCH] dnsapi: Add DnsGetCacheDataTable stub
Francois Gouget
fgouget at codeweavers.com
Sat Aug 31 04:47:42 CDT 2019
On Fri, 30 Aug 2019, Rémi Bernon wrote:
> On 8/30/19 3:03 PM, Marvin wrote:
> > Hi,
> >
> > While running your changed tests, I think I found new failures.
> > Being a bot and all I'm not very good at pattern recognition, so I might be
> > wrong, but could you please double-check?
> >
> > Full results can be found at:
> > https://testbot.winehq.org/JobDetails.pl?Key=56052
> >
> > Your paranoid android.
> >
> >
> > === build (build log) ===
> >
> > Task errors:
> > BotError: The VM is not powered on
> >
>
> I did a successful run with the same patch here:
> https://testbot.winehq.org/JobDetails.pl?Key=56051
Yes, here's what happened:
* When it has nothing to do the TestBot picks some VMs that it starts up
in advance in the hope they will be needed by the next job.
* Because the build VM is used to provide the Windows binaries for
testing on Windows, it's needed by almost every job. So it's given a
high priority and ends up being prepared in advance, and thus is
recorded by the TestBot as being in the idle state.
* But then there was a power outage so all the VMs got powered off.
* But the TestBot server is in a separate location and was not powered
off, so it was not aware that the VMs got powered off. The thing is
these days the Engine never calls libvirt itself because these calls are
blocking, which means that if it tries to communicate with a dead VM host
or one where libvirt is hosed, these calls can block for a long time (up
to 10 minutes), which would block the Engine for all that time.
Instead it assumes the information it has in its database about the VM
is accurate and forks a process whenever it needs to perform an
operation on a VM, whether that's running a task, shutting it down or
reverting it.
* So it just scheduled the task on the build VM as usual. But the
child process could not communicate with the VM, checked its state
and complained that there was an error because "The VM is not
powered on".
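To illustrate the fork-per-operation design described above, here is a minimal Python sketch. It is not the TestBot's actual code: the function names and the use of `subprocess` are assumptions made for illustration, and the child is a stand-in that only exits with a status code instead of talking to libvirt. The point is that the parent only starts children and polls them, so a child stuck on a blocking call never stalls the main loop.

```python
import subprocess
import sys


def start_vm_operation(vm_name, child_code):
    """Start a VM operation in a child process and return immediately.

    child_code is a stand-in for the real task script (hypothetical).
    """
    return subprocess.Popen([sys.executable, "-c", child_code, vm_name])


def reap(children):
    """Collect exit codes of finished children without blocking.

    children maps a VM name to its Popen object; finished entries
    are removed from the mapping and returned as {vm_name: exit_code}.
    """
    finished = {}
    for name, proc in list(children.items()):
        code = proc.poll()  # None while the child is still running
        if code is not None:
            finished[name] = code
            del children[name]
    return finished
```

A non-zero exit code from a child is where the parent would learn that something like "The VM is not powered on" happened, after the fact rather than by querying the VM itself.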
What's wrong is that it marked the task as failed. A better recovery
mechanism would have been to mark the VM as either "dirty" or "offline"
and put the task back in the queued state so the TestBot tries running it
again.
The risk is that if the VM is unusable for a reason other than an
external factor (such as the outage here), the next round is likely to
produce the same result, leading the TestBot to try to run the same
highest priority task again and again on the one borked VM.
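One way to get the requeue behaviour while bounding that risk is to count consecutive errors per VM. The sketch below is only an illustration under stated assumptions: the `VM` and `Task` classes, the status strings and the `MAX_VM_ERRORS` cap are all hypothetical and are not taken from the TestBot code, and the cap itself is not something proposed in this email.

```python
from dataclasses import dataclass

MAX_VM_ERRORS = 3  # hypothetical cap on consecutive errors for one VM


@dataclass
class VM:
    name: str
    status: str = "idle"
    errors: int = 0  # consecutive "not powered on" style errors


@dataclass
class Task:
    name: str
    status: str = "queued"


def handle_vm_not_powered_on(vm, task):
    """Requeue the task unless the VM keeps failing.

    Below the cap, the VM is marked dirty and the task goes back to
    the queue. At the cap, the VM is taken offline and the task is
    marked failed so the scheduler stops retrying it on this VM.
    """
    vm.errors += 1
    if vm.errors >= MAX_VM_ERRORS:
        vm.status = "offline"
        task.status = "failed"
    else:
        vm.status = "dirty"
        task.status = "queued"
```

A real implementation would also need to reset the counter once the VM runs a task successfully, which is left out here.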
Finally, the reason you won't see that job as failed if you look at it
now is that I restarted it. The user who submitted a job that failed
due to a TestBot error gets a button to restart it. A user can only
restart his own jobs and I'm not sure if that would have been possible
in this case since the job came from a wine-devel email (but the
administrator gets to restart anyone's jobs ;-).
Anyway I'll see about tweaking the task scripts to avoid this situation
in the future.
--
Francois Gouget <fgouget at codeweavers.com>