TestBot and graphics tests

Roderick Colenbrander thunderbird2k at gmail.com
Thu Apr 25 09:53:18 CDT 2019


On Sat, Apr 20, 2019 at 11:35 PM Francois Gouget
<fgouget at codeweavers.com> wrote:
>
>
> Here are some things I've learned about PCI-passthrough recently, which
> would be one way (probably the best) to add "real hardware" to the
> TestBot.
>
> I don't want to give anyone false hopes though: this just went from
> "this is a mysterious thing I need to learn about" to "I think I know
> how to do it but have not tried it yet".
>
> So graphics card PCI-passthrough is now relatively well documented on
> the Internet and seems to have seen some use-cases that would indicate
> it may even be reasonably usable.
>
> * There are two machines intended to run real GPU tests for Wine:
>   cw1-hd6800 and cw2-gtx560. For now they are only used to run WineTest
>   daily on Windows 8.1, Windows 10 1507, 1709, 1809 and Linux. That's
>   quite a bunch but it would be much better if they were integrated with
>   the TestBot as that would allow developers to submit their own tests.
>   So I had a look at what it would imply to convert them to VM hosts
>   using QEmu + PCI-passthrough.
>
> * First one needs a processor with hardware virtualisation support. For
>   Intel that's VT-d. Both machines have an Intel Core 2600 which
>   supports VT-d. Good.
>
> * Second the motherboard too needs to support VT-d. Both machines have
>   an ASRock P67 Extreme4 motherboard. Unfortunately UEFI says
>   "unsupported" next to the "VT-d" setting for the motherboard :-( It
>   looks like there was some confusion as to whether the P67 chipset
>   supported VT-d initially. From what I gathered it's only Q67 that does
>   but this caused some manufacturers, among which ASRock, to initially
>   claim support and later retract it.

From memory this ASRock board likely works okay. Back in the day we
were a very early adopter of VT-d/IOMMU, working closely with Intel
and Nvidia. Especially the first-generation i7 motherboards were very,
very buggy with VT-d: often not supporting it at all, or advertising
support but having bad bugs that prevented it from working. I had to
test dozens of motherboards. ASRock boards at that time generally worked.

> * Then one needs to add the intel_iommu=on option to the kernel command
>   line (resp. amd_iommu). This should make all the PCI devices appear
>   in /sys/kernel/iommu_groups. But that folder remains empty which
>   confirms that full VT-d support is missing.
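
For reference, a quick way to check whether the IOMMU is actually
active is to walk /sys/kernel/iommu_groups. A sketch (the
list_iommu_groups name is made up, and it assumes lspci is installed):

```shell
#!/bin/sh
# Sketch: list each IOMMU group and the PCI devices it contains.
# An empty /sys/kernel/iommu_groups means VT-d/AMD-Vi is not active,
# even with intel_iommu=on (or amd_iommu=on) on the kernel command line.
list_iommu_groups() {
    if [ -z "$(ls -A /sys/kernel/iommu_groups 2>/dev/null)" ]; then
        echo "No IOMMU groups: VT-d/AMD-Vi is missing or disabled"
        return 0
    fi
    for group in /sys/kernel/iommu_groups/*; do
        echo "IOMMU group ${group##*/}:"
        for dev in "$group"/devices/*; do
            # lspci -nns prints the device with its vendor:device IDs
            echo "  $(lspci -nns "${dev##*/}")"
        done
    done
}
list_iommu_groups
```

This is also useful later on: when passing a GPU through, every device
in the GPU's group has to go to the same VM, so the listing shows
whether the card is isolated well enough.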
>
> * Another important aspect is to have a graphics card which is
>   hot-restartable. In some cases, when a VM's graphics card has
>   crashed, the only way to reset it is to reboot the host. The TestBot
>   is likely to crash the graphics card, particularly if we do a hard
>   power-off on the VMs like we currently do, and it would really be
>   annoying to have to reboot the host every time the graphics card
>   goes belly up.
>   I don't know if the AMD HD6800 and Nvidia GTX560 are suitable but it's
>   quite possible they are not. All I know for now is that we should
>   avoid AMD's R9 line of graphics cards. I still need to find a couple
>   of suitable reasonably lower power graphics cards: one AMD and one
>   Nvidia.

AMD generally works fine. Nvidia, well, let's just say they are not
nice and work against virtualization on purpose. The driver has an
if-statement blocking non-professional cards. There are workarounds,
but it is a cat-and-mouse game, so don't bother with those. Just get a
"cheap" Quadro P1000 / P2000 card and avoid the hassle. (I do have
some special Nvidia virtualization-capable hardware left, but it is
dated by now. I think I have some special GeForce 460 / 480 / 560
cards and a Tesla model. If needed I could share some.)

For an AMD card, the main hassles are that some of them have "PCIe
reset" issues, which may prevent a VM from booting the card. AMD is
not like Nvidia trying to block virtualization on consumer cards.
Their Radeon Pro cards can sometimes be a little better. A cheap
Radeon Pro WX2100 is for example a fine card.
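
For completeness: the usual workaround for the consumer-driver check
(the infamous "error 43" in the guest) is to hide the hypervisor in
the libvirt domain XML; a sketch, with an arbitrary vendor_id value:

```xml
<!-- Fragment of the libvirt domain XML. Hiding KVM and overriding the
     Hyper-V vendor id keeps the Nvidia consumer driver from refusing
     to initialize. The vendor_id value is arbitrary. -->
<features>
  <kvm>
    <hidden state='on'/>
  </kvm>
  <hyperv>
    <vendor_id state='on' value='whatever'/>
  </hyperv>
</features>
```

But as said, it is a cat-and-mouse game, so a Quadro is the safer bet.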

> * Then one needs to prevent the host from using the graphics card.
>   Usually that's done by having the host use the processor's IGP and
>   dedicating the discrete GPU to the VMs. Unfortunately the 2600's IGP
>   cannot be active when there's a discrete card so that route is denied
>   to us. Fortunately there's quite a bit of documentation on how to shut
>   down not just X but also the Linux virtual consoles to free the GPU
>   and hand it over to the VMs after boot.
>   Doing so means losing KVM access to the host which is a bit annoying
>   in case something goes wrong. So ideally we'd make sure this does not
>   happen in grub's "safe mode" boot option.
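
The usual mechanism for that is to bind the card to vfio-pci before
the host drivers can claim it, via a modprobe.d fragment. A sketch;
the PCI IDs below are placeholders and would need to be replaced with
the vendor:device pairs `lspci -nn` reports for the actual card and
its HDMI audio function:

```
# /etc/modprobe.d/vfio.conf (sketch)
# Claim the GPU and its audio function for vfio-pci before the host
# drivers load. The IDs are placeholders: substitute the real ones.
options vfio-pci ids=1002:6738,1002:aa88
softdep radeon pre: vfio-pci
softdep nouveau pre: vfio-pci
```

After that the initramfs needs to be regenerated (update-initramfs -u
on Debian, dracut -f on Fedora) and the host rebooted.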

More for future boxes: I'm not sure if you have physical access to
these boxes or how they are maintained, but if you ever upgrade to a
new spec, I would almost go for systems with IPMI. It is not that
common on consumer boards, so you often need a workstation board. The
benefits are that you can remotely manage the systems (power on/off,
serial console, VGA...) and you also get a dumb VGA device you can
use. Of course a cheap card can work too, but you may like remote
management, and being able to just put a "farm" somewhere in a corner
without keyboard and monitor.

> * Although I have not done any test yet I'm reasonably certain that
>   PCI-passthrough rules out live snapshots: QEmu would have no way to
>   restore the graphics card's internal state.

Correct, hardware state is an issue. (For professional uses Nvidia
provides such a feature.) One workaround, which kind of worked at the
time, was to enter sleep mode, since drivers need to handle some state
recovery there. It worked for games, but it is probably not worth the
effort at all.

>
>   - For Windows VMs that's not an issue: if we provide a power off
>     snapshot the TestBot already knows how to power on the VM and wait
>     for it to boot (as long as the boot is shorter than the connection
>     timeout, which it usually is).
>
>   - For Linux VMs that's more of an issue: the TestBot will power on
>     the VM as usual. The problem is when it updates Wine: after
>     recompiling everything it deletes the old snapshot and creates a new
>     one from the current state of the VM, which means a live snapshot.
>     So the TestBot will need to be modified so it knows when and how to
>     power off the VM and take a powered off snapshot.
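
That last step could be as simple as a shutdown-and-wait wrapper
around virsh. A sketch (the function, domain and snapshot names are
made up, and the timeout is arbitrary):

```shell
#!/bin/sh
# Sketch: shut a libvirt domain down cleanly, then take a powered-off
# snapshot of it. Domain and snapshot names are placeholders.
take_cold_snapshot() {
    dom="$1"
    snap="$2"
    virsh shutdown "$dom"
    # Wait up to ~2 minutes for the guest to finish shutting down.
    for i in $(seq 1 24); do
        virsh domstate "$dom" | grep -q "shut off" && break
        sleep 5
    done
    virsh snapshot-create-as "$dom" "$snap"
}
# Usage: take_cold_snapshot build-wine-vm wine-updated
```
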
>
> * Since the VM has full control of the graphics card QEmu has no access
>   to the content of the screen. That's not an issue for the normal
>   TestBot operation, just for the initial VM setup. Fortunately the
>   graphics card is connected to a KVM so the screen can be accessed
>   through that means. It does mean assigning the mouse and keyboard to
>   the VM too. Should that prove impractical there are a bunch of other
>   options too: VNC, LookingGlass, Synergy, etc. But the less needs to be
>   installed in the VMs the better.
>
> * Also the TestBot uses QEmu to take the screenshots. But QEmu does not
>   have access to the content of the screen. The fix is to use a tool to
>   take the screenshots from within the VM and use TestAgent to retrieve
>   them. On Linux there are standard tools we can use. On Windows there's
>   code floating around we can use.
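
On the Linux side that can stay very small, e.g. a wrapper around
ImageMagick's import (xwd would work too). A sketch, assuming an X
session on display :0; the grab_screenshot name is made up:

```shell
#!/bin/sh
# Sketch: take a screenshot from inside the guest so TestAgent can
# retrieve the file afterwards. Assumes X on :0 and ImageMagick.
grab_screenshot() {
    out="${1:-/tmp/screenshot.png}"
    DISPLAY=:0 import -window root "$out"
}
# Usage: grab_screenshot /tmp/shot.png
```
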
>
>
> So the next steps would be:
> * Maybe test on my box using the builtin IGP.
>   But that likely won't be very conclusive beyond confirming the
>   snapshot issues, screen access, etc.
> * Find a suitable AMD or Nvidia graphics card and test that on my box.
>   That would allow me to fully test integration with the TestBot, check
>   for stability issues, etc.
> * Then see what can be done with the existing cw1 and cw2 boxes.
>

Overall, PCIe passthrough is definitely the way to go. We have used it
in a huge capacity for years and it works very well. I would recommend
using it here too.

Thanks,
Roderick



More information about the wine-devel mailing list