Here are some things I've learned about PCI-passthrough recently, which would be one way (probably the best) to add "real hardware" to the TestBot.
I don't want to give anyone false hopes though: this just went from "this is a mysterious thing I need to learn about" to "I think I know how to do it but have not tried it yet".
So graphics card PCI-passthrough is now relatively well documented on the Internet and seems to have seen some use-cases that would indicate it may even be reasonably usable.
* There are two machines intended to run real GPU tests for Wine: cw1-hd6800 and cw2-gtx560. For now they are only used to run WineTest daily on Windows 8.1, Windows 10 1507, 1709, 1809 and Linux. That's quite a bunch but it would be much better if they were integrated with the TestBot as that would allow developers to submit their own tests. So I had a look at what it would imply to convert them to VM hosts using QEmu + PCI-passthrough.
* First one needs a processor with hardware virtualisation support. For Intel that's VT-d. Both machines have an Intel Core 2600 which supports VT-d. Good.
* Second the motherboard too needs to support VT-d. Both machines have an ASRock P67 Extreme4 motherboard. Unfortunately UEFI says "unsupported" next to the "VT-d" setting for the motherboard :-( It looks like there was some confusion as to whether the P67 chipset supported VT-d initially. From what I gathered it's only Q67 that does but this caused some manufacturers, among which ASRock, to initially claim support and later retract it.
* Then one needs to add the intel_iommu=on option to the kernel command line (resp. amd_iommu). This should make all the PCI devices appear in /sys/kernel/iommu_groups. But that folder remains empty which confirms that full VT-d support is missing.
* Another important aspect is to have a graphics card which is hot-restartable. In some cases when a VM's graphics card is crashed the only way to reset it is to reboot the host. The TestBot is likely to crash the graphics card, particularly if we do a hard-power off on the VMs like we currently do, and it would really be annoying to have to reboot the host every time the graphics card goes belly up. I don't know if the AMD HD6800 and Nvidia GTX560 are suitable but it's quite possible they are not. All I know for now is that we should avoid AMD's R9 line of graphics cards. I still need to find a couple of suitable reasonably lower power graphics cards: one AMD and one Nvidia.
* Then one needs to prevent the host from using the graphics card. Usually that's done by having the host use the processor's IGP and dedicating the discrete GPU to the VMs. Unfortunately the 2600's IGP cannot be active when there's a discrete card so that route is denied to us. Fortunately there's quite a bit of documentation on how to shut down not just X but also the Linux virtual consoles to free the GPU and hand it over to the VMs after boot. Doing so means losing KVM access to the host which is a bit annoying in case something goes wrong. So ideally we'd make sure this does not happen in grub's "safe mode" boot option.
* Although I have not done any test yet I'm reasonably certain that PCI-passthrough rules out live snapshots: QEmu would have no way to restore the graphics card's internal state.
- For Windows VMs that's not an issue: if we provide a power off snapshot the TestBot already knows how to power on the VM and wait for it to boot (as long as the boot is shorter than the connection timeout but it works out usually).
- For Linux VMs that's more of an issue: the TestBot will power on the VM as usual. The problem is when it updates Wine: after recompiling everything it deletes the old snapshot and creates a new one from the current state of the VM, which means a live snapshot. So the TestBot will need to be modified so it knows when and how to power off the VM and take a powered off snapshot.
* Since the VM has full control of the graphics card QEmu has no access to the content of the screen. That's not an issue for the normal TestBot operation, just for the initial VM setup. Fortunately the graphics card is connected to a KVM so the screen can be accessed through that means. It does mean assigning the mouse and keyboard to the VM too. Should that prove impractical there are a bunch of other options too: VNC, LookingGlass, Synergy, etc. But the less needs to be installed in the VMs the better.
* Also the TestBot uses QEmu to take the screenshots. But QEmu does not have access to the content of the screen. The fix is to use a tool to take the screenshots from within the VM and use TestAgent to retrieve them. On Linux there are standard tools we can use. On Windows there's code floating around we can use.
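The IOMMU group check mentioned above can be scripted. What follows is only a minimal Python sketch, not TestBot code: the function name list_iommu_groups is made up for illustration, and it assumes the /sys/kernel/iommu_groups/<group>/devices/<pci-address> layout that the kernel exposes when the IOMMU is active. An empty result corresponds to the empty folder described above.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the PCI addresses it contains.

    The function name and the 'root' parameter are illustrative.
    An empty dict means the kernel set up no IOMMU groups.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        # Skip anything that does not look like a numbered group directory
        if not (group_dir.name.isdigit() and devices_dir.is_dir()):
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    found = list_iommu_groups()
    if not found:
        print("No IOMMU groups found: the IOMMU does not appear to be active")
    for number, devices in found.items():
        print(f"group {number}: {' '.join(devices)}")
```

For passthrough the interesting part is which devices share a group with the graphics card, since a group is normally handed to a VM as a whole.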
So the next steps would be:
* Maybe test on my box using the builtin IGP. But that likely won't be very conclusive beyond confirming the snapshot issues, screen access, etc.
* Find a suitable AMD or Nvidia graphics card and test that on my box. That would allow me to fully test integration with the TestBot, check for stability issues, etc.
* Then see what can be done with the existing cw1 and cw2 boxes.
Virgl is an alternative to PCI passthrough. To summarize it's a virtualized graphics card that passes the OpenGL commands to one of the host's graphics card to render into the VM's framebuffer.
Advantages:
* It does not require the same level of hardware support as PCI-passthrough. As long as QEmu works you can have Virgl support.
* You can also have multiple VMs all using the host's graphics card at the same time while still getting hardware accelerated OpenGL support in the VMs.
Drawbacks:
* There's more overhead than for PCI-passthrough but that would not be a significant issue for WineTest.
* It introduces a middle layer and thus the potential for new and interesting bugs.
* Vulkan is not supported (there's work on basing it on Vulkan but that looks like many months away).
Anyway, here's what I learned when I investigated whether Virgl could be useful for WineTest.
* First Virgl currently does not support remote access: the client (e.g. virt-manager) must run on the same machine as the VM. That's in the process of being fixed. But in the meantime this issue can be worked around by using the KVM.
* Currently Virgl does not really support Windows guests. It looks like there is work on this but I think it's going to be some time before it is usable. https://lwn.net/Articles/767970/
* It's also impossible to take snapshots of live VMs. It's not entirely clear whether this is a bug or made impossible because of the host-side state. In any case it means that for now Virgl is no better than PCI-passthrough in this respect and will require the same TestBot changes.
* Debian 9 (stable) does not have the needed libvirglrenderer0 package so Virgl support is missing from QEmu. Installing Debian Testing's QEmu packages has wide-ranging cascading impacts so for now Virgl cannot be tested on the TestBot machines :-( Between this and Mesa maybe the TestBot machines should be upgraded to Debian Testing.
* I did test Virgl on my box. WineTest ran to completion but then failed to upload the report due to some Wine-level thread deadlock! That's a bit suspicious. I manually completed the upload and the results can be seen in the '-virgl' results. They're not terrible but it's not clear that Virgl on Intel gives better results than plain QXL.
https://test.winehq.org/data/f9301c2b66450a1cdd986e9052fcaa76535ba8b7/linux_...
On Sun, 21 Apr 2019 at 11:05, Francois Gouget wrote:
- Then one needs to prevent the host from using the graphics card. Usually that's done by having the host use the processor's IGP and dedicating the discrete GPU to the VMs. Unfortunately the 2600's IGP cannot be active when there's a discrete card so that route is denied to us. Fortunately there's quite a bit of documentation on how to shut down not just X but also the Linux virtual consoles to free the GPU and hand it over to the VMs after boot. Doing so means losing KVM access to the host which is a bit annoying in case something goes wrong. So ideally we'd make sure this does not happen in grub's "safe mode" boot option.
Another option would be to add an inexpensive card just for that purpose.
On Sat, Apr 20, 2019 at 11:35 PM Francois Gouget wrote:
Here are some things I've learned about PCI-passthrough recently, which would be one way (probably the best) to add "real hardware" to the TestBot.
I don't want to give anyone false hopes though: this just went from "this is a mysterious thing I need to learn about" to "I think I know how to do it but have not tried it yet".
So graphics card PCI-passthrough is now relatively well documented on the Internet and seems to have seen some use-cases that would indicate it may even be reasonably usable.
There are two machines intended to run real GPU tests for Wine: cw1-hd6800 and cw2-gtx560. For now they are only used to run WineTest daily on Windows 8.1, Windows 10 1507, 1709, 1809 and Linux. That's quite a bunch but it would be much better if they were integrated with the TestBot as that would allow developers to submit their own tests. So I had a look at what it would imply to convert them to VM hosts using QEmu + PCI-passthrough.
First one needs a processor with hardware virtualisation support. For Intel that's VT-d. Both machines have an Intel Core 2600 which supports VT-d. Good.
Second the motherboard too needs to support VT-d. Both machines have an ASRock P67 Extreme4 motherboard. Unfortunately UEFI says "unsupported" next to the "VT-d" setting for the motherboard :-( It looks like there was some confusion as to whether the P67 chipset supported VT-d initially. From what I gathered it's only Q67 that does but this caused some manufacturers, among which ASRock, to initially claim support and later retract it.
From memory this Asrock board likely works okay. Back in the days we were a very early adopter of Vt-d/iommu working closely with Intel / Nvidia. Especially first gen i7 motherboards were very, very buggy with vt-d. Often not supporting it or else advertising support and having bad bugs preventing it from working. I had to test dozens of motherboards. Asrock at that time generally worked.
Then one needs to add the intel_iommu=on option to the kernel command line (resp. amd_iommu). This should make all the PCI devices appear in /sys/kernel/iommu_groups. But that folder remains empty which confirms that full VT-d support is missing.
Another important aspect is to have a graphics card which is hot-restartable. In some cases when a VM's graphics card is crashed the only way to reset it is to reboot the host. The TestBot is likely to crash the graphics card, particularly if we do a hard-power off on the VMs like we currently do, and it would really be annoying to have to reboot the host every time the graphics card goes belly up. I don't know if the AMD HD6800 and Nvidia GTX560 are suitable but it's quite possible they are not. All I know for now is that we should avoid AMD's R9 line of graphics cards. I still need to find a couple of suitable reasonably lower power graphics cards: one AMD and one Nvidia.
AMD generally works fine. Nvidia, well, let's just say they are not nice and purposely work against virtualization. The driver has an if-statement blocking non-professional cards. There are workarounds, but it is a cat and mouse game. Don't bother with these. Just get a "cheap" Quadro P1000 / P2000 card and avoid the hassles. (I do have some special Nvidia virtualization capable hardware left, but it is dated by now. I think I have some special Geforce 460 / 480 / 560 and a Tesla model. If needed I could share some)
For an AMD card, the main hassles are that some of them have "PCIe reset" issues, which may prevent a VM from booting the card. AMD is not like Nvidia trying to block virtualization on consumer cards. Their Radeon Pro cards can sometimes be a little better. A cheap Radeon Pro WX2100 is for example a fine card.
- Then one needs to prevent the host from using the graphics card. Usually that's done by having the host use the processor's IGP and dedicating the discrete GPU to the VMs. Unfortunately the 2600's IGP cannot be active when there's a discrete card so that route is denied to us. Fortunately there's quite a bit of documentation on how to shut down not just X but also the Linux virtual consoles to free the GPU and hand it over to the VMs after boot. Doing so means losing KVM access to the host which is a bit annoying in case something goes wrong. So ideally we'd make sure this does not happen in grub's "safe mode" boot option.
More for future boxes: I'm not sure if you have physical access to these boxes or how they are maintained. If you ever upgrade to a new spec, I would go for systems with IPMI if you can. It is not that common on consumer boards, so you often need a workstation board. The benefits are that you can remotely manage the systems (power on/off, serial console, VGA...) and you also have a dumb VGA you can use. Of course a cheap card can work too, but you may like having remote management and being able to just put a "farm" somewhere in a corner without keyboard and monitor.
- Although I have not done any test yet I'm reasonably certain that PCI-passthrough rules out live snapshots: QEmu would have no way to restore the graphics card's internal state.
Correct, hardware state is an issue. (For professional uses Nvidia provides such a feature). One workaround, which kind of worked at the time, was to enter sleep mode, in which drivers need to handle some state recovery. It worked for games, but is probably not worth the effort at all.
For Windows VMs that's not an issue: if we provide a power off snapshot the TestBot already knows how to power on the VM and wait for it to boot (as long as the boot is shorter than the connection timeout but it works out usually).
For Linux VMs that's more of an issue: the TestBot will power on the VM as usual. The problem is when it updates Wine: after recompiling everything it deletes the old snapshot and creates a new one from the current state of the VM, which means a live snapshot. So the TestBot will need to be modified so it knows when and how to power off the VM and take a powered off snapshot.
Since the VM has full control of the graphics card QEmu has no access to the content of the screen. That's not an issue for the normal TestBot operation, just for the initial VM setup. Fortunately the graphics card is connected to a KVM so the screen can be accessed through that means. It does mean assigning the mouse and keyboard to the VM too. Should that prove impractical there are a bunch of other options too: VNC, LookingGlass, Synergy, etc. But the less needs to be installed in the VMs the better.
Also the TestBot uses QEmu to take the screenshots. But QEmu does not have access to the content of the screen. The fix is to use a tool to take the screenshots from within the VM and use TestAgent to retrieve them. On Linux there are standard tools we can use. On Windows there's code floating around we can use.
So the next steps would be:
- Maybe test on my box using the builtin IGP. But that likely won't be very conclusive beyond confirming the snapshot issues, screen access, etc.
- Find a suitable AMD or Nvidia graphics card and test that on my box. That would allow me to fully test integration with the TestBot, check for stability issues, etc.
- Then see what can be done with the existing cw1 and cw2 boxes.
Overall pcie passthrough is definitely the way to go. We have used it in a huge capacity for years and it works very well. I would recommend using it here too.
Thanks, Roderick
----- On Apr 25, 2019, at 4:53 PM, Roderick Colenbrander wrote:
On Sat, Apr 20, 2019 at 11:35 PM Francois Gouget wrote:
- Second the motherboard too needs to support VT-d. Both machines have an ASRock P67 Extreme4 motherboard. Unfortunately UEFI says "unsupported" next to the "VT-d" setting for the motherboard :-( It looks like there was some confusion as to whether the P67 chipset supported VT-d initially. From what I gathered it's only Q67 that does but this caused some manufacturers, among which ASRock, to initially claim support and later retract it.
From memory this Asrock board likely works okay. Back in the days we were a very early adopter of Vt-d/iommu working closely with Intel / Nvidia. Especially first gen i7 motherboards were very, very buggy with vt-d. Often not supporting it or else advertising support and having bad bugs preventing it from working. I had to test dozens of motherboards. Asrock at that time generally worked.
Thanks, Roderick
Not got any huge experience with exactly this motherboard, but the first bios'es available for those chipsets (P67) was probably a bit sketchy. You might consider checking if there is a newer/beta bios to test perhaps? (With any of the problems that might cause ofc). https://www.asrock.com/mb/Intel/P67 Extreme4/#BIOS
I guess UEFI is somewhat troublesome even if "overriding" this with kernel options when UEFI reports "NA" for the function. I dunno if you can "force" it like that anyway?
As long as it's not a "K" processor, the 2600 is working with vt-d, where the 2600K does not (as you probably are aware of).
Sveinar
On Fri, 26 Apr 2019, Sveinar Søpler wrote: [...]
Not got any huge experience with exactly this motherboard, but the first bios'es available for those chipsets (P67) was probably a bit sketchy. You might consider checking if there is a newer/beta bios to test perhaps? (With any of the problems that might cause ofc). https://www.asrock.com/mb/Intel/P67 Extreme4/#BIOS
Ours is a variant, the P67 Extreme4 Gen3 and as far as I can tell we have the latest BIOS: P2.20
https://www.asrock.com/MB/Intel/P67%20Extreme4%20Gen3/index.asp#BIOS
As long as it's not a "K" processor, the 2600 is working with vt-d, where the 2600K does not (as you probably are aware of).
It is indeed an i7-2600K and you're right, it does not support VT-d. So the 'Intel Virtualization Technology: Enabled' in UEFI must be referring to VT-x. Which means the issue with VT-d may not actually come from the motherboard.
I attached UEFI screenshots for reference.
So anyway... we'll need new hardware.
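Since VT-x and VT-d are easy to mix up, here is a rough Python sketch that reports them separately on a Linux host. It rests on a few assumptions that are not from this thread: the 'vmx' (Intel VT-x) and 'svm' (AMD-V) flags in /proc/cpuinfo, the ACPI DMAR table being visible as /sys/firmware/acpi/tables/DMAR when the firmware advertises VT-d (AMD systems use a different table which is not covered here), and the function names, which are made up for illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU found in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def virtualization_report(cpuinfo_text, dmar_table_present, iommu_group_count):
    """Summarize CPU virtualization and IOMMU status as three booleans."""
    flags = cpu_flags(cpuinfo_text)
    return {
        # vmx / svm: enough to run KVM guests, but says nothing about VT-d
        "cpu_virtualization": bool(flags & {"vmx", "svm"}),
        # Assumption: Intel firmware advertises VT-d through the DMAR table
        "firmware_reports_vtd": bool(dmar_table_present),
        # Non-empty /sys/kernel/iommu_groups means the IOMMU is in use
        "iommu_active": iommu_group_count > 0,
    }


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    groups = Path("/sys/kernel/iommu_groups")
    count = len(list(groups.iterdir())) if groups.is_dir() else 0
    dmar = Path("/sys/firmware/acpi/tables/DMAR").exists()
    print(virtualization_report(text, dmar, count))
```

A host where only cpu_virtualization is true would match the situation above: QEmu/KVM guests run fine but PCI-passthrough is not available.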
----- On Apr 27, 2019, at 6:39 PM, Francois Gouget wrote:
As long as it's not a "K" processor, the 2600 is working with vt-d, where the 2600K does not (as you probably are aware of).
It is indeed an i7-2600K and you're right, it does not support VT-d. So the 'Intel Virtualization Technology: Enabled' in UEFI must be referring to VT-x. Which means the issue with VT-d may not actually come from the motherboard.
I attached UEFI screenshots for reference.
So anyway... we'll need new hardware.
-- Francois Gouget
Yeah, you need vt-d to get the "PCI passthrough" function, which is what you are aiming for, and the "K" does not have that according to the Intel spec:
Intel® Virtualization Technology for Directed I/O (VT-d) = NO https://ark.intel.com/content/www/us/en/ark/products/52214/intel-core-i7-260...
Sveinar
On Thu, 25 Apr 2019, Roderick Colenbrander wrote: [...]
From memory this Asrock board likely works okay. Back in the days we were a very early adopter of Vt-d/iommu working closely with Intel / Nvidia.
It turns out it's the processor that does not support VT-d. So Asrock is off the hook.
[...]
For an AMD card, the main hassles are that some of them have "PCIe reset" issues, which may prevent a VM from booting the card. AMD is not like Nvidia trying to block virtualization on consumer cards. Their Radeon Pro cards can sometimes be a little better. A cheap Radeon Pro WX2100 is for example a fine card.
I found The Passthrough Post website which has a list of compatible hardware. I think I'll do some tests here with a Gigabyte RX 550 D5 and then we can pick something from that range.
https://passthroughpo.st/vfio-increments/
On the Nvidia side I don't know if picking a Quadro instead of a consumer graphics card would have an impact on the Wine tests.
Otherwise it seems the code 43 error and MSI interrupt issues are reasonably well understood and can reliably be worked around these days.
On Sat, 27 Apr 2019 at 21:23, Francois Gouget wrote:
On Thu, 25 Apr 2019, Roderick Colenbrander wrote: [...]
For an AMD card, the main hassles are that some of them have "PCIe reset" issues, which may prevent a VM from booting the card. AMD is not like Nvidia trying to block virtualization on consumer cards. Their Radeon Pro cards can sometimes be a little better. A cheap Radeon Pro WX2100 is for example a fine card.
I found The Passthrough Post website which has a list of compatible hardware. I think I'll do some tests here with a Gigabyte RX 550 D5 and then we can pick something from that range.
https://passthroughpo.st/vfio-increments/
On the Nvidia side I don't know if picking a Quadro instead of a consumer graphics card would have an impact on the Wine tests.
I imagine the considerations in question are different between Nouveau and the proprietary drivers.
On Sat, Apr 27, 2019 at 9:52 AM Francois Gouget wrote:
On Thu, 25 Apr 2019, Roderick Colenbrander wrote: [...]
From memory this Asrock board likely works okay. Back in the days we were a very early adopter of Vt-d/iommu working closely with Intel / Nvidia.
It turns out it's the processor that does not support VT-d. So Asrock is off the hook.
[...]
For an AMD card, the main hassles are that some of them have "PCIe reset" issues, which may prevent a VM from booting the card. AMD is not like Nvidia trying to block virtualization on consumer cards. Their Radeon Pro cards can sometimes be a little better. A cheap Radeon Pro WX2100 is for example a fine card.
I found The Passthrough Post website which has a list of compatible hardware. I think I'll do some tests here with a Gigabyte RX 550 D5 and then we can pick something from that range.
https://passthroughpo.st/vfio-increments/
On the Nvidia side I don't know if picking a Quadro instead of a consumer graphics card would have an impact on the Wine tests.
Otherwise it seems the code 43 error and MSI interrupt issues are reasonably well understood and can reliably be worked around these days.
They continue to tighten the checks in the drivers. You need to disable high performance timers and other desirable features (though we don't care that much about performance). It is just very likely to break quickly and is an uphill battle.
The Quadros should be similar enough. It is mostly presets for enterprise apps and here and there improved hardware e.g. some have ECC or more fp64. (ECC could be nice as Geforce cards are not meant for 24/7 operation, Quadros can handle it better. We found out the hard way..)
If you are also looking at upgrading the system itself, I would suggest these days to take a nice AMD Threadripper system as it has plenty of PCIe. Put in multiple GPUs and you have multiple test boxes e.g. 4 in one. AMD's iommu works well too.
Thanks, Roderick
On Sun, 21 Apr 2019, Francois Gouget wrote:
Here are some things I've learned about PCI-passthrough recently, which would be one way (probably the best) to add "real hardware" to the TestBot.
I have finally done some tests with PCI-passthrough on my box and got it working with a Windows 10 VM outside of the TestBot.
Hardware: i7-4790K + Asus Z97-A + AMD RX 550
Software: Debian 10 + kernel 5.5.0-0.bpo.2-amd64 + QEmu 5.0-14~bpo10+1
Things I learned:
* Everyone recommends using OVMF, QEmu's UEFI BIOS. Currently that's totally useless for the TestBot: not only is it impossible to take live snapshots with OVMF, you cannot even take snapshots of the powered off VM!!! And that's even before PCI-passthrough enters the picture. Such a VM would be no better than running the tests on the bare metal in terms of getting back to a clean state.
* But getting PCI-passthrough going with OVMF was indeed easier.
* It's possible to combine the QXL+Spice screen and PCI-passthrough for a dual-GPU VM configuration. The benefit is you at least get the QXL screen and then can work out the kinks for the extra GPU.
* In this configuration you can also use the host's keyboard and mouse although things will be wonky as soon as you extend your screen with the second GPU. The reason is that you normally have to exit your main screen by the side to get on the second one. But in this configuration that just gets your mouse out of the Spice window. So what Spice/QEmu seems to be doing is matching the left (resp. right) edge of your Spice window to the left (right) edge of your leftmost (rightmost) screen. So the mouse switches to the second screen somewhere in the middle which means all clicks are offset from the mouse pointer. Yuck! So look for hovering highlights and learn to use keyboard shortcuts.
* Once you remove the QXL+Spice screen the VM will be headless because it does not know there is still a VGA device (that's the part OVMF handles better). The fix is to manually edit the VM's XML file to pass x-vga=on on the right device:
-<domain type='kvm'>
+<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
 ...
+<qemu:commandline>
+  <qemu:arg value='-set'/>
+  <qemu:arg value='device.hostdev0.x-vga=on'/>
+</qemu:commandline>
Where 'hostdev0' is the GPU (and hostdev1 is the matching audio card).
* With that configuration it's possible to take snapshots of the powered off VM. So that's something the TestBot can work with... eventually.
* You can also remove the ich9 audio device and use the graphics card one instead. There's even an option (somewhere) to make it work through a DVI-to-HDMI cable (my other cables at hand were too short).
* I also did tests with a Debian 10 VM but those attempts were thrown off by:
- The x-vga issue above.
- My old screen which simply does not work with the AMD graphics card (and one other laptop out of two, except when going through a VGA adapter).
- Probably some GPU driver setup issues (initially I was missing firmware-linux-nonfree).
So I'll have to retry.
* During my Debian 10 tests I ended up crashing the RX 550 such that I got a QEmu error on the host side and could not restart that VM until I rebooted the host. Ouch! I really don't want to have to reboot the host after each test. Also I thought AMD's reset issues only concerned the Southern Islands (HD 7000) and Sea Islands (Rx 200) GPUs. But then I did not run into this issue again when testing the Windows 10 guest. So maybe there's still hope.
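The x-vga edit described above can also be applied to the domain XML programmatically. This is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name add_x_vga is hypothetical and the code only transforms XML text: it does not talk to libvirt, and the 'hostdev0' alias is the one from the example above, so it would need adjusting for other VMs.

```python
import xml.etree.ElementTree as ET

# Namespace libvirt uses for QEmu command line passthrough (from the diff above)
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
ET.register_namespace("qemu", QEMU_NS)


def add_x_vga(domain_xml, alias="hostdev0"):
    """Return the domain XML with '-set device.<alias>.x-vga=on' added.

    Hypothetical helper: 'alias' must be the hostdev that is the GPU.
    Calling it again on its own output adds nothing.
    """
    root = ET.fromstring(domain_xml)
    cmdline = root.find(f"{{{QEMU_NS}}}commandline")
    if cmdline is None:
        cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    setting = f"device.{alias}.x-vga=on"
    existing = [arg.get("value") for arg in cmdline]
    if setting not in existing:
        # Each command line word is its own <qemu:arg> element
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": "-set"})
        ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": setting})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_x_vga("<domain type='kvm'><name>example</name></domain>"))
```

The serializer declares the xmlns:qemu namespace on the domain element itself, which is the other half of the manual edit.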
So that's the state of things for now. I don't know when I'll get back to this but the next steps will likely be:
1. Set up a proper Windows VM with PCI-passthrough so it can be added to the TestBot.
2. Add it to my local TestBot instance and see how stable that is.
3. Give the Debian 10 VM another try.
4. Maybe set up a Windows PCI-passthrough VM on the official TestBot. vm3 and vm4 look like they should support it (E3-1226 v3 + C224 PCH). However I don't know if the chassis has room to put a graphics card.
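For the powered-off snapshot requirement discussed earlier in the thread, the sequence the TestBot would need could look roughly like the Python sketch below. It is not the TestBot's actual code: the function names, the timeout and the polling interval are assumptions, and it relies on the guest honoring the shutdown request sent by 'virsh shutdown'. The virsh subcommands used are shutdown, domstate, snapshot-delete and snapshot-create-as.

```python
import subprocess
import time


def run_virsh(*args):
    """Run one virsh command and return its trimmed standard output."""
    result = subprocess.run(["virsh", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def refresh_offline_snapshot(domain, snapshot, virsh=run_virsh,
                             timeout=300, poll=5, sleep=time.sleep):
    """Shut the guest down cleanly, then replace its snapshot.

    'virsh' and 'sleep' are injectable so the logic can be tried
    without a real VM. The timeout and poll values are arbitrary.
    """
    virsh("shutdown", domain)
    waited = 0
    # domstate prints "shut off" once the guest has powered down
    while virsh("domstate", domain) != "shut off":
        if waited >= timeout:
            raise TimeoutError(f"{domain} did not shut down in {timeout}s")
        sleep(poll)
        waited += poll
    # Delete the old snapshot, then take a new one of the powered off VM
    virsh("snapshot-delete", domain, snapshot)
    virsh("snapshot-create-as", domain, snapshot)
```

Calling refresh_offline_snapshot("some-vm", "some-snapshot") with placeholder names would run the whole cycle. Because the snapshot is taken while the VM is off, the graphics card's internal state never needs to be saved.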