TestBot: The new w10pro64 VM
Francois Gouget
fgouget at codeweavers.com
Wed Sep 23 11:43:18 CDT 2020
So I have put w10pro64 into production.
As the name implies this is a 64-bit Windows 10 Professional VM. What
the name does not say is that it runs the latest version of Windows 10:
2004. That means it has more failures than the others... for now.
The goal is to use it to balance the load across two VM hosts. So it
will run the various language tests, always against the latest Windows
10 release, while w1064 will deal with the previous Windows 10 releases
and other configurations such as dual-screen and (hopefully) PCI
passthrough.
Right now w10pro64 also runs the dual-screen tests because it has a
newer QXL driver that should have fewer failures (bug 48926) but that
should change after I update w1064.
For those who are interested, I ran quite a few tests on w10pro64 before
putting it into production to see the impact of the QEmu configuration.
One part of it was to see whether it was possible to reduce the number
of failures by tweaking the configuration. That did not yield any
meaningful result.
The other part was to check various options' impact on performance.
CPU: IvyBridge * 3 cores
------------------------
IvyBridge is the baseline of our current VM hosts (vm1, vm3 and vm4), so
it should be possible to move the VM from one host to another without
changing its configuration (and also without risking upsetting Windows'
license checks).
Most of our tests are single threaded. But in order to root out race
conditions I think all VMs should have at least 2 vcpus. The question
was whether adding more would help.
So I used mpstat at a 5 second interval to trace the CPU usage on the
host while WineTest ran in the VM. I mostly ran the tests with 4 vcpus
(specifically 4 cores to avoid licensing issues). The host has 4 cores.
This showed that even when given 4 cores the VM spends 70% to 80%
(depending on the run) of its time using less than one core, 97% using
less than two cores and only 0.5% using more than 3 cores. So giving it
two or three cores is plenty.
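The per-sample breakdown above can be reproduced with a short pipeline.
This is only a sketch: the log file name and the two mpstat lines are
made-up stand-ins for a real capture taken with `mpstat 5 > cpu.log`
while WineTest runs in the VM.

```shell
# Fake capture standing in for: mpstat 5 > cpu.log
# (one "all"-CPU line per 5 s interval, %idle in the last column).
cat > cpu.log <<'EOF'
12:00:01 AM  all  1.00 0.00  2.00 3.00 0.00 0.00 0.00 0.00 0.00 94.00
12:00:06 AM  all 50.00 0.00 10.00 5.00 0.00 0.00 0.00 0.00 0.00 35.00
EOF

# On a 4-core host, %idle above 75 means the VM used less than one
# core during that sample.
awk '/all/ { total++; if ($NF > 75) low++ }
     END { printf "under one core: %.0f%% of samples\n", 100*low/total }' cpu.log
```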
So what is the CPU doing when not running the VM / tests? The stats show
it waits for I/O only 3% of the time, which is as it should be given the
caching available on the host and the SSD disk. System and user CPU
usage are also pretty low, so most of the time the CPU is just idle.
More specifically, the host is 75% idle (i.e. uses less than 1 core)
more than 50% of the time.
The why is still somewhat of a mystery to me. Idle time can result from
the audio tests (waiting for the buffered sound to play) and network
tests (waiting for network data). There are also a few places where we
wait for some operation to time out but surely not that many? So how can
we eliminate this idle time and speed up the tests?
Memory: 4GB
-----------
A test with 8GB showed that adding memory does not help the tests or
allow them to run faster.
I prefer limiting how much memory the VMs use because I expect it to
result in smaller live snapshots: w10pro64's disk image shot from 14 GB
to 53 GB when I added the 13 live snapshots. That works out to about
3 GB per live snapshot (disk COW + RAM). Interestingly, that is less
than the VM's amount of memory, which means QEmu does not save the
unused memory. But I suspect QEmu still saves Windows' disk cache, so
increasing the memory would result in bigger snapshots.
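As a quick sanity check on those numbers (simple arithmetic, nothing
more):

```python
# Per-snapshot cost implied by the figures above: the image grew
# from 14 GB to 53 GB over 13 live snapshots.
base_gb = 14
with_snapshots_gb = 53
snapshots = 13

per_snapshot_gb = (with_snapshots_gb - base_gb) / snapshots
print(f"~{per_snapshot_gb:.1f} GB per live snapshot (disk COW + RAM)")

# Consistent with QEmu not saving the VM's unused memory:
vm_ram_gb = 4
assert per_snapshot_gb < vm_ram_gb
```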
Clock: HPET
-----------
Initially the guest was using a significant amount of CPU on the host
even when Windows was doing nothing. It turns out this is because by
default libvirt does not add the HPET timer. Adding the following line
fixed this:
<clock offset='localtime'>
  [...]
  <timer name='hpet' present='yes'/>
</clock>
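For reference, a commonly recommended libvirt clock configuration for
Windows guests looks like this. This is only a sketch: the rtc and pit
lines are typical examples, not necessarily what this VM uses; the hpet
line is the fix described above.

```xml
<clock offset='localtime'>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='hpet' present='yes'/>
</clock>
```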
Disk: Virtio SCSI + unmap
-------------------------
The SCSI Virtio driver is the recommended configuration and I manually
set the discard mode to unmap to prevent qcow2 bloat (is that QEmu's
default?).
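In libvirt terms that configuration looks something like the following
sketch; the image path and target device name are made up.

```xml
<controller type='scsi' model='virtio-scsi'/>
<disk type='file' device='disk'>
  <!-- discard='unmap' passes guest TRIM through so the qcow2 file
       shrinks instead of bloating -->
  <driver name='qemu' type='qcow2' discard='unmap'/>
  <source file='/var/lib/libvirt/images/w10pro64.qcow2'/>
  <target dev='sda' bus='scsi'/>
</disk>
```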
Then I tested the disk performance with ATTO.
https://www.atto.com/disk-benchmark/
* In its default configuration ATTO uses a small 128 MB test file. Since
such a small file easily fits in the OS' cache ATTO uses fsync-like
functionality to ensure it tests the disk performance rather than the
memory's.
* But in the default QEmu configuration (writeback mode) caching still
  occurs outside the VM, which fools ATTO and results in read and write
  speeds in the GB/s range on a SATA SSD (see
  w10pro64_scsi+default+unmap.png). But then our tests don't write all
  that much to disk, so this test is quite realistic. All in all this
  means the default configuration should provide more than fast enough
  disk access.
* The results are the same when caching is explicitly set to writeback
  (i.e. it's QEmu's default). (see w10pro64_scsi+writeback+unmap.png)
* I also ran an ATTO test with a bigger file size (see
w10pro64_scsi+default+unmap+4GB.png). We then clearly see writes being
capped by the SSD speed while reads still benefit from the host cache.
This shows that disk performance is still ok even when writing more
data.
* Some sites recommend setting io.mode=native but that forces
  cache.mode=none or directsync. That prevents the host from doing extra
  caching, and then ATTO shows the true underlying disk performance.
  I think that configuration makes sense when one wants to be sure the
  VM's filesystem will remain in a consistent state in case of a host
  crash or power outage. But in such a case we would just revert the VM
  to the last snapshot and continue, so the default configuration
  provides us with better disk performance. (see
  w10pro64_scsi+directsync+native+unmap.png and, for comparison,
  directsync alone: w10pro64_scsi+directsync+unmap.png)
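In libvirt terms, the difference between these configurations comes
down to the disk's driver attributes. A sketch of the two variants
(either line would go in the disk definition):

```xml
<!-- Default / writeback: the host page cache is used, so small
     benchmarks measure memory speed rather than the SSD -->
<driver name='qemu' type='qcow2' cache='writeback' discard='unmap'/>

<!-- Direct I/O: bypasses the host cache and shows the true disk
     performance; io='native' requires cache none or directsync -->
<driver name='qemu' type='qcow2' cache='directsync' io='native' discard='unmap'/>
```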
--
Francois Gouget <fgouget at codeweavers.com>
-------------- next part --------------
Attachments:
* w10pro64.xls.bz2 (application/octet-stream, 159505 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0001.obj>
* w10pro64_scsi+default+unmap.png (image/png, 41120 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0005.png>
* w10pro64_scsi+default+unmap+4GB.png (image/png, 40840 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0006.png>
* w10pro64_scsi+writeback+unmap.png (image/png, 40371 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0007.png>
* w10pro64_scsi+directsync+native+unmap.png (image/png, 40701 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0008.png>
* w10pro64_scsi+directsync+unmap.png (image/png, 40202 bytes):
  <http://www.winehq.org/pipermail/wine-devel/attachments/20200923/56c035dc/attachment-0009.png>