Friday, December 20, 2019

A 2019 recap (and a bit of an outlook)

The holiday season for 2019 will soon be upon us, so I decided to do a quick recap of what I consider the highlights for this year, from my perspective, regarding s390x virtualization, and the wider ecosystem.

Conferences

I attended the following conferences last year.

Linux Plumbers Conference

LPC 2019 was held in Lisbon, Portugal, on September 9-11. Of particular interest for me was the VFIO/IOMMU/PCI microconference. I talked a bit about cross-architecture considerations (and learned about some quirks on other architectures as well); the rest of the topics, while not currently concerning my work directly, were nevertheless helpful to move things forward. As usual on conferences, the hallway track is probably the most important one; met some new folks, saw others once again, and talked more about s390 I/O stuff than I anticipated. I can recommend this conference for meeting people to talk to about (not only) deeply technical things.

KVM Forum

KVM Forum 2019 was held in Lyon, France, on October 30 - November 1. As usual, a great place to meat people and have discussions in the hallway, e.g. about vfio migration. No talk from me this year, but an assortment of interesting topics presented by others; I contributed to an article on LWN.net (https://lwn.net/Articles/805097/). Of note from an s390x perspective were the talks about protected virtualization and nested testing mentioned in the article, and also the presentation on running kvm unit tests beyond KVM.

s390x changes in QEMU and elsewhere

There's a new machine (z15) on the horizon, but support for older things has been introduced or enhanced as well.

Emulation

Lots of work has gone into tcg to emulate the vector instructions introduced with z13. Distributions are slowly switching to compiling for z13, which means gcc is generating vector instructions. As of QEMU 4.2, you should be able to boot recent distributions under tcg once again.

vfio

vfio-ccw now has gained support for sending HALT SUBCHANNEL and CLEAR SUBCHANNEL to the real device; this is useful e.g. for error handling, when you want to make sure an operation is really terminated at the device. Also, it is now possible to boot from a DASD attached via vfio-ccw.

vfio-ap has seen some improvements as well, including support for hotplugging the matrix device and for interrupts.

Guest side

A big change on the guest side of things was support for protected virtualization (also see the talk given at KVM Forum). This is a bit like AMD's SEV, but (of course) different. Current Linux kernels should be ready to run as a protected guest; host side support is still in progress (see below).

Other developments of interest

mdev, mdev everywhere

There has been a lot of activity around mediated devices this year. They have been successfully used in various places in the past (GPUs, vfio-ccw, vfio-ap, ...). A new development is trying to push parts of it into userspace ('muser', see the talk at KVM Forum). An attempt was made to make use of the mediating part without the vfio part, but that was met with resistance. Ideas for aggregation are still being explored.

In order to manage and persist mdev devices, we introduced the mdevctl tool, which is currently included in at least Fedora and Debian.

vfio migration

Efforts to introduce generic migration support for vfio (or at least in the first shot, for pci) are still ongoing. Current concerns mostly cycle around dirty page tracking. It might make sense to take a stab at wiring up vfio-ccw once the interface is stable.

What's up next?

While there probably will be some not-yet-expected developments next year, some things are bound to come around in 2020.

Protected virtualization

Patch sets for KVM and QEMU to support protected virtualization on s390 have already been posted this year; expect new versions of the patch sets to show up in 2020 (and hopefully make their way into the kernel respectively QEMU).

vfio-ccw

Patches to support detecting path status changes and relaying them to the guest have already been posted; expect an updated version to make its way into the kernel and QEMU in 2020. Also likely: further cleanups and bugfixes, and probably some kind of testing support, e.g. via kvm unit tests. Migration support also might be on that list.

virtio-fs support on s390x

Instead of using virtio-9p, virtio-fs is a much better way to share files between host and guest; expect support on s390x to become available once sharing of fds in QEMU without numa becomes possible. Shared memory regions on s390x (for DAX support) still need discussion, however.

Wednesday, November 6, 2019

s390x changes in QEMU 4.2

You know the drill: QEMU is entering freeze (this time for 4.2), and there's a post on the s390x changes for the upcoming release.

TCG

  • Emulation for  IEP (Instruction Execution Protection), a z14 feature, has been added.
  • A bunch of fixes in the vector instruction emulation and in the fault-handling code.

KVM

  • For quite some time now, the code has been implicitly relying on the presence of the 'flic' (floating interrupt controller) KVM device (which had been added in Linux 3.15). Nobody really complained, so we won't try to fix this up and instead make the dependency explicit.
  • The KVM memslot handling was reworked to be actually sane. Unfortunately, this breaks migration of huge KVM guests with more than 8TB of memory from older QEMUs. Migration of guests with less than 8TB continues to work, and there's no planned breakage of migration of >8TB guests starting with 4.2.

CPU models

  • We now know that the gen15a is called 'z15', so reflect this in the cpu model description.
  • The 'qemu' and the 'max' models gained some more features.
  • Under KVM, 'query-machines' will now return the correct default cpu model ('host-s390x-cpu').

Misc

  • The usual array of bugfixes, including in SCLP handling and in the s390-ccw bios.

Wednesday, July 10, 2019

s390x changes in QEMU 4.1

QEMU has just entered hard freeze for 4.1, so the time is here again to summarize the s390x changes for that release.

TCG

  • All instructions that have been introduced with the "Vector Facility" in the z13 machines are now emulated by QEMU. In particular, this allows Linux distributions built for z13 or later to be run under TCG (vector instructions are generated when we compile for z13; other z13 facilities are optional.)

CPU Models

  • As the needed prerequisites in TCG now have been implemented, the "qemu" cpu model now includes the "Vector Facility" and has been bumped to a stripped-down z13.
  • Models for the upcoming gen15 machines (the official name is not yet known) and some new facilities have been added.
  • If the host kernel supports it, we now indicate the AP Queue Interruption facility. This is used by vfio-ap and allows to provide interrupts for AP to the guest.

I/O Devices

  • vfio-ccw has gained support for relaying HALT SUBCHANNEL and CLEAR SUBCHANNEL requests from the guest to the device, if the host kernel vfio-ccw driver supports it. Otherwise, these instructions continue to be emulated by QEMU, as before.
  • The bios now supports IPLing (booting) from DASD attached via vfio-ccw.

Booting

  • The bios tolerates signatures written by zipl, if present; but it does not actually handle them. See the 'secure' option for zipl introduced in s390-tools 2.9.0.
And the usual fixes and cleanups.

Tuesday, March 12, 2019

s390x changes in QEMU 4.0

QEMU is now entering softfreeze for the 4.0 release (expected in April), so here is the usual summary of s390x changes in that release.

CPU Models

  • A cpu model for the z14 GA 2 has been added. Currently, no new features have been added.
  • The cpu model for z14 now does, however, include the multiple epoch and PTFF enhancement features per default.
  • The 'qemu' cpu model now includes the zPCI feature per default. No more prerequisites are needed for pci support (see below).

Devices


  • QEMU for s390x is now always built with pci support. If we want to provide backwards compatibility,  we cannot simply disable pci (we need the s390 pci host bus); it is easier to simply make pci mandatory. Note that disabling pci was never supported by the normal build system anyway.
  • zPCI devices have gained support for instruction counters (on a Linux guest, these are exposed through /sys/kernel/debug/pci/<function>/statistics).
  • zPCI devices always lacked support for migrating their s390-specific state (not implemented...); if you tried to migrate a guest with a virtio-pci device on s390x, odd things might happen. To avoid surprises, the 'zpci' devices are now explicitly marked as unmigratable. (Support for migration will likely be added in the future.)
  • Hot(un)plug of the vfio-ap matrix device is now supported.
  • Adding a vfio-ap matrix device no longer inhibits usage of a memory ballooner: Memory usage by vfio-ap does not clash with the concept of a memory balloon.

TCG

  • Support for the floating-point extension facility has been added.
  • The first part of support for z13 vector instructions has been added (vector support instructions). Expect support for the remaining vector instructions in the next release; it should support enough of the instructions introduced with z13 to be able to run a distribution built for that cpu.