Wednesday, July 29, 2020

Configuring mediated devices (Part 2)

In the last part of this article, I talked about configuring a mediated device directly via sysfs. This is a bit cumbersome, and you may want to make your configuration more permanent. Fortunately, there is tooling available for this.

driverctl: bind to the correct driver

driverctl is a tool to manage the driver that a device may bind to. As a device that is supposed to be used via vfio will need to be bound to a vfio driver instead of its 'normal' driver, it makes sense to add some configuration that makes sure that this binding is actually done automatically. While driverctl had originally been implemented to work with PCI devices, the css bus (for subchannel devices) supports management with driverctl as of Linux 5.3 as well. (The ap bus for crypto devices does not support setting driver overrides, as it implements a different mechanism.)

Example (vfio-ccw)

Let's reuse the example from the last post, where we wanted to assign the device behind subchannel 0.0.0313 to the guest. In order to set a driver override, use

[root@host ~]# driverctl -b css set-override 0.0.0313 vfio_ccw

If the subchannel is not currently bound to the vfio-ccw driver already, it will be unbound from its driver and bound to vfio_ccw. Moreover, a udev rule to bind the subchannel to vfio_ccw automatically in the future will be added.

Unfortunately, a word of caution regarding the udev rule is in order: As uevents on the css bus for I/O subchannels are delayed until after device recognition has been performed, automatic binding may not work out as desired. We plan to address that in the future by reworking the way the css bus handles uevents; until then, you may have to trigger a rebind manually. Also, keep in mind that the subchannel id for a device may not be stable (as mentioned previously); automation should be used cautiously in that case.

mdevctl: manage mediated devices

The more tedious part of configuring a passthrough setup is configuring and managing mediated devices. To help with that, mdevctl has been written. It can create, modify, and remove mediated devices (and optionally make those changes persistent), work with configurations and devices created via other means, and list mediated devices and the different types that are supported.

Creating a mediated device

In order to create a mediated device, you need a uuid. You can either provide your own (as in the manual case), or let mdevctl pick one for you. In order to get the same configuration as in the manual configuration examples, let's create a vfio-ccw device with the same uuid as before.

The following command defines the same mediated device as in the manual example:
  
 [root@host ~]# mdevctl define -u 7e270a25-e163-4922-af60-757fc8ed48c6 -p 0.0.0313 -t vfio_ccw-io -a

Note the '-a', which instructs mdevctl to start the device automatically from now on.

After you've created the device, you can check which devices mdevctl is now aware of:

  [root@host ~] # mdevctl list -d
 7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io

Note that the '-d' instructs mdevctl to show defined, but not started devices.

Let's start the device:

  [root@host ~] # mdevctl start -u 7e270a25-e163-4922-af60-757fc8ed48c6
 [root@host ~] # mdevctl list -d
  7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io auto (active)

The mediated device is now ready to be used and can be passed to a guest.

Making your configuration persistent

If you already created a mediated device manually, you may want to reuse the existing configuration and make it persistent, instead of starting from scratch.

So, let's create another vfio-ccw the manual way:

 [root@host ~] # uuidgen
  b29e4ca9-5cdb-4ee1-a01b-79085b9ab237
 [root@host ~] # echo "b29e4ca9-5cdb-4ee1-a01b-79085b9ab237" > /sys/bus/css/drivers/vfio_ccw/0.0.0314/mdev_supported_types/vfio_ccw-io/create

mdevctl now actually knows about the active device (in addition to the device we configured before):

  [root@host ~] # mdevctl list
  b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 0.0.0314 vfio_ccw-io
  7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io (defined)

But it obviously does not have a definition for the manually created device:

  [root@host ~] # mdevctl list -d
  7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io auto (active)

On a restart, the new device would be gone again; but we can make it persistent:

  [root@host ~] # mdevctl define -u b29e4ca9-5cdb-4ee1-a01b-79085b9ab237
  [root@host ~ ] mdevctl list
  b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 0.0.0314 vfio_ccw-io (defined)
  7e270a25-e163-4922-af60-757fc8ed48c6 0.0.0313 vfio_ccw-io (defined)

If you check under /etc/mdevctl.d/, you will find that an appropriate JSON file has been created:

  [root@host ~] # cat /etc/mdevctl.d/0.0.0314/b29e4ca9-5cdb-4ee1-a01b-79085b9ab237 
  {
    "mdev_type": "vfio_ccw-io",
    "start": "manual",
    "attrs": []
  }

(Note that this device is not automatically started by default.)

Modifying an existing device

There are good reasons to modify an existing device: you may want to modify your setup, or, in the case of vfio-ap, you need to modify some attributes before being able to use the device in the first place.

Let's first create the device. This command creates the same device as created manually in the last post:

  [root@host ~] # mdevctl define -u "669d9b23-fe1b-4ecb-be08-a2fabca99b71" --parent matrix --type vfio_ap-passthrough
 [root@host ~] # mdevctl list -d
  669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual

This device is not yet very useful, as you still need to assign some queues to it. It now looks like this:

  [root@host ~]  # mdevctl list -d -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --dumpjson
  {
    "mdev_type": "vfio_ap-passthrough",
    "start": "manual"
  }

Let's modify the device and add some queues:

  [root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_adapter --value=5
 [root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=4
 [root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=0xab

The device's JSON now looks like this:

  [root@host ~] # mdevctl list -d -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --dumpjson
{
  "mdev_type": "vfio_ap-passthrough",
  "start": "manual",
  "attrs": [
    {
      "assign_adapter": "5"
    },
    {
      "assign_domain": "4"
    },
    {
      "assign_domain": "0xab"
    }
  ]
}

This is now exactly what we had defined manually in the last post.

But what if you notice that you want domain 0x42 instead of domain 4? Just modify the definition. To make it easier to figure out how to specify the attribute to manipulate, use this output:

  [root@host ~] # devctl list -dv -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71
669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual
  Attrs:
    @{0}: {"assign_adapter":"5"}
    @{1}: {"assign_domain":"4"}
    @{2}: {"assign_domain":"0xab"}

You want to remove attribute 1, and add a new value:

  [root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --delattr --index=1
  [root@host ~] # mdevctl modify -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71 --addattr=assign_domain --value=0x42

Let's check that it now looks as desired:

  [root@host ~] # mdevctl list -dv -u 669d9b23-fe1b-4ecb-be08-a2fabca99b71
669d9b23-fe1b-4ecb-be08-a2fabca99b71 matrix vfio_ap-passthrough manual
  Attrs:
    @{0}: {"assign_adapter":"5"}
    @{1}: {"assign_domain":"0xab"}
    @{2}: {"assign_domain":"0x42"}

Future development

While mdevctl works perfectly fine for managing individual mediated devices, it does not maintain a view of the complete system. This means you notice conflicts between two devices only when you try to activate the second one. In the case of vfio-ap, the rules to be considered are complex, and there is quite some potential for conflict. In order to be able to catch that kind of problem early, we plan to add callouts to mdevctl, which would e.g. allow to invoke a tool for validation when a new device is added, but before it is activated. This is potentially useful for other device types as well.