Tuesday, March 21, 2017

Channel I/O: What's in a channel subsystem?

When you start getting familiar with channel I/O and its concepts, one thing you usually notice is a host of very similar-sounding acronyms that are easily confused. The easiest way to get a handle on them is probably to look at a small machine started by QEMU and to examine what a Linux guest sees.

So, let's start with the following command line: 

s390x-softmmu/qemu-system-s390x -machine s390-ccw-virtio,accel=kvm -m 1024 -nographic -drive file=/dev/dasdb,if=none,id=drive-virtio-disk0,format=raw,serial=ccwdasd1,cache=none -device virtio-blk-ccw,devno=fe.0.0042,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1,scsi=off

(Note that this assumes you're running an s390x system and have a bootable system on /dev/dasdb.)

This will start up a machine with two channel devices: one virtio-blk (as specified on the command line) and one virtio-net (always autogenerated unless explicitly turned off).

Let's log into the guest via the console [1] and examine what channel devices Linux sees:
[root@localhost ~]# lscss
Device   Subchan.  DevType CU Type Use  PIM PAM POM  CHPIDs
----------------------------------------------------------------------
0.0.0000 0.0.0000  0000/00 3832/01 yes  80  80  ff   00000000 00000000
0.0.0042 0.0.0001  0000/00 3832/02 yes  80  80  ff   00000000 00000000

Let's go through this information column-by-column.

Device is the identifier for the device, which is unique guest-wide. The xx.y.zzzz format (often called bus id) is specific to Linux (and has leaked over into QEMU) and is made up of the following elements:
  • The channel subsystem id (cssid) xx (here: 0, as on all current Linux systems)
  • The subchannel set id (ssid) y (here: 0, can be any value from 0-3 on current Linux systems)
  • The device number (devno) zzzz (here: 0000 and 0042, respectively; can be any value from 0-0xffff)
The two values in this example have different origins:
  • 0.0.0000 (the virtio-net device) has been autogenerated.
  • 0.0.0042 (the virtio-blk device) has been specified on the command line.
But wait: The value on the command line was fe.0.0042, wasn't it? I will explain this in a later post; just remember for now that you specify the cssid fe for a virtio device on the QEMU command line and it will show up as cssid 0 in the Linux guest.

The devno basically belongs to the device; cssid and ssid indicate the addressing within the channel subsystem, which is why we encounter them again in the next id, Subchan.
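To make the xx.y.zzzz format concrete, here is a minimal Python sketch that splits a bus id into its three elements and checks the ranges given above. The names `BusId`, `parse_bus_id` and `format_bus_id` are made up for this illustration; they are not part of Linux or QEMU.

```python
from typing import NamedTuple


class BusId(NamedTuple):
    """The three elements of a bus id (hypothetical helper type)."""
    cssid: int
    ssid: int
    devno: int


def parse_bus_id(text: str) -> BusId:
    """Split 'xx.y.zzzz' into cssid, ssid and devno (all hexadecimal)."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected xx.y.zzzz, got {text!r}")
    cssid, ssid, devno = (int(part, 16) for part in parts)
    if not 0 <= cssid <= 0xFF:
        raise ValueError(f"cssid out of range: {cssid:#x}")
    if not 0 <= ssid <= 3:
        raise ValueError(f"ssid out of range: {ssid:#x}")
    if not 0 <= devno <= 0xFFFF:
        raise ValueError(f"devno out of range: {devno:#x}")
    return BusId(cssid, ssid, devno)


def format_bus_id(bus_id: BusId) -> str:
    """Render the elements back into the textual form used by lscss."""
    return f"{bus_id.cssid:x}.{bus_id.ssid:x}.{bus_id.devno:04x}"


guest_view = parse_bus_id("0.0.0042")    # as shown by lscss in the guest
qemu_view = parse_bus_id("fe.0.0042")    # as given on the QEMU command line
print(guest_view, qemu_view)
```

Both strings describe the same virtio-blk device from the example; only the cssid differs between the QEMU command line and the guest's view.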

This is the identifier for the subchannel, which is basically the means to actually address the device. It again uses the xx.y.zzzz format and is made up of the following elements:
  • The cssid xx (same as for the device)
  • The ssid y (again, the same as for the device)
  • The subchannel number zzzz (here: 0000 and 0001, respectively; generally not the same as the devno, although it can be any value from 0-0xffff as well)
These values are always autogenerated by QEMU (i.e., you can't specify them on the command line). They basically depend on the order in which devices are initialized (from the initial command line, autogenerated, or via device hotplug); the only restriction is that the cssid and ssid are taken from the device's bus id, if one was specified. The reasoning behind this is that a subchannel is only a means to access the device and as such needs only to be unique, but not pre-defined.

In contrast to the bus id for a device (which is a Linux and QEMU construct), the bus id for a subchannel actually has an equivalent in the architecture: the subchannel-identification word (often referred to as schid in Linux and QEMU), which is basically a 32-bit value composed of the cssid, the ssid, and the subchannel number. The various channel I/O instructions use it to address a device via a particular subchannel.
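As a rough illustration of how these three values fit into 32 bits, the following Python sketch packs and unpacks a schid. The bit positions are an assumption based on my reading of `struct subchannel_id` in the Linux kernel headers (cssid in the top byte, a multiple-CSS flag, a two-bit ssid, a bit that is always one, and the subchannel number in the low halfword); check them against the Principles of Operation before relying on them. The function names are hypothetical.

```python
def pack_schid(cssid: int, ssid: int, sch_no: int, m: int = 0) -> int:
    """Build a 32-bit schid from its elements.

    Assumed layout, most significant bit first:
    cssid (8 bits), reserved (4), m (1), ssid (2), always-one (1), sch_no (16).
    The m flag is left at 0 by default; its exact semantics are not covered here.
    """
    if not (0 <= cssid <= 0xFF and 0 <= ssid <= 3 and 0 <= sch_no <= 0xFFFF):
        raise ValueError("schid element out of range")
    return (cssid << 24) | ((m & 1) << 19) | (ssid << 17) | (1 << 16) | sch_no


def unpack_schid(schid: int) -> tuple[int, int, int]:
    """Return (cssid, ssid, sch_no) for a schid built with the layout above."""
    return (schid >> 24) & 0xFF, (schid >> 17) & 0x3, schid & 0xFFFF


# Subchannel 0.0.0001 from the lscss output above.
print(f"{pack_schid(0, 0, 0x0001):#010x}")
```

Under this layout, the always-one bit is why even subchannel 0.0.0000 does not map to an all-zero word.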

The next two columns, DevType and CU Type, are part of the self-description element of channel devices: the operating system asks the device nicely to identify itself, and the device responds with information about its type and what it can do. The device and the control unit are, in principle, two separate, cascaded entities; for virtio purposes, you can think of the device as the virtio backend (like the virtio-blk device) and of the control unit as the virtio proxy device (like the PCI device used to access virtio devices on other platforms). That's also the reason why the device type is always zero for virtio devices. The control unit type is of the form aaaa/bb and consists of the following elements:
  • The type aaaa (a value from 0-0xffff; 0x3832 denotes a virtio-ccw control unit)
  • The model bb (a value from 0-0xff; for virtio devices, this is the device id as specified by the virtio standard)
In our example, we can therefore see that device 0.0.0000 is a virtio-net device (CU model 1) and device 0.0.0042 is a virtio-blk device (CU model 2).
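The same lookup can be sketched in a few lines of Python. The table below lists only a handful of device ids from the virtio standard, and the short names are chosen to match QEMU's device naming; `describe_cu` is a hypothetical helper, not an existing tool.

```python
VIRTIO_CCW_CU_TYPE = 0x3832

# A few device ids from the virtio standard (deliberately incomplete).
VIRTIO_DEVICE_NAMES = {
    1: "net",
    2: "blk",
    3: "console",
    4: "rng",
    5: "balloon",
    8: "scsi",
    9: "9p",
}


def describe_cu(cu_field: str) -> str:
    """Interpret the 'CU Type' column of lscss, e.g. '3832/02'."""
    type_str, model_str = cu_field.split("/")
    cu_type = int(type_str, 16)
    model = int(model_str, 16)
    if cu_type != VIRTIO_CCW_CU_TYPE:
        return f"non-virtio control unit {cu_type:04x}/{model:02x}"
    name = VIRTIO_DEVICE_NAMES.get(model)
    if name is None:
        return f"virtio device with unlisted id {model}"
    return f"virtio-{name}"


for cu in ("3832/01", "3832/02"):
    print(cu, "->", describe_cu(cu))
```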

The next column, Use, points to a big difference from other I/O architectures: In order to be able to use a subchannel to talk to a device, the operating system first needs to enable it. For virtio devices, this is done by the Linux driver by default (see the 'yes' for all devices); for other device types, this needs to be triggered by Linux user space (which implies that you can't simply go ahead and use a device; you always need to do some kind of setup first).
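For a rough idea of what that user-space setup can look like: on Linux, channel devices show up under the ccw bus in sysfs, and setting a device online is commonly done by writing to its `online` attribute (the `chccwdev` tool from s390-tools wraps this). The Python sketch below assumes that sysfs layout; the helper names and the `sysfs_root` parameter are made up for illustration, and the write needs root privileges on a real system.

```python
from pathlib import Path


def online_attribute(bus_id: str, sysfs_root: str = "/sys") -> Path:
    """Path of the 'online' attribute for a ccw device (assumed sysfs layout)."""
    return Path(sysfs_root) / "bus" / "ccw" / "devices" / bus_id / "online"


def is_online(bus_id: str, sysfs_root: str = "/sys") -> bool:
    """Return True if the attribute currently reads '1'."""
    return online_attribute(bus_id, sysfs_root).read_text().strip() == "1"


def set_online(bus_id: str, online: bool, sysfs_root: str = "/sys") -> None:
    """Ask the kernel to set the device online (True) or offline (False)."""
    online_attribute(bus_id, sysfs_root).write_text("1" if online else "0")
```

Something like `set_online("0.0.0042", True)` would correspond to what the virtio driver already does on its own for the devices in this example.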

The last four columns, PIM, PAM, POM and CHPIDs, deal with channel paths: a topic that is completely irrelevant for QEMU guests, but very interesting on real hardware. Just a quick overview:
  • PIM (path installed mask), PAM (path available mask) and POM (path operational mask) are all 8-bit values in which each bit corresponds to one of eight channel paths. If the bit for a channel path is set in all three masks, that channel path can be used for I/O.
  • CHPIDs are channel path identifiers: Each channel path has an id from 0-0xff, which is unique combined with the relevant cssid. For virtio devices, there's only one valid channel path with the id 0 [2].
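The mask logic is simple enough to sketch in Python. This assumes the usual s390 bit numbering, where the leftmost (most significant) bit of each mask belongs to the first of the eight paths, and that the CHPIDs column of lscss lists the eight chpids in that same order as two hex digits each. The function names are hypothetical.

```python
def usable_paths(pim: int, pam: int, pom: int) -> list[int]:
    """Return the positions (0-7) of paths whose bit is set in all three masks."""
    combined = pim & pam & pom
    # Position 0 is the most significant bit (0x80), position 7 the least.
    return [pos for pos in range(8) if combined & (0x80 >> pos)]


def usable_chpids(pim: int, pam: int, pom: int, chpids_field: str) -> list[int]:
    """Map the usable path positions to the chpids from the lscss CHPIDs column."""
    digits = chpids_field.replace(" ", "")
    chpids = [int(digits[i:i + 2], 16) for i in range(0, 16, 2)]
    return [chpids[pos] for pos in usable_paths(pim, pam, pom)]


# Values from the lscss output above: PIM=80, PAM=80, POM=ff.
print(usable_paths(0x80, 0x80, 0xFF))
print(usable_chpids(0x80, 0x80, 0xFF, "00000000 00000000"))
```

For the example guest, only the first path position qualifies, and its chpid is 0.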
Channel paths on real hardware correspond (simply put) to the connections between the actual mainframe and, e.g., the storage server containing the disk devices. The setup is usually redundant, and load balancing and failover are possible between the paths. The channel paths are not per-device; usually, a set of devices shares a set of channel paths. For a virtual setup like a QEMU guest with only virtio devices, there is no real equivalent for this. Therefore, there's only a virtual channel path which does nothing but satisfy the architecture. This means that the output of the following command is not very interesting for our example guest:
[root@localhost ~]# lschp
CHPID  Vary  Cfg.  Type  Cmg  Shared  PCHID
============================================
0.00   1     -     32    -    -       -   

  • CHPID is the channel-path identifier of the form xx.nn, where xx is the cssid and nn the chpid. This is always 0.00 on virtio-only guests.
  • Vary means that the channel path is online to the guest. You don't want to change this for the only path.
  • Type is the channel-path type. 0x32 is a reserved type for the virtio virtual channel path.
All of this does not explain how Linux actually talks to those devices (and how QEMU emulates this). I'll get to that in a future post.
1. A VT220 compatible console via SCLP is automatically generated.
2. Which, in hindsight, turned out to not be the cleverest choice - see the confusing output of lscss.