Virtual device role tagging, better explained

Now that Nova’s device role tagging feature, discussed in a previous blog post, is getting some real-world usage, I’m starting to realise that it’s woefully under-documented and that folks have some misconceptions about what it is and how to use it.

Let’s start with an example. You boot a VM with 4 network interfaces, each for a different purpose, each connected to a different virtual network:

nova boot --nic <public data nic spec> \
          --nic <private data nic spec> \
          --nic <management data nic spec> \
          --nic <Skynet uplink nic spec>
[...]

You SSH to your new VM, run ifconfig, and see:

eth0: flags=4163 mtu 1500
  ether 00:00:00:00:00:af txqueuelen 1000 (Ethernet)

eth1: flags=4163 mtu 1500
  inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
  ether 00:00:00:00:00:01 txqueuelen 1000 (Ethernet)

eth2: flags=4163 mtu 1500
  ether 00:00:00:00:00:3d txqueuelen 1000 (Ethernet)

eth3: flags=4163 mtu 1500
  ether 00:00:00:00:00:5e txqueuelen 1000 (Ethernet)

Great, your 4 network interfaces are there. The second one has an IP address. Therefore, eth1 must be your management interface because that’s the only network you have DHCP running on. However, you can’t tell eth0, eth2, and eth3 apart and you don’t know which one is for public data, which one is for private data, and which one is for the robot apocalypse.

That’s an important point to reiterate: if you have a VM with multiple network interfaces, the order in which they appear in the guest OS does not necessarily reflect the order in which they were given in the server boot request. In our example, the management interface was given second to last, but ends up as the second interface in the guest OS, eth1.

So we’re back to our problem: how do we tell eth0, eth2 and eth3 apart – or more generally, how does the guest OS know which network interface is which if DHCP is not enabled for some of them? This is solved with device role tags.

Let’s go back to our example, and boot the same VM with device role tags applied to the network interfaces:

nova boot --nic <public data nic spec>,tag=public \
          --nic <private data nic spec>,tag=pvt \
          --nic <management nic spec>,tag=mgmt \
          --nic <Skynet uplink nic spec>
[...]

We haven’t tagged the Skynet uplink interface; this will be important a bit later.

Booting the VM with tags on the network interfaces lets us know which network interface is which, because Nova transmits the tags to the guest operating system. It does so in two ways.

The first way is for the guest to query Nova’s metadata API. Let’s SSH into our example VM and query the metadata API with curl:

$ curl http://169.254.169.254/openstack/latest/meta_data.json

This will give us a big JSON document. We’re looking for the following section:

"devices": [
  {
    "type": "nic",
    "bus": "pci",
    "address": "00:01.0",
    "mac": "00:00:00:00:00:5e",
    "tags": ["public"]
  },
  {
    "type": "nic",
    "bus": "pci",
    "address": "00:02.0",
    "mac": "00:00:00:00:00:01",
    "tags": ["mgmt"]
  },
  {
    "type": "nic",
    "bus": "pci",
    "address": "00:03.0",
    "mac": "00:00:00:00:00:af",
    "tags": ["pvt"]
  }
]

Each element in the devices array corresponds to one of the network interfaces that we’ve tagged when we booted the VM. Each device element contains our tag, but also other information about the device, such as PCI and MAC addresses. Using those, we can cross-reference with the output of ifconfig and figure out which network interface is which. For instance, we know that the network interface tagged with public has the 00:00:00:00:00:5e MAC address. Therefore, eth3 is the public data interface. Similarly, the interface tagged with pvt has the 00:00:00:00:00:af MAC address. Therefore, eth0 is the private data interface.
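This cross-referencing is easy to automate in the guest. Here is a minimal sketch that maps each tag to an interface name by MAC address; the `tag_to_iface` helper is hypothetical (not part of any OpenStack tooling), and the MACs come from the example ifconfig output above (on a real Linux guest, they could be read from /sys/class/net/<iface>/address):

```python
import json

# The example metadata document returned by the metadata API above.
META = json.loads("""
{
  "devices": [
    {"type": "nic", "bus": "pci", "address": "00:01.0",
     "mac": "00:00:00:00:00:5e", "tags": ["public"]},
    {"type": "nic", "bus": "pci", "address": "00:02.0",
     "mac": "00:00:00:00:00:01", "tags": ["mgmt"]},
    {"type": "nic", "bus": "pci", "address": "00:03.0",
     "mac": "00:00:00:00:00:af", "tags": ["pvt"]}
  ]
}
""")

def tag_to_iface(metadata, mac_by_iface):
    """Map each device tag to a guest interface name by MAC address.

    mac_by_iface maps interface names (e.g. "eth0") to their MAC, as
    seen in ifconfig or under /sys/class/net on a Linux guest.
    """
    iface_by_mac = {mac.lower(): name for name, mac in mac_by_iface.items()}
    mapping = {}
    for dev in metadata.get("devices", []):
        if dev.get("type") != "nic":
            continue
        for tag in dev.get("tags", []):
            mapping[tag] = iface_by_mac.get(dev["mac"].lower())
    return mapping

# MACs as shown in the example ifconfig output.
guest_macs = {
    "eth0": "00:00:00:00:00:af",
    "eth1": "00:00:00:00:00:01",
    "eth2": "00:00:00:00:00:3d",
    "eth3": "00:00:00:00:00:5e",
}
print(tag_to_iface(META, guest_macs))
# {'public': 'eth3', 'mgmt': 'eth1', 'pvt': 'eth0'}
```

Note that eth2, the untagged Skynet uplink, doesn’t appear in the result, for reasons explained next.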

You’ve noticed by now that there are only 3 devices in the metadata we’ve received from the metadata API. Remember how we didn’t tag the Skynet uplink interface? Devices that aren’t tagged don’t appear in the metadata. The array is called devices but it should have been more accurately called tagged_devices. It makes sense to only include tagged devices since every other piece of information is already known to the guest OS. Let’s pretend we include the Skynet uplink interface in the devices array:

{
  "type": "nic",
  "bus": "pci",
  "address": "00:04.0",
  "mac": "00:00:00:00:00:3d"
}

There is nothing here that we can’t already find out with lspci or ifconfig, and it doesn’t help us in any way.

If the metadata API is not available, the config drive can be used. The JSON document returned by our previous curl command can also be found at openstack/latest/meta_data.json on the config drive.
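A guest script can try both sources. The following sketch prefers the metadata API and falls back to the config drive; the `load_metadata` function is hypothetical, the API URL is the standard one, and the config drive mount point (/mnt/config) is an assumption that varies by setup:

```python
import json
import os
import urllib.request

METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"
# Assumed mount point for the config drive (commonly the filesystem
# labeled "config-2"); adjust for your environment.
CONFIG_DRIVE_JSON = "/mnt/config/openstack/latest/meta_data.json"

def load_metadata(url=METADATA_URL, drive_json=CONFIG_DRIVE_JSON):
    """Fetch instance metadata, preferring the API over the config drive."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return json.load(resp)
    except OSError:
        pass  # Metadata API unreachable: fall back to the config drive.
    if os.path.exists(drive_json):
        with open(drive_json) as f:
            return json.load(f)
    raise RuntimeError("no metadata source available")
```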

Volumes (the --block-device parameter to the nova boot command) can be tagged in much the same way as network interfaces. The device tagging metadata is slightly different from that of network interfaces, but the idea is the same: show the guest OS the device tag, along with other information it can use to figure out which disk the tag applies to. For example, we can boot a VM with two volumes:

nova boot --block-device <catpix volume>,tag=important \
          --block-device <database or whatever>,tag=db \
[...]

Our devices sections would then look something like this:

"devices": [
  {
    "type": "disk",
    "bus": "ide",
    "address": "0:1",
    "tags": ["db"]
  },
  {
    "type": "disk",
    "bus": "ide",
    "address": "1:0",
    "tags": ["important"]
  }
]
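A guest script can index the tagged disks the same way as it did the network interfaces. This sketch (the `disk_tags` helper is hypothetical) collects the bus and address for each tag; resolving that pair to an actual device node, for instance through /dev/disk/by-path on Linux, is left out since it depends on the distro and bus type:

```python
# The example "devices" metadata from above.
DEVICES = [
    {"type": "disk", "bus": "ide", "address": "0:1", "tags": ["db"]},
    {"type": "disk", "bus": "ide", "address": "1:0", "tags": ["important"]},
]

def disk_tags(devices):
    """Index tagged disks by tag, keeping the bus/address pair needed
    to locate the device node (e.g. via /dev/disk/by-path on Linux)."""
    return {tag: (dev["bus"], dev["address"])
            for dev in devices if dev.get("type") == "disk"
            for tag in dev.get("tags", [])}

print(disk_tags(DEVICES))
# {'db': ('ide', '0:1'), 'important': ('ide', '1:0')}
```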

I hope this makes virtual device role tagging clearer. I’m hoping to merge tagged device attachment in Pike. If that happens, a blog post explaining how it works will follow.

Device tagging, new in Newton

Device tagging in OpenStack Nova is a mechanism to communicate to the guest OS the intended usage of virtual network interfaces and disks. For example, if an instance has two virtual network interfaces, one connected to a public network and the other to a private management network, the interfaces can be tagged with ‘pub’ and ‘pvt’ respectively. An application in the guest OS can fetch the tags and provision each interface accordingly.

With python-novaclient, in order to boot an instance with tagged devices, use the tag key in the --nic and --block-device arguments to the nova boot command. For example:

$ nova boot --flavor 1 --image cirros \
--nic net-id=149f5beb-2be5-4be6-9f0e-5290049945ab,tag=pub \
--nic net-id=bda05031-7651-41bd-825c-29ba89a4f08b,tag=pvt \
--block-device \
  id=e44d2fee-f3df-4a2c-9e2b-e016855b5522,\
  source=volume,dest=volume,bootindex=1,tag=db \
--block-device \
  id=6dab970b-3777-4ace-84ee-82829041084e,\
  source=volume,dest=volume,bootindex=2,tag=cache \
device-tagging-guest

You can also take a look at the full REST API reference.

As a reminder, an instance can access metadata in two ways. The first is to curl http://169.254.169.254/openstack/latest/meta_data.json. The second is to look on the config drive under openstack/latest/meta_data.json (if the config drive is enabled). In the above example, a devices section will appear in the metadata, looking something like this:

"devices": [
  {
    "type": "nic",
    "bus": "pci",
    "address": "00:01.0",
    "mac": "d5:d9:b3:0d:c8:b0",
    "tags": ["pub"]
  },
  {
    "type": "nic",
    "bus": "pci",
    "address": "00:02.0",
    "mac": "df:8b:d6:58:9b:b1",
    "tags": ["pvt"]
  },
  {
    "type": "disk",
    "bus": "ide",
    "address": "0:1",
    "tags": ["db"]
  },
  {
    "type": "disk",
    "bus": "ide",
    "address": "1:0",
    "tags": ["cache"]
  }
]

In the Ocata cycle we plan on implementing attaching tagged interfaces and volumes.

If you’re interested in diving deeper, you can take a look at the full spec as it was approved in Newton, and if you’re really crazy you can search for the review topic in Gerrit.