Amador Pahim

Virtualization

oVirt: Power Management for Nested Hypervisors

Introduction

A helpful resource for people out there testing and/or developing oVirt is “Nested KVM”. You can enable the base system to expose the virtualization extension to its VMs; the VM will then have the “vmx” or “svm” extension available, which makes it eligible to act as an oVirt Hypervisor. Running Hypervisors in VMs is not intended for production (I guess). To do so, you have to:

Check if nested KVM is enabled (should be Y):

[root@fedora ~]# cat /sys/module/kvm_intel/parameters/nested
N

As per the above result, it’s disabled. To enable it:

[root@fedora ~]# echo "options kvm-intel nested=1" > /etc/modprobe.d/kvm-intel.conf

And reboot your system. For AMD, it’s enabled by default (should be 1):

[root@fedora ~]# cat /sys/module/kvm_amd/parameters/nested
1
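
By the way, on Intel the new option can also be applied without a reboot by reloading the module. A minimal sketch, assuming no VMs are currently running (modprobe -r refuses to unload kvm_intel while it’s in use):

[root@fedora ~]# modprobe -r kvm_intel
[root@fedora ~]# modprobe kvm_intel
[root@fedora ~]# cat /sys/module/kvm_intel/parameters/nested
Y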

Focusing on Intel from now on (you can adapt the instructions below for AMD), the VMs should now be started with the “vmx” cpu flag. Notice the “+vmx” in the line below:

[root@fedora ~]# qemu-system-x86_64 -smp 2 -m 2048 \
-cpu Haswell-noTSX,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,\
+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,\
+pbe,+tm,+ht,+ss,+acpi,+ds,+vme \
/var/lib/libvirt/images/ovirt-hyper.qcow2 \
--enable-kvm \
-device virtio-net-pci,netdev=net0,mac=52:54:00:47:74:ee \
-netdev tap,id=net0,script=/usr/local/share/qemu/qemu-ifup.sh
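
Once the guest is up, you can confirm from inside it that the flag was exposed; something like this (the guest hostname here is illustrative):

[root@ovirt-hyper ~]# grep -c vmx /proc/cpuinfo
2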

In the command above, “qemu-ifup.sh” is an executable script containing:

#!/bin/sh
set -x

switch=br0

if [ -n "$1" ]
  then
    # Create the tap device owned by the current user and bring it up
    /usr/bin/sudo /usr/sbin/tunctl -u `whoami` -t $1
    /usr/bin/sudo /sbin/ip link set $1 up
    sleep 0.5s
    # Attach the tap device to the bridge
    /usr/bin/sudo /usr/sbin/brctl addif $switch $1
    exit 0
  else
    echo "Error: no interface specified"
    exit 1
fi
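
Don’t forget to make the script executable:

[root@fedora ~]# chmod +x /usr/local/share/qemu/qemu-ifup.sh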

br0 is the bridge that eth0 on my base system (Fedora 22) is attached to. So, after installing the OS on this VM (CentOS 7.1 in my case), it can be added to the oVirt Administration Portal as a Hypervisor. I have two of them, resulting in the image below:

[Image 01: the Hosts tab in the oVirt Administration Portal, showing the two nested hypervisors]

Notice in the image above that the hardware information confirms it is a qemu VM. But the focus of this article is the fact that Power Management is not enabled for those Hypervisors, which is illustrated by the orange exclamation mark beside the green arrow.

As is well known, Power Management devices are present on physical servers, like DRAC on Dell machines and iLO on HP machines, or on other systems, like Cisco UCS blades. oVirt supports a large number of Power Management devices:

[Image 02: the list of Power Management device types supported by oVirt]

Power Management of hypervisors is a key feature for oVirt, as it is used to ensure a Hypervisor is really down during a High Availability operation: a VM that was running on a Hypervisor that loses contact with the oVirt Engine will be restarted on another available Hypervisor, but only after oVirt makes sure the original Hypervisor is down, which it does through the Power Management device.

But in our test environment, using nested KVM, Power Management was not an option, since qemu did not have a Power Management device, keeping Power Management and High Availability out of our test scope. This is now about to change.

Corey Minyard, a member of the qemu community, sent a series of patches introducing an ipmi device to qemu: http://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg01990.html. While the code is still under review and not yet merged into the qemu repository, we can already benefit from it by using Corey’s GitHub repository, which contains the bits.

Compiling qemu with support for ipmi device

On the base system (Fedora 22 in my case), clone the repository:

[root@fedora ~]# git clone https://github.com/cminyard/qemu.git qemu-ipmi

Enter the directory, check out the branch containing the ipmi code and compile qemu. You will probably need to install zlib-devel, glib2-devel, pixman-devel and the “Development Tools” group first.
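
On Fedora 22, installing those build dependencies would be something like:

[root@fedora ~]# dnf install zlib-devel glib2-devel pixman-devel
[root@fedora ~]# dnf groupinstall "Development Tools"

Then build and install: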

[root@fedora ~]# cd qemu-ipmi/
[root@fedora qemu-ipmi]# git checkout for-review-2015-06-08
[root@fedora qemu-ipmi]# ./configure --prefix=/usr/local/share/qemu --target-list=x86_64-softmmu
[root@fedora qemu-ipmi]# make
[root@fedora qemu-ipmi]# make install
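
A quick sanity check that the new ipmi devices are available in the freshly built binary:

[root@fedora ~]# /usr/local/share/qemu/bin/qemu-system-x86_64 -device help 2>&1 | grep -i ipmi

The output should include the ipmi-bmc-extern and isa-ipmi-kcs devices we will use below.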

Now we have a /usr/local/share/qemu/bin/qemu-system-x86_64 binary that supports ipmi devices. All we have to do is use that binary, adding the proper options to the qemu command line:

[root@fedora ~]# /usr/local/share/qemu/bin/qemu-system-x86_64 -smp 2 -m 2048 \
-cpu Haswell-noTSX,+abm,+pdpe1gb,+rdrand,\
+f16c,+osxsave,+pdcm,+xtpr,+tm2,+est,+smx,\
+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,\
+ss,+acpi,+ds,+vme \
/var/lib/libvirt/images/ovirt-hyper.qcow2 --enable-kvm \
-device virtio-net-pci,netdev=net0,mac=52:54:00:47:74:ee \
-netdev tap,id=net0,script=/usr/local/share/qemu/qemu-ifup.sh \
-chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 \
-device ipmi-bmc-extern,id=bmc1,chardev=ipmi0 \
-device isa-ipmi-kcs,bmc=bmc1

But we are not going to run the VM manually. The -chardev socket option connects the emulated BMC (ipmi-bmc-extern) to an external simulator listening on localhost:9002, and isa-ipmi-kcs exposes that BMC to the guest through a KCS interface. ipmi_sim, from OpenIPMI, will run the VM when the “power on” command is issued by the IPMI client.

Configuring OpenIPMI in the base system

The base system will provide the IP/port to be used as the VM ipmi device. We first have to install the ‘OpenIPMI-lanserv’ package:

[root@fedora ~]# dnf install OpenIPMI-lanserv

And set up the configuration files.

/etc/ipmi/lan.conf:

name "ipmisim1"
set_working_mc 0x20
startlan 1
addr :: 9001
priv_limit admin
allowed_auths_callback none md2 md5 straight
allowed_auths_user none md2 md5 straight
allowed_auths_operator none md2 md5 straight
allowed_auths_admin none md2 md5 straight
guid a123456789abcdefa123456789abcdef
lan_config_program "./ipmi_sim_lancontrol eth1"
endlan
serial 15 localhost 9002 codec VM
startcmd "/usr/local/share/qemu/bin/qemu-system-x86_64 -smp 2 -m 2048 -cpu Haswell-noTSX,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme /var/lib/libvirt/images/ovirt-hyper.qcow2 --enable-kvm -device virtio-net-pci,netdev=net0,mac=52:54:00:47:74:ee -netdev tap,id=net0,script=/usr/local/share/qemu/qemu-ifup.sh -chardev socket,id=ipmi0,host=localhost,port=9002,reconnect=10 -device ipmi-bmc-extern,id=bmc1,chardev=ipmi0 -device isa-ipmi-kcs,bmc=bmc1"
startnow false
user 1 true "" "test" user 10 none md2 md5 straight
user 2 true "ipmiusr" "test" admin 10 none md2 md5 straight

Where:

- addr :: 9001 is the address/port where ipmi_sim will listen for IPMI LAN clients, i.e. the entry point for this VM’s IPMI device;
- serial 15 localhost 9002 codec VM is the socket that qemu’s ipmi-bmc-extern chardev connects to;
- startcmd is the qemu command line that will be executed when the “power on” command is received;
- startnow false tells ipmi_sim not to start the VM when ipmi_sim itself starts;
- the user lines define the IPMI users; we will authenticate as “ipmiusr” with password “test”.

/etc/ipmi/ipmisim1.emu:

mc_setbmc 0x20
mc_add 0x20 0 no-device-sdrs 0x23 9 8 0x9f 0x1291 0xf02 persist_sdr
sel_enable 0x20 1000 0x0a
sensor_add 0x20 0 1 0x01 0x01
sensor_set_value 0x20 0 1 0x60 0
sensor_set_threshold 0x20 0 1 settable 111000 0xa0 0x90 0x70 00 00 00
sensor_set_event_support 0x20 0 1 enable scanning per-state \
000111111000000 000111111000000 \
000111111000000 000111111000000
mc_enable 0x20

Now, let’s run the ipmi_sim command:

[root@fedora ~]# ipmi_sim -c /etc/ipmi/lan.conf
IPMI Simulator version 1.0.13
# This is an example simulation setup for ipmi_sim. It creates a single
# management controller as a BMC. That will have the standard watchdog
# sensor and we add a temperature sensor.
...
...
# Turn on the BMC
mc_enable 0x20
>

IPMI client tests

In a new terminal (since the previous one is running the ipmi_sim command), it is time to test the IPMI device using ipmitool:

[root@fedora ~]# dnf install ipmitool
[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9001 power status
Password:
Chassis Power is off

Where 192.168.122.254 is the base system (Fedora 22) IP, ipmiusr is the user defined in the lan.conf file and 9001 is the entry point for the VM IPMI device. The password is test.
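
To avoid the interactive password prompt, ipmitool can also take the password on the command line with -P:

[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -P test -p 9001 power status
Chassis Power is off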

Now let’s start the (VM) system:

[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9001 power on
Password:
Chassis Power Control: Up/On
[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9001 power status
Password:
Chassis Power is on

The power on command will actually run the VM, using the command from startcmd in the lan.conf file. You can check with ps aux | grep qemu. Nice, huh?
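
For example (output abbreviated; the command line matches the startcmd from lan.conf):

[root@fedora ~]# ps aux | grep [q]emu-system
root ... /usr/local/share/qemu/bin/qemu-system-x86_64 -smp 2 -m 2048 ...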

Adding a second VM

After installing the new VM, all we need to do is create a new lan.conf file and run a new instance of the ipmi_sim command:

[root@fedora ~]# cp /etc/ipmi/lan.conf /etc/ipmi/lan2.conf

Edit the lan2.conf file, changing the LAN port (9004), the serial port (9005), the disk image and the MAC address, like below:

name "ipmisim1"
set_working_mc 0x20
startlan 1
addr :: 9004
priv_limit admin
allowed_auths_callback none md2 md5 straight
allowed_auths_user none md2 md5 straight
allowed_auths_operator none md2 md5 straight
allowed_auths_admin none md2 md5 straight
guid a123456789abcdefa123456789abcdef
lan_config_program "./ipmi_sim_lancontrol eth1"
endlan
serial 15 localhost 9005 codec VM
startcmd "/usr/local/share/qemu/bin/qemu-system-x86_64 -smp 2 -m 2048 -cpu Haswell-noTSX,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme /var/lib/libvirt/images/ovirt-hyper2.qcow2 --enable-kvm -device virtio-net-pci,netdev=net0,mac=52:54:00:47:74:a0 -netdev tap,id=net0,script=/usr/local/share/qemu/qemu-ifup.sh -chardev socket,id=ipmi0,host=localhost,port=9005,reconnect=10 -device ipmi-bmc-extern,id=bmc1,chardev=ipmi0 -device isa-ipmi-kcs,bmc=bmc1"
startnow false
user 1 true "" "test" user 10 none md2 md5 straight
user 2 true "ipmiusr" "test" admin 10 none md2 md5 straight

Now run the new ipmi_sim instance, using the new file:

[root@fedora ~]# ipmi_sim -c /etc/ipmi/lan2.conf
IPMI Simulator version 1.0.13
# This is an example simulation setup for ipmi_sim. It creates a single
# management controller as a BMC. That will have the standard watchdog
# sensor and we add a temperature sensor.
...
...
# Turn on the BMC
mc_enable 0x20
>

IPMI client tests

Again, let’s test the ipmi device for the new VM, this time using port 9004:

[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9004 power status
Password:
Chassis Power is off

Starting the (VM) system:

[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9004 power on
Password:
Chassis Power Control: Up/On
[root@fedora ~]# ipmitool -I lanplus -H 192.168.122.254 -U ipmiusr -p 9004 power status
Password:
Chassis Power is on

Using systemd for ipmi_sim

After testing ipmi_sim, we can rely on systemd to control the daemons. I don’t intend to talk about systemd here, but here are the service files:

[root@fedora ~]# cat /usr/lib/systemd/system/ovirt_ipmi1.service
[Unit]
Description=oVirt IPMI Guest Hypervisor 1
[Service]
WorkingDirectory=/root
Type=simple
ExecStart=/usr/bin/ipmi_sim -n -c /etc/ipmi/lan.conf
KillMode=process
[Install]
WantedBy=multi-user.target

[root@fedora ~]# cat /usr/lib/systemd/system/ovirt_ipmi2.service
[Unit]
Description=oVirt IPMI Guest Hypervisor 2
[Service]
WorkingDirectory=/root
Type=simple
ExecStart=/usr/bin/ipmi_sim -n -c /etc/ipmi/lan2.conf
KillMode=process
[Install]
WantedBy=multi-user.target 
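
To activate the services, the usual systemd sequence applies:

[root@fedora ~]# systemctl daemon-reload
[root@fedora ~]# systemctl enable ovirt_ipmi1.service ovirt_ipmi2.service
[root@fedora ~]# systemctl start ovirt_ipmi1.service ovirt_ipmi2.service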

Thanks to Germano Veit Michel, who kindly provided the service files.

Power Management configuration in Admin Portal

With our 2 nested Hypervisors running, back in the oVirt Administration Portal, let’s configure Power Management for them.

Click the Hosts tab, select the ovirt-hyper host and click Edit. Go to the Power Management menu, check the Enable Power Management option and click the Add Fence Agent plus button. In the Edit Fence Agent window, select ipmilan as the Type and fill in the options like the image below:

[Image 03: the Edit Fence Agent window for the first hypervisor, ovirt-hyper]

Notice we have to set ipport in Options: 9001 for the first VM. 192.168.122.254 is the IP address of the base system (Fedora 22). Click Test and the result is expected to be power on. Click OK and repeat the procedure for the second hypervisor, here called ovirt-hyper2:

[Image 04: the Edit Fence Agent window for the second hypervisor, ovirt-hyper2]

For ovirt-hyper2, the ipport is 9004, as configured in lan2.conf.

Conclusion

[Image 05: the Hosts tab with Power Management enabled on both hypervisors]

Now the orange exclamation marks are gone: we have a working Power Management device for our nested hypervisors, and a more complete test/development environment.
I’d love to read your comments.
Cya.
