
CloudStack Single Server on Ubuntu with KVM

30 Aug 2013

This guide documents the steps I took to install CloudStack in a single-server configuration with KVM on Ubuntu 12.04.

[Update: You probably want the updated CloudStack 4.4 Single Server on Ubuntu 14.04.1 with KVM instead.]

I've been testing out CloudStack this week. We're setting up a QA lab that uses it, and my goal was to make a single-server installation so I could start researching how to run some of our test scripts against it.

For a really good introduction to CloudStack, see Getting to Know Apache CloudStack from ApacheCon NA 2013.

CloudStack (and its documentation) appears mainly aimed at a multi-machine configuration with a separate management web service host, a separate NFS server, one or more hypervisor hosts, running CentOS and Xen, connected to a VLAN-capable switch. At the other end of the hardware spectrum there is DevCloud which gives you a CloudStack cluster in a VirtualBox VM, but that's with nested virtualisation and tiny VMs.

I want everything on a single Ubuntu server using a KVM hypervisor, connected to a simple single physical network, without VLANs. The documentation covers that configuration, but it's a little tricky to extract the Ubuntu-specific items and figure out the easiest installation order. I also made some steps easier to copy and paste.

Initial server configuration

The server is a SuperMicro server, with IPMI, and a single NIC configured on an unmanaged switch.

I started with fresh Ubuntu 12.04.2.

Set our hostname in /etc/hosts, using the actual IP address rather than Ubuntu's annoying 127.0.1.1:

IP=`ip addr show eth0 | grep 'inet ' | awk '{print $2}' | sed -e 's,/.*,,'`
sed -i -e "s/^127.0.1.1.*/$IP macro.lab.stalworthy.net macro/"  /etc/hosts
cat /etc/hosts
hostname --fqdn
dig `hostname --fqdn` @8.8.8.8

Also set a root password:

sudo passwd root

Next, configure a network bridge for its role as KVM hypervisor host, so that we don't have to interfere with networking half-way through the install:

apt-get install bridge-utils

cp /etc/network/interfaces /etc/network/interfaces.orig
cat >/etc/network/interfaces <<EOM
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

# Public network
auto cloudbr0
iface cloudbr0 inet static
    address 192.168.0.70
    netmask 255.255.255.0
    gateway 192.168.0.1
    dns-nameservers 192.168.0.1
    dns-domain lab.stalworthy.net
    bridge_ports eth0
    bridge_fd 5
    bridge_stp off
    bridge_maxwait 1

# Private network
auto cloudbr1
iface cloudbr1 inet manual
    bridge_ports none
    bridge_fd 5
    bridge_stp off
    bridge_maxwait 1
EOM

reboot

Prepare host for Management Interface

Now we can follow the Installation Guide. The latest version is 4.1.1, and it has binary Ubuntu packages (see 4.4.1. DEB package repository).

cat >/etc/apt/sources.list.d/cloudstack.list <<EOM
deb http://cloudstack.apt-get.eu/ubuntu precise 4.1
EOM
wget -O - http://cloudstack.apt-get.eu/release.asc|apt-key add -
apt-get update

The next section is 4.5.2. Prepare the Operating System. We've already sorted out the hostname, so just install openntpd:

apt-get install openntpd

Install the management interface (4.5.3.2. Install on Ubuntu).

apt-get install cloudstack-management

The management interface needs a MySQL database (4.5.4.1. Install the Database on the Management Server Node):

apt-get install mysql-server

# mysql root password: mypasswd33

cat >>/etc/mysql/conf.d/cloudstack.cnf <<EOM
[mysqld]
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
EOM

service mysql restart

ufw allow mysql

# cloud user password: mypasswd11
# management_server_key: mypasswd44
# database_key: mypasswd55
cloudstack-setup-databases cloud:mypasswd11@localhost \
                --deploy-as=root:mypasswd33 \
                -e file \
                -m mypasswd44 \
                -k mypasswd55
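
To quickly verify that the databases were created (a small sanity check, using the cloud user and password from above):

mysql -u cloud -pmypasswd11 -e 'show databases;'
# expect to see cloud and cloud_usage in the list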

Prepare Storage

Next, storage. 4.5.6.2. Using the Management Server as the NFS Server.

mkdir -p /export/primary /export/secondary

apt-get install nfs-kernel-server

Reconfigure NFS to use fixed ports. The documentation describes fixed ports in /etc/sysconfig/nfs for CentOS, but gives no instructions for Ubuntu. I got inspiration from debian-users. I also added no_subtree_check to avoid warnings.

cat >>/etc/exports <<EOM
/export  *(rw,async,no_root_squash,no_subtree_check)
EOM

exportfs -a

IP=`ip addr show cloudbr0 | grep 'inet ' | awk '{print $2}' | sed -e 's,/.*,,'`
showmount -e $IP

Configure nfs-kernel-server:

cat > /tmp/nfs-kernel-server.diff <<EOM
--- /etc/default/nfs-kernel-server.orig 2013-08-22 16:36:39.401714906 +0100
+++ /etc/default/nfs-kernel-server  2013-08-22 16:40:05.517596927 +0100
@@ -10,7 +10,7 @@
 # a fixed port here using the --port option. For more information,
 # see rpc.mountd(8) or http://wiki.debian.org/SecuringNFS
 # To disable NFSv4 on the server, specify '--no-nfs-version 4' here
-RPCMOUNTDOPTS=--manage-gids
+RPCMOUNTDOPTS="-p 892 --manage-gids"

 # Do you want to start the svcgssd daemon? It is only required for Kerberos
 # exports. Valid alternatives are "yes" and "no"; the default is "no".
EOM
cp /etc/default/nfs-kernel-server /etc/default/nfs-kernel-server.orig
patch -p0 < /tmp/nfs-kernel-server.diff

Configure nfs-common:

cat > /tmp/nfs-common.diff <<EOM
--- /etc/default/nfs-common.orig   2013-08-22 16:38:09.107923240 +0100
+++ /etc/default/nfs-common 2013-08-22 16:39:11.834669815 +0100
@@ -3,14 +3,14 @@
 # for the NEED_ options are "yes" and "no".

 # Do you want to start the statd daemon? It is not needed for NFSv4.
-NEED_STATD=
+NEED_STATD=yes

 # Options for rpc.statd.
 #   Should rpc.statd listen on a specific port? This is especially useful
 #   when you have a port-based firewall. To use a fixed port, set this
 #   this variable to a statd argument like: "--port 4000 --outgoing-port 4001".
 #   For more information, see rpc.statd(8) or http://wiki.debian.org/SecuringNFS
-STATDOPTS=
+STATDOPTS="--port 662 --outgoing-port 2020"

 # Do you want to start the gssd daemon? It is required for Kerberos mounts.
 NEED_GSSD=
EOM
cp /etc/default/nfs-common /etc/default/nfs-common.orig
patch -p0 < /tmp/nfs-common.diff

cat >> /etc/modprobe.d/lockd.conf <<EOM
options lockd nlm_udpport=32769 nlm_tcpport=32803
EOM

apt-get install quota
cat >/tmp/quota.diff <<EOM
--- /etc/default/quota.orig 2013-08-22 16:45:24.364373773 +0100
+++ /etc/default/quota  2013-08-22 16:46:04.919072048 +0100
@@ -2,4 +2,4 @@
 run_warnquota=

 # Add options to rpc.rquotad here
-RPCRQUOTADOPTS=
+RPCRQUOTADOPTS="-p 875"
EOM
cp /etc/default/quota /etc/default/quota.orig
patch -p0 < /tmp/quota.diff

And reboot:

reboot

Next, mount the NFS volumes:

IP=`ip addr show cloudbr0 | grep 'inet ' | awk '{print $2}' | sed -e 's,/.*,,'`
showmount -e $IP

# note: added intr and vers=3
cat >>/etc/fstab <<EOM
$IP:/export/primary   /mnt/primary    nfs rsize=8192,wsize=8192,timeo=14,intr,vers=3,noauto  0   2
$IP:/export/secondary /mnt/secondary  nfs rsize=8192,wsize=8192,timeo=14,intr,vers=3,noauto  0   2
EOM

mkdir -p /mnt/primary /mnt/secondary
mount /mnt/primary
mount /mnt/secondary

And test:

touch /export/primary/mak-was-here.txt
ls /mnt/primary
rm /mnt/primary/mak-was-here.txt

And configure iptables:

NETWORK=192.168.0.0/24
iptables -A INPUT -s $NETWORK -m state --state NEW -p udp --dport 111 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 111 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 32803 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p udp --dport 32769 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 892 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p udp --dport 892 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 875 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p udp --dport 875 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p tcp --dport 662 -j ACCEPT
iptables -A INPUT -s $NETWORK -m state --state NEW -p udp --dport 662 -j ACCEPT                

apt-get install iptables-persistent
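
iptables-persistent saves the current rules when the package is installed; if you change the rules later, re-save them so they survive a reboot (a sketch; I'm assuming this version of iptables-persistent reads /etc/iptables/rules.v4):

iptables-save > /etc/iptables/rules.v4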

4.5.8. Prepare the System VM Template:

/usr/share/cloudstack-common/scripts/storage/secondary/cloud-install-sys-tmplt \
-m /mnt/secondary -u http://download.cloud.com/templates/acton/acton-systemvm-02062012.qcow2.bz2 \
-h kvm -s mypasswd44 -F
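
As a rough sanity check (the exact directory layout on secondary storage is an assumption), the unpacked system VM template should now be visible:

find /mnt/secondary -name '*.qcow2' -o -name 'template.properties'
df -h /mnt/secondary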

Prepare KVM Hypervisor

Now section 5 says to log in. But first we have to prepare the hypervisor, otherwise the "Add Host" step in the wizard will fail.

8.1.4. Install and configure the Agent:

apt-get install cloudstack-agent

8.1.5. Install and Configure libvirt.

apt-get install qemu-kvm

Patch libvirtd.conf:

cat > /tmp/libvirtd.conf.patch <<EOM
--- /etc/libvirt/libvirtd.conf.orig 2013-08-22 20:16:06.850572952 +0100
+++ /etc/libvirt/libvirtd.conf  2013-08-22 20:17:23.668041757 +0100
@@ -20,6 +20,7 @@
 #
 # This is enabled by default, uncomment this to disable it
 #listen_tls = 0
+listen_tls = 0

 # Listen for unencrypted TCP connections on the public TCP/IP port.
 # NB, must pass the --listen flag to the libvirtd process for this to
@@ -31,7 +32,7 @@
 #
 # This is disabled by default, uncomment this to enable it.
 #listen_tcp = 1
-
+listen_tcp=1


 # Override the port for accepting secure TLS connections
@@ -43,7 +44,7 @@
 # This can be a port number, or service name
 #
 #tcp_port = "16509"
-
+tcp_port = "16509"

 # Override the default configuration which binds to all network
 # interfaces. This can be a numeric IPv4/6 address, or hostname
@@ -58,6 +59,7 @@
 #
 # This is enabled by default, uncomment this to disable it
 #mdns_adv = 0
+mdns_adv = 0

 # Override the default mDNS advertizement name. This must be
 # unique on the immediate broadcast network.
@@ -144,6 +146,7 @@
 # use, always enable SASL and use the GSSAPI or DIGEST-MD5
 # mechanism in /etc/sasl2/libvirt.conf
 #auth_tcp = "sasl"
+auth_tcp = "none"

 # Change the authentication scheme for TLS sockets.
 #
EOM
cp /etc/libvirt/libvirtd.conf /etc/libvirt/libvirtd.conf.orig
patch -p0 < /tmp/libvirtd.conf.patch

Patch qemu.conf:

cat > /tmp/qemu.conf.diff <<EOM
--- /etc/libvirt/qemu.conf 2013-08-15 07:50:29.936289741 -0700
+++ /etc/libvirt/qemu.conf.new 2013-08-15 07:50:44.220032386 -0700
@@ -9,7 +9,7 @@
 # NB, strong recommendation to enable TLS + x509 certificate
 # verification when allowing public access
 #
-# vnc_listen = "0.0.0.0"
+vnc_listen = "0.0.0.0"

 # Enable this option to have VNC served over an automatically created
 # unix socket. This prevents unprivileged access from users on the
EOM
cp /etc/libvirt/qemu.conf /etc/libvirt/qemu.conf.orig
patch -p0 < /tmp/qemu.conf.diff

Patch libvirt-bin.conf:

# Doc error: there is no "exec /usr/sbin/libvirtd -d", they now use
# a variable
cp /etc/init/libvirt-bin.conf /etc/init/libvirt-bin.conf.orig
sed -i -e 's/libvirtd_opts="-d"/libvirtd_opts="-d -l"/' /etc/init/libvirt-bin.conf

service libvirt-bin restart
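
To verify that libvirtd is now listening for unencrypted TCP connections, here's a quick check (the connection URI assumes the default qemu system instance):

netstat -lnpt | grep 16509
virsh -c qemu+tcp://localhost/system list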

Disable AppArmor:

ln -s /etc/apparmor.d/usr.sbin.libvirtd /etc/apparmor.d/disable/
ln -s /etc/apparmor.d/usr.lib.libvirt.virt-aa-helper /etc/apparmor.d/disable/
apparmor_parser -R /etc/apparmor.d/usr.sbin.libvirtd
apparmor_parser -R /etc/apparmor.d/usr.lib.libvirt.virt-aa-helper

8.1.7. Configure the network bridges: we've already done this above, tailored to our single-server setup.

8.1.9.2. Open ports in Ubuntu

ufw allow proto tcp from any to any port 22
ufw allow proto tcp from any to any port 1798
ufw allow proto tcp from any to any port 16509
ufw allow proto tcp from any to any port 5900:6100
ufw allow proto tcp from any to any port 49152:49216

Prevent "Cannot find 'pm-is-supported' in path" error in libvirtd.log:

apt-get install pm-utils 

Now reboot and check that NFS is still OK:

reboot
rpcinfo 192.168.0.70
showmount -e 192.168.0.70
mount /mnt/primary
mount /mnt/secondary

and make sure the UI and the agent are running (do this if you see 404s for /client/ and "The requested resource (/client/) is not available." errors):

/etc/init.d/cloudstack-management stop
/etc/init.d/tomcat6 stop
/etc/init.d/cloudstack-agent stop
ps -efl | grep java

/etc/init.d/cloudstack-management start
/etc/init.d/cloudstack-management status
/etc/init.d/cloudstack-agent start
/etc/init.d/cloudstack-agent status

Start using the UI

5.1.3. Logging In as the Root Administrator. Log in at http://$IP:8080/client with username admin and password password.

Choose "Continue with basic setup".

Change the password. New password: "mypasswd66".

6.3.1. Basic Zone Configuration.

  • Add a new zone named zone1, DNS1 192.168.0.1 and Internal DNS 192.168.0.1.
  • Add a new pod named pod1, gateway 192.168.0.1, netmask 255.255.255.0, IP range 192.168.0.160-192.168.0.169.
  • Add a guest network, gateway 192.168.0.1, netmask 255.255.255.0, IP range 192.168.0.170-192.168.0.179.
  • Add a cluster named cluster1, Hypervisor KVM.
  • Add a host. IP address 192.168.0.70, user root, password mypasswd99.
  • Add primary storage: name primary1, protocol NFS, server 192.168.0.70, path /export/primary.
  • Add secondary storage: NFS server 192.168.0.70, path /export/secondary.
  • Hit Launch and pray.

If this fails, you need to look at /var/log/cloudstack/management/management-server.log for details on the error.
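
I found it helpful to watch the log for errors in a second terminal while the wizard runs (just a sketch; tweak the patterns to taste):

tail -f /var/log/cloudstack/management/management-server.log | grep -iE 'error|exception'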

If you go to http://myhost:8080/client, and click on Infrastructure, you should see: 1 zone, 1 pod, 1 cluster, 1 host, 1 primary storage, 1 secondary storage, and 2 System VMs. If you don't, then you won't be able to create VMs, and will need to debug/fix things first.

  • If you get a 404 on http://myhost:8080/client, stop the management server and tomcat6, kill any leftover java processes, and restart the management server.
  • If there is no Host, try adding it again, and doublecheck IP address and credentials.
  • If there is no primary storage, check NFS, check /mnt/primary is mounted and working, and re-add it.
  • If there is no secondary storage, check NFS, check /mnt/secondary is mounted and working, and re-add it.
  • As you go fixing these things, don't forget to hit Refresh on the Infrastructure page.
  • You should see two system VMs: one for managing secondary storage, and one for managing console access.
  • If they are missing, CloudStack will try to re-create them every 30 seconds or so.
  • If it's not re-creating them, check that the zone is enabled (Infrastructure -> Zones -> zone1 -> enable)
  • If it's trying but failing, then look at /var/log/cloudstack/management/management-server.log and /var/log/cloudstack/agent/agent.log.
  • Check the SSVM, templates, Secondary storage troubleshooting page for more tips.

Register an ISO: Templates -> Select View ISO, Register ISO.

  • name: ubuntu-12.04.2-server-amd64
  • description: ubuntu-12.04.2-server-amd64
  • URL http://releases.ubuntu.com/precise/ubuntu-12.04.2-server-amd64.iso
  • Zones: all zones
  • OS: Ubuntu 12.04 (64-bit)
  • Bootable, Public, Featured

Then click OK. Click on the ISO row in the list and check that the Status shows x% Downloaded. Wait for the download to complete.

Start an instance

Select Instances -> Add Instance, zone 1, Select ISO, choose the ubuntu-12.04.2-server-amd64, small instance, small disk, default security group, name test1. Network -> Select view Security Groups, edit default, add ingress rule for TCP port 22 0.0.0.0/0. Click on instances, wait for test1 to be in the Running state, click on it.

View Console

Click on the View Console icon. For me this results in an empty window, because of a peculiarity in my environment: my MacBook is using a DNS server from a connected VPN, which for some yet-undiscovered reason fails lookups for *.realhostip.com. I do think this dependency on external DNS is an undesirable hack.

To work around this you need to determine the IP address of the Console Proxy System VM: either look it up under Infrastructure, or do a View Source of the blank window and look in the frame tag for something like https://192-168-0-173.realhostip.com/. Then add it to /etc/hosts:

echo "192.168.0.173 192-168-0-173.realhostip.com" >> /etc/hosts

Sometimes that still fails with "Unable to start console session as connection is refused by the machine you are accessing". If so, check on the host which address the VNC ports are listening on:

netstat -an | egrep ':590.*LISTEN'

If that shows 127.0.0.1:5900 rather than 0.0.0.0:5900, then you missed the qemu.conf step above. Double-check, restart libvirt, and stop/start your guest:

/etc/init.d/libvirt-bin restart

If you can get to the console, install Ubuntu. Once done, use the instance's NICs tab to determine the IP address and try to ssh into it using the admin user you created during installation:

ssh mak@192.168.0.175

Debugging System VMs

To debug System VMs, determine the link local IP address from the Infrastructure screen, then:

ssh -i /root/.ssh/id_rsa.cloud  -p 3922 root@169.254.1.74

If a system VM gets stuck in the "Starting" state, try connecting a VNC client directly to the VNC ports on the hypervisor host; you may find the instance waiting for a root password or an fsck. The easiest way to deal with this is to destroy the system VM and let CloudStack recreate it (for the storage VMs, just wait; for the virtual router, start a new VM). You first have to go into the database and change the instance state to Stopped:

mysql --user=root --password
use cloud;
select state from vm_instance where instance_name = 'v-2-VM';
update vm_instance set state='Stopped' where name='v-2-VM';

I consider this a bug: I should be able to get to the VM console from CloudStack's UI, and I should be able to force stop it even in Starting state. It'd be nice if there were buttons to "destroy and recreate this System VM".

Other Problems

  • Sometimes I see primary storage problems that go away after a reboot.
  • I tried local storage and ran into problems: I could start System VMs, but not guests.
  • A couple of ease-of-use things:
    • the default UI session timeout is too short, and not configurable from the UI (edit session-timeout in /etc/cloudstack/management/web.xml; see the sketch after this list)
    • the UI often doesn't autorefresh, and not all screens have Refresh buttons
    • I wish I could explicitly expunge instances rather than waiting for the expunge thread. Especially annoying when you're out of IP addresses. As a workaround, search for "expunge" in the Global Settings, and decrease the intervals, then restart the cloudstack-management service.
    • when you restart the cloudstack-management server, you have to log in again, which gets old
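
For reference, here is a sketch of bumping that session timeout with sed (hypothetical values: I'm assuming the default is 30 minutes and raising it to 120; check your web.xml first):

sed -i -e 's,<session-timeout>30</session-timeout>,<session-timeout>120</session-timeout>,' \
    /etc/cloudstack/management/web.xml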

Initial Impression

It works. The UI is nice, when things go well. But when errors inevitably happen, you're back to mining logs for Java exceptions and fixing Linux issues from the command line.

It seems somewhat fragile; I ran into a steady stream of problems, especially during setup or after rebooting the system, and subsequent Googling found others hitting the same issues.

For managing a handful of VMs on a single host I don't think it's the best solution -- too complex and fragile. I'd like to think there is a simpler web UI that just does KVM with local LVM storage pools like virt-manager. Perhaps I should try oVirt, WebVirtMgr, ProxMox, Archipel, ConVirt Open Source, or some other KVM management tool. Or just stick to my virsh scripts.

For the purposes of a multi-machine QA lab that is up 24/7, hopefully it will prove effective.