Tuesday, November 27, 2012

Citrix XenServer 6.1 + glusterfs test on VMware Workstation 9

To organize my servers online and to separate my different services I need several virtual machines. As a system admin I also like to have as much control as possible over my systems. So I want a redundant pool setup without spending too much money.
The idea is to rent 2 hardware servers and create a 2-node XenServer pool, with glusterfs replicating the storage between the 2 nodes and making it available via NFS. That way I get live migration and can also quickly start the virtual machines on the other node if one of the nodes goes down.

So I did a test setup of my idea on VMware Workstation 9 (since it supports Intel VT emulation). I created 2 virtual machines (xen1 and xen2) with 2 host-only networks: choose CentOS 64-bit as the guest OS, go along with the defaults (pick your own memory & disk size of course), change the processor settings to activate "Virtualize Intel VT-x", point the DVD drive to the XenServer ISO, and remove the floppy, printer and sound card (you could even delete the USB controller). The first host-only network (192.168.8.0/24) I use for admin and connecting (it is NATted via iptables on my Fedora 16 host), the second (192.168.9.0/24) I use for glusterfs and it is not routed.
XenServer is freely available from http://www.citrix.com/downloads/xenserver.html. In order to download it you have to create a (free) account on the Citrix website. You can also download the XenCenter admin tool there, but it is automatically available from the host after the XenServer install.

The next step, after XenServer is installed (the install is pretty straightforward so I won't explain it in detail here; there is enough info via Google if you need it), is to download and install glusterfs. You can download the glusterfs, glusterfs-fuse and glusterfs-server packages from http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.1/CentOS/epel-5/i386/. Scp the packages to both nodes and install them.
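Fetching the rpms and copying them over could look roughly like this (a sketch run from my admin machine; I'm assuming xen1 and xen2 resolve to their admin addresses on the 192.168.8.0/24 network, otherwise use the IPs directly):

$ wget http://download.gluster.org/pub/gluster/glusterfs/3.3/3.3.1/CentOS/epel-5/i386/glusterfs{,-fuse,-server}-3.3.1-1.el5.i386.rpm
$ scp glusterfs-*rpm root@xen1:
$ scp glusterfs-*rpm root@xen2:

(The bash brace expansion grabs all three rpms in one go.) Then the install itself, on each node: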

[root@xen1 ~]# yum localinstall ./glusterfs-*rpm --nogpg -y 
Loaded plugins: fastestmirror
Setting up Local Package Process
Examining ./glusterfs-3.3.1-1.el5.i386.rpm: glusterfs-3.3.1-1.el5.i386
Marking ./glusterfs-3.3.1-1.el5.i386.rpm to be installed
Loading mirror speeds from cached hostfile
Examining ./glusterfs-fuse-3.3.1-1.el5.i386.rpm: glusterfs-fuse-3.3.1-1.el5.i386
Marking ./glusterfs-fuse-3.3.1-1.el5.i386.rpm to be installed
Examining ./glusterfs-server-3.3.1-1.el5.i386.rpm: glusterfs-server-3.3.1-1.el5.i386
Marking ./glusterfs-server-3.3.1-1.el5.i386.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package glusterfs.i386 0:3.3.1-1.el5 set to be updated
---> Package glusterfs-fuse.i386 0:3.3.1-1.el5 set to be updated
---> Package glusterfs-server.i386 0:3.3.1-1.el5 set to be updated
--> Finished Dependency Resolution

...

Installed:
  glusterfs.i386 0:3.3.1-1.el5                  glusterfs-fuse.i386 0:3.3.1-1.el5                  glusterfs-server.i386 0:3.3.1-1.el5                 

Complete!

Don't forget to enable the services at boot time and to start them (I won't start them here since I will reboot after I have cleared the local storage).
[root@xen1 ~]# chkconfig --list|grep gluster
glusterd        0:off 1:off 2:off 3:off 4:off 5:off 6:off
glusterfsd      0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@xen1 ~]# chkconfig glusterd  on
[root@xen1 ~]# chkconfig glusterfsd on
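If you are not going to reboot right away (I do, a bit further down), starting glusterd by hand is enough; glusterd spawns the brick (glusterfsd) and NFS processes itself once volumes are created and started:

[root@xen1 ~]# service glusterd start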
Since my VMs only have one disk (typical for rented hardware servers) and XenServer takes the whole disk at installation, I needed some space for my glusterfs bricks. It is possible not to let XenServer use the disk for guest storage at installation, but then you have to install via an automated installation file (see appendix C of the installation doc).
You can also delete this local guest storage afterwards (on both nodes): find the SR uuid of the local storage, find the PBD, unplug the PBD, forget the SR and clean up the LVM config.
[root@xen1 ~]# xe sr-list host=xen1 name-label=Local\ storage 
uuid ( RO)                : 46416544-9eba-7734-efc5-800b2deba55b
          name-label ( RW): Local storage
    name-description ( RW): 
                host ( RO): xen1
                type ( RO): lvm
        content-type ( RO): user


[root@xen1 ~]# xe pbd-list sr-uuid=46416544-9eba-7734-efc5-800b2deba55b
uuid ( RO)                  : 41748cf2-c4ad-0610-9489-256e3e1d483c
             host-uuid ( RO): 7d306eb1-3ba0-4491-937a-373a809a0488
               sr-uuid ( RO): 46416544-9eba-7734-efc5-800b2deba55b
         device-config (MRO): device: /dev/sda3
    currently-attached ( RO): true


[root@xen1 ~]# xe pbd-unplug uuid=41748cf2-c4ad-0610-9489-256e3e1d483c

[root@xen1 ~]# xe sr-forget uuid=46416544-9eba-7734-efc5-800b2deba55b

[root@xen1 ~]# vgdisplay -C
  VG                                                 #PV #LV #SN Attr   VSize  VFree 
  VG_XenStorage-46416544-9eba-7734-efc5-800b2deba55b   1   1   0 wz--n- 11.99G 11.98G

[root@xen1 ~]# vgremove VG_XenStorage-46416544-9eba-7734-efc5-800b2deba55b
Do you really want to remove volume group "VG_XenStorage-46416544-9eba-7734-efc5-800b2deba55b" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume MGT? [y/n]: y
  Logical volume "MGT" successfully removed
  Volume group "VG_XenStorage-46416544-9eba-7734-efc5-800b2deba55b" successfully removed

[root@xen1 ~]# pvdisplay -C
  PV         VG   Fmt  Attr PSize  PFree 
  /dev/sda3       lvm2 a-   12.00G 12.00G

[root@xen1 ~]# pvremove /dev/sda3
  Labels on physical volume "/dev/sda3" successfully wiped
[root@xen1 ~]# reboot
I reboot here so I'm sure the nodes come up without the local storage and with the glusterfs services started.

Next step is to install XenCenter. For those unfamiliar with XenServer: XenCenter is the GUI admin tool; it runs on Windows and needs the .NET Framework. To install it, open a browser on your Windows machine and point it to the admin IP of one of the 2 Xen hosts. Download the MSI and start the install. Most things you can also do via the CLI, but I know some people like a GUI too :-). In my setup I used the GUI to configure the secondary IP (for my dedicated glusterfs network) on both nodes and also to create a pool with the 2 nodes.
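For the CLI-minded, the same two things can be done with xe. A rough sketch, assuming eth1 is the NIC on the 192.168.9.0/24 network and with obvious placeholders for the uuids and password:

[root@xen1 ~]# xe pif-list device=eth1
[root@xen1 ~]# xe pif-reconfigure-ip uuid=<pif-uuid-of-xen1-eth1> mode=static IP=192.168.9.200 netmask=255.255.255.0
[root@xen2 ~]# xe pif-list device=eth1
[root@xen2 ~]# xe pif-reconfigure-ip uuid=<pif-uuid-of-xen2-eth1> mode=static IP=192.168.9.201 netmask=255.255.255.0
[root@xen2 ~]# xe pool-join master-address=<admin ip of xen1> master-username=root master-password=<root password>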

Time to set up glusterfs. Glusterfs uses xattrs, so the best filesystems to use are ext4 or xfs. XenServer includes the kernel modules, but not the tools. The ext4 tools can be installed via the CentOS base repo. On both nodes, edit the repo file, enable the base repo (enabled=1) and then install e4fsprogs.

[root@xen1 ~]# vi /etc/yum.repos.d/CentOS-Base.repo 
[root@xen1 ~]# yum install e4fsprogs -y
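If you'd rather not open vi, a sed one-liner along these lines should flip the flag in the [base] section (assuming the stock repo file ships with enabled=0 there; double-check the file afterwards):

[root@xen1 ~]# sed -i '/^\[base\]/,/^\[/ s/enabled=0/enabled=1/' /etc/yum.repos.d/CentOS-Base.repo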
We can now (again on both nodes) create a new filesystem on the partition and mount it (I have omitted the output of the commands):
[root@xen1 ~]# mkfs.ext4 -m 0 -j /dev/sda3
[root@xen1 ~]# mkdir -p /export/xen1-vol0
[root@xen1 ~]# echo "/dev/sda3   /export/xen1-vol0   ext4  defaults 1 2" >> /etc/fstab 
[root@xen1 ~]# mount -a
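The second node gets the same treatment, only with its own mount point. A quick sketch of the equivalent commands on xen2 (the leftover partition is /dev/sda3 there as well):

[root@xen2 ~]# mkfs.ext4 -m 0 -j /dev/sda3
[root@xen2 ~]# mkdir -p /export/xen2-vol0
[root@xen2 ~]# echo "/dev/sda3   /export/xen2-vol0   ext4  defaults 1 2" >> /etc/fstab
[root@xen2 ~]# mount -a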
So on the second node the filesystem is mounted on /export/xen2-vol0 and my mount points match the pattern <host>-<volname>. When everything is in place we can create the glusterfs replica. Since this is the first volume we create, we first need to let both glusterfs nodes know about each other:
[root@xen1 ~]# gluster peer probe 192.168.9.201
Performing this command on xen1 automatically also adds 192.168.9.200 (xen1) as a peer on xen2. If you are using hostnames instead of IPs and also want to see the hostnames in the glusterfs config and output, you need to run the peer probe command with the hostname on BOTH nodes. If you use IPs, running it on 1 node is enough.
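You can check the peering on either node with gluster peer status; the output should look roughly like this (the uuid below is just a placeholder):

[root@xen1 ~]# gluster peer status
Number of Peers: 1

Hostname: 192.168.9.201
Uuid: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
State: Peer in Cluster (Connected)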
Create the volume as a replica across the 2 nodes (run on 1 node only):
[root@xen1 ~]# gluster volume create vol0 replica 2 192.168.9.200:/export/xen1-vol0 192.168.9.201:/export/xen2-vol0
We will use this volume via NFS as a storage repository on XenServer, mounted from localhost. This way every host in the pool mounts from localhost, and since both hosts have a replica this works. Glusterfs values consistency over performance: when all replicas are up, a write only returns OK when all replicas have committed it. So if host2 fails, the guest can be started on host1, since its replica has the same state as the moment host2 went down. When host2 comes up again, glusterfs starts the self-heal, so you can safely start guests on node2 again (although the healing process can slow down the startup of the guests).
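Once the volume is created and started (below), you can keep an eye on that self-heal and see which files still need syncing after a node has been down:

[root@xen1 ~]# gluster volume heal vol0 info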
Glusterfs has a built-in NFS server (version 3, tcp only, no udp) and CIFS support. XenServer needs NFS on port 2049 in order to add it as an SR, so we set the port and then start the volume (volumes are started automatically when the hosts boot):
[root@xen1 ~]# gluster volume set vol0 nfs.port 2049
[root@xen1 ~]# gluster volume start vol0
You can check the status of the volume and the NFS port (it may be that the gluster services need a restart before the NFS port change takes effect):
[root@xen1 ~]# gluster volume status vol0
Status of volume: vol0
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 192.168.9.200:/export/xen1-vol0           24009   Y       7024
Brick 192.168.9.201:/export/xen2-vol0           24009   Y       7076
NFS Server on localhost                         2049    Y       7030
Self-heal Daemon on localhost                   N/A     Y       7042
NFS Server on 192.168.9.201                     2049    Y       7082
Self-heal Daemon on 192.168.9.201               N/A     Y       7088

So the only thing left to do is to open XenCenter and add an NFS SR. As described earlier, use localhost as the server. All volumes are exported under /; you can change this with the gluster volume set command, but for that I refer to the excellent docs on the glusterfs website. So in my setup the NFS target is localhost:/vol0.
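For completeness, the same SR can also be created from the CLI with xe sr-create, roughly like this (name-label is whatever you like):

[root@xen1 ~]# xe sr-create type=nfs shared=true name-label=gluster-vol0 device-config:server=localhost device-config:serverpath=/vol0

Either way, the SR then shows up in sr-list: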

[root@xen1 ~]# xe sr-list type=nfs
uuid ( RO)                : 8629a755-9aee-322f-62b9-a43630c9d9d1
          name-label ( RW): gluster-vol0
    name-description ( RW): NFS SR [localhost:/vol0]
                host ( RO): 
                type ( RO): nfs
        content-type ( RO): 
Good luck !!