Main Cluster components
- cluster-glue
- resource-agents
- libqb
- corosync
- pacemaker
- dlm
The main objective is to build a corosync cluster based on corosync 2.0, or what Andrew Beekhof calls option 3: Everyone talks to corosync 2.0
In order to clarify current possible corosync architectures, you may want to read this post from Andrew:
http://theclusterguy.clusterlabs.org/post/34604901720/pacemaker-and-cluster-filesystems
We want to do it on top of Ubuntu (the current LTS is 12.04, precise). The current Ubuntu cluster stack is based on option 2: Everyone talks to CMAN. This is the main source for Ubuntu cluster status: https://wiki.ubuntu.com/ClusterStack/Precise
While option 2 is the safest approach for the next couple of years, we want to build on option 3, as this is the future option, architecturally superior to options 1 and 2.
So this document is about building an option 3 cluster on top of Ubuntu precise.
TODO: link main references of doc: http://clusterlabs.org/doc/
TODO: explain current problem with OCFS2
TODO: explain GFS2 cluster filesystem needs and issues
TODO: introduce and explain base components and extra components for gfs2
Important note: in order to prevent compile errors, you must start in a clean environment, with no previous versions of the components you are building installed on your system. So be sure to make these checks before continuing:
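The exact checks are left open here; a minimal sketch of what to verify, assuming a Debian/Ubuntu box (the package names are the usual distro ones):

```shell
# Hypothetical pre-build check: warn if distro packages of the components we
# are about to build from source are still installed (a common cause of
# confusing compile and link errors).
for pkg in cluster-glue resource-agents libqb corosync pacemaker dlm; do
    if dpkg -l "$pkg" 2>/dev/null | grep -q '^ii'; then
        echo "WARNING: distro package $pkg is installed; remove it first"
    fi
done
echo "pre-build check done"
```

Purging any packages this reports should be enough to leave the environment clean.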
Create a script to set your environment variables: $HOME/exports.sh
#!/bin/sh
export PREFIX=/opt/ha
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
export LCRSODIR=$PREFIX/libexec/lcrso
export LDFLAGS=-L$PREFIX/lib
export CPPFLAGS=-I$PREFIX/include
export CFLAGS=-I$PREFIX/include
export CLUSTER_USER=hacluster
export CLUSTER_GROUP=haclient
Before building each component, always run: source $HOME/exports.sh
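A quick way to confirm the sourcing works as expected (a sketch using a throwaway copy under /tmp, so it is safe to run anywhere):

```shell
# Write a minimal copy of the exports script and source it, then verify the
# variables that every ./configure run below depends on are populated.
cat > /tmp/exports.sh <<'EOF'
export PREFIX=/opt/ha
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
EOF
. /tmp/exports.sh
echo "PREFIX=$PREFIX"
echo "PKG_CONFIG_PATH=$PKG_CONFIG_PATH"
```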
Create hacluster user and haclient group. This example uses uid and gid 120, but you can choose any free one:
sudo groupadd -g 120 haclient
sudo useradd -u 120 -g 120 -s /bin/false -d /usr/lib/heartbeat hacluster
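To double-check the pair was created correctly (a sketch; it reports rather than fails, so it can be run before or after the commands above):

```shell
# Look up the cluster user and group in the system databases via getent.
for entry in "passwd hacluster" "group haclient"; do
    db=${entry%% *}
    name=${entry##* }
    if getent "$db" "$name" >/dev/null 2>&1; then
        echo "$db $name: present"
    else
        echo "$db $name: missing"
    fi
done
```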
Install needed development packages:
sudo aptitude install build-essential git mercurial
sudo aptitude install autogen automake libtool pkg-config groff autopoint bison bison-dev
sudo aptitude install libncurses-dev libreadline-dev
sudo aptitude install libaio-dev libglib2.0-dev libxml2-dev libbz2-dev uuid-dev
sudo aptitude install libnss3-dev libxslt-dev
# for gfs2-utils:
sudo aptitude install libblkid-dev check
# crmsh
sudo aptitude install python-lxml
cluster-glue is the base glue layer for corosync and pacemaker. It installs development headers with variables describing the cluster environment, such as common cluster paths.
For example, it creates the include file include/heartbeat/glue-config.h, with common definitions for components like corosync and pacemaker.
Build and install procedure:
hg clone http://hg.linux-ha.org/glue
cd glue
./autogen.sh
./configure --prefix=$PREFIX --with-daemon-user=${CLUSTER_USER} \
--with-daemon-group=${CLUSTER_GROUP} --enable-fatal-warnings=no \
--with-ocf-root=$PREFIX/usr/lib/ocf
make
sudo make install
Clone and build:
git clone https://github.com/asalkeld/libqb.git
cd libqb
./autogen.sh
./configure --prefix=$PREFIX
make
sudo make install
Clone and build:
git clone git://github.com/ClusterLabs/resource-agents
cd resource-agents
./autogen.sh && ./configure --prefix=$PREFIX
make
sudo make install
If we later install resource agents into the standard system directory, we want them to end up in our OCF dir, so symlink it:
sudo ln -s /opt/ha/usr/lib/ocf /usr/lib/ocf
Clone, build and install:
git clone git://github.com/corosync/corosync.git
cd corosync
./autogen.sh
./configure --prefix=$PREFIX
make
sudo make install
Clone, build and install:
git clone git://github.com/ClusterLabs/pacemaker.git
cd pacemaker
./autogen.sh
./configure --prefix=$PREFIX --without-cman \
--without-heartbeat --with-corosync \
--enable-fatal-warnings=no --with-lcrso-dir=$LCRSODIR
make
sudo make install
The configure script should give a result like this:
Version = 1.1.8 (Build: f94e1e4)
Features = libqb-logging libqb-ipc lha-fencing upstart systemd nagios corosync-native
Prefix = /opt/ha
Executables = /opt/ha/sbin
Man pages = /opt/ha/share/man
Libraries = /opt/ha/lib
Header files = /opt/ha/include
Arch-independent files = /opt/ha/share
State information = /opt/ha/var
System configuration = /opt/ha/etc
Corosync Plugins = /opt/ha/lib
Use system LTDL = yes
HA group name = haclient
HA user name = hacluster
CFLAGS = -I/opt/ha/include -I/opt/ha/include -I/opt/ha/include/heartbeat -I/opt/ha/include -I/opt/ha/include -ggdb -fgnu89-inline -fstack-protector-all -Wall -Waggregate-return -Wbad-function-cast -Wcast-align -Wdeclaration-after-statement -Wendif-labels -Wfloat-equal -Wformat=2 -Wformat-security -Wformat-nonliteral -Wmissing-prototypes -Wmissing-declarations -Wnested-externs -Wno-long-long -Wno-strict-aliasing -Wunused-but-set-variable -Wpointer-arith -Wstrict-prototypes -Wwrite-strings
Libraries = -lcorosync_common -lplumb -lpils -lqb -lbz2 -lxslt -lxml2 -lc -lglib-2.0 -lglib-2.0 -luuid -lrt -ldl -lglib-2.0 -lltdl -L/opt/ha/lib -lqb -ldl -lrt -lpthread
Stack Libraries = -L/opt/ha/lib -lqb -ldl -lrt -lpthread -L/opt/ha/lib -lcpg -L/opt/ha/lib -lcfg -L/opt/ha/lib -lcmap -L/opt/ha/lib -lquorum
Clone, build and install:
hg clone http://hg.savannah.nongnu.org/hgweb/crmsh/
cd crmsh
./autogen.sh
./configure --prefix=$PREFIX
make
sudo make install
This is a fast and basic cluster configuration.
Put these environment variables at the end of your .bashrc (for both your user and root):
export PATH=/opt/ha/sbin:$PATH
export MANPATH=/opt/ha/share/man:$MANPATH
export PYTHONPATH=/opt/ha/lib/python2.7/site-packages
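Since the distro may ship its own crm or corosync binaries, it is worth confirming that /opt/ha/sbin really ends up first in the search path (sketch):

```shell
# Prepend the self-built tree and check the first PATH entry; tools like
# crm and corosync will then resolve to /opt/ha/sbin before any distro copy.
export PATH=/opt/ha/sbin:$PATH
first=$(printf '%s\n' "$PATH" | tr ':' '\n' | head -n 1)
echo "first PATH entry: $first"
```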
Copy corosync init script to init.d:
sudo cp /opt/ha/etc/init.d/corosync /etc/init.d/corosync
Create config file:
totem {
    version: 2
    crypto_cipher: none
    crypto_hash: none
    cluster_name: fiestaha
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        ttl: 1
        bindnetaddr: 192.168.132.11
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: local7
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
    wait_for_all: 0
}
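The quorum block is what makes this two-node setup workable. Votequorum's normal majority rule would require both nodes to be up; a sketch of the arithmetic, and why two_node: 1 is needed:

```shell
# Majority quorum is expected_votes / 2 + 1 (integer division). For two
# votes that is 2, so losing either node would lose quorum and stop all
# services; two_node: 1 special-cases this so one surviving node stays
# quorate (normally paired with wait_for_all, disabled above on purpose).
expected_votes=2
quorum=$(( expected_votes / 2 + 1 ))
echo "majority quorum for $expected_votes votes: $quorum"
```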
TODO: good rsyslog config
On /etc/rsyslog.d/50-debian-default:
local7.* /var/log/corosync.log
*.*;auth,authpriv.none;local7.none -/var/log/syslog
Start cluster services:
sudo service corosync start
sudo service pacemaker start
Logs should be in /var/log/corosync.log, via the rsyslog daemon.
We can see the cluster config with crm: sudo -i crm configure show:
bercab@fiesta-ha1:/usr/src$ sudo -i crm configure show
node $id="193243328" fiesta-ha1
property $id="cib-bootstrap-options" \
    dc-version="1.1.8-f94e1e4" \
    cluster-infrastructure="corosync" \
    stonith-enabled="false"
TODO: show crm_mon
Corosync output: corosync -f:
Feb 14 09:08:23 notice [MAIN ] Corosync Cluster Engine ('2.3.0.3-6617'): started and ready to provide service.
Feb 14 09:08:23 info [MAIN ] Corosync built-in features: pie relro bindnow
Feb 14 09:08:23 notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Feb 14 09:08:23 notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Feb 14 09:08:23 notice [TOTEM ] The network interface [192.168.132.11] is now up.
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync configuration map access [0]
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync configuration service [1]
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync profile loading service [4]
Feb 14 09:08:23 notice [QUORUM] Using quorum provider corosync_votequorum
Feb 14 09:08:23 notice [QUORUM] This node is within the primary component and will provide service.
Feb 14 09:08:23 notice [QUORUM] Members[0]:
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Feb 14 09:08:23 notice [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Feb 14 09:08:23 notice [TOTEM ] A processor joined or left the membership and a new membership (192.168.132.11:4) was formed.
Feb 14 09:08:23 notice [QUORUM] Members[1]: 193243328
Feb 14 09:08:23 notice [MAIN ] Completed service synchronization, ready to provide service.
The main docs about drbd install are here:
http://www.drbd.org/users-guide/s-build-deb.html
Install dependencies:
sudo aptitude install dpkg-dev fakeroot debhelper debconf-utils docbook-xml \
docbook-xsl dpatch flex xsltproc module-assistant
Clone git drbd project:
git clone git://git.drbd.org/drbd-8.4.git
Write the configuration /etc/drbd.d/ha.res:
resource ha0 {
    net {
        protocol C;
        allow-two-primaries yes;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        sndbuf-size 0;
        max-buffers 4000;
        max-epoch-size 4000;
    }
    startup {
        become-primary-on both;
    }
    disk {
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
        fencing resource-only;
    }
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
        #fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        #after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    volume 0 {
        device /dev/drbd0;
        disk /dev/vgha/lvdrbd0;
        meta-disk internal;
    }
    volume 1 {
        device /dev/drbd1;
        disk /dev/vgha/lvdrbd1;
        meta-disk internal;
    }
    volume 2 {
        device /dev/drbd2;
        disk /dev/vgha/lvdrbd2;
        meta-disk internal;
    }
    on fiesta-ha1 {
        address 192.168.132.11:7789;
    }
    on fiesta-ha2 {
        address 192.168.132.12:7789;
    }
}
Initial config (where resource is the resource name, ha0 in our case):
sudo drbdadm create-md resource
sudo drbdadm up resource
sudo drbdadm primary --force resource
Main guide: http://www.drbd.org/users-guide/s-lvm-add-pv.html
As a first step, we will put the cluster in maintenance mode:
crm configure property maintenance-mode=true
On both nodes:
drbdadm adjust ha0 -d
drbdadm adjust ha0
This gives an error for the new volume. Run:
drbdmeta 2 v08 /dev/vgha/lvdrbd2 internal create-md
drbdadm adjust ha0 -d
The dlm_controld code comes with no configure script, so to make things work with our /opt/ha settings, we have patched the Makefiles. See the patch here: https://gist.github.com/bercab/4951303
Build and install:
git clone http://git.fedorahosted.org/git/dlm.git
cd dlm
wget https://gist.github.com/bercab/4951303/raw/71c9a5846099b72e2b208f300168c60b1421cf65/dlm.patch
patch -p1 < dlm.patch
make
sudo make install
Parameters can be passed to dlm_controld in two ways: (1) via the /etc/dlm/dlm.conf config file, or (2) as command-line arguments. In case (2), pacemaker primitives for controld can pass arguments. However, we prefer to put these parameters in /etc/dlm/dlm.conf:
enable_quorum_lockspace=0
log_debug=1
debug_logfile=1
# other disabled options, useful for debugging:
#daemon_debug=1
#foreground=1
#enable_startup_fencing=0
#enable_fencing=0
#enable_quorum_fencing=0
#fence_all dlm_stonith
Note: current docs about using the kernel dlm_controld use the -q option (or enable_quorum_fencing in dlm.conf). As we have the correct two-node settings in our corosync.conf, we shouldn't need this parameter.
For reference, use the manual page; there is much more info there than you will find by googling.
dlm_controld will be called by the controld resource agent, but to test it, it is good practice to first run it on the command line and debug possible errors.
As prerequisites, we need the dlm module loaded, configfs mounted and corosync running. Corosync is needed because dlm uses it as its messaging layer:
sudo modprobe configfs
sudo modprobe dlm
sudo mount -t configfs none /sys/kernel/config
sudo service corosync start
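Before running it by hand, it can be worth verifying those prerequisites actually hold (a sketch; it reports status instead of failing, so it is harmless on any machine):

```shell
# Check two of the dlm_controld prerequisites from the step above:
# the dlm kernel module must be loaded and configfs must be mounted.
dlm_loaded=no
configfs_mounted=no
grep -qw dlm /proc/modules 2>/dev/null && dlm_loaded=yes
grep -q configfs /proc/mounts 2>/dev/null && configfs_mounted=yes
echo "dlm module loaded: $dlm_loaded"
echo "configfs mounted: $configfs_mounted"
```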
Run dlm_controld --foreground --daemon_debug. The result should look like this:
root@fiesta-ha1:~# dlm_controld --foreground --daemon_debug
64545 config file log_debug = 1 cli_set 0 use 1
64545 config file debug_logfile = 1 cli_set 0 use 1
64545 dlm_controld 4.0.0 started
64545 our_nodeid 193243328
64545 found /dev/misc/dlm-control minor 57
64545 found /dev/misc/dlm-monitor minor 56
64545 found /dev/misc/dlm_plock minor 55
64545 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
64545 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
64545 set log_debug 1
64545 set recover_callbacks 1
64545 cmap totem.cluster_name = 'fiestaha'
64545 set cluster_name fiestaha
64545 /dev/misc/dlm-monitor fd 10
64545 cluster quorum 1 seq 4 nodes 1
64545 cluster node 193243328 added seq 4
64545 set_configfs_node 193243328 192.168.132.11 local 1
64545 cpg_join dlm:controld ...
64545 setup_cpg_daemon 12
64545 dlm:controld conf 1 1 0 memb 193243328 join 193243328 left
64545 fence work wait for cluster ringid
64545 dlm:controld ring 193243328:4 1 memb 193243328
64545 fence_in_progress_unknown 0 startup
64545 receive_protocol 193243328 max 3.1.1.0 run 0.0.0.0
64545 daemon node 193243328 prot max 0.0.0.0 run 0.0.0.0
64545 daemon node 193243328 save max 3.1.1.0 run 0.0.0.0
64545 set_protocol member_count 1 propose daemon 3.1.1 kernel 1.1.1
64545 receive_protocol 193243328 max 3.1.1.0 run 3.1.1.0
64545 daemon node 193243328 prot max 3.1.1.0 run 0.0.0.0
64545 daemon node 193243328 save max 3.1.1.0 run 3.1.1.0
64545 run protocol from nodeid 193243328
64545 daemon run 3.1.1 max 3.1.1 kernel run 1.1.1 max 1.1.1
64545 plocks 13
64545 receive_protocol 193243328 max 3.1.1.0 run 3.1.1.0
At this point, corosync should log something like this:
Feb 14 10:54:20 debug [QUORUM] lib_init_fn: conn=0x7f286c87b4f0
Feb 14 10:54:20 debug [QUORUM] got quorum_type request on 0x7f286c87b4f0
Feb 14 10:54:20 debug [QUORUM] got trackstart request on 0x7f286c87b4f0
Feb 14 10:54:20 debug [QUORUM] sending initial status to 0x7f286c87b4f0
Feb 14 10:54:20 debug [QUORUM] sending quorum notification to 0x7f286c87b4f0, length = 52
TODO: start drbd and mount filesystem
We should be able to start drbd and mount filesystem:
service drbd start
mount -t gfs2 /dev/drbd1 /mnt/fiestacomm
On syslog, you should see something like this:
Feb 14 10:59:54 fiesta-ha1 dlm_controld[23971]: 70232 dlm_controld 4.0.0 started
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.395327] GFS2: fsid=fiestaha:commfs: Trying to join cluster "lock_dlm", "fiestaha:commfs"
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.395484] dlm: Using TCP for communications
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399774] dlm: commfs: dlm_recover 1
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399810] dlm: commfs: add member 193243328
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399812] dlm: commfs: dlm_recover_members 1 nodes
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399815] dlm: commfs: generation 1 slots 1 1:193243328
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399817] dlm: commfs: dlm_recover_directory
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399834] dlm: commfs: dlm_recover_directory 0 entries
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399837] dlm: commfs: dlm_callback_resume 0
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399853] dlm: commfs: dlm_recover 1 generation 1 done: 0 ms
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399877] dlm: commfs: joining the lockspace group...
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399878] dlm: commfs: group event done 0 0
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.399878] dlm: commfs: join complete
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.900053] GFS2: fsid=fiestaha:commfs: first mounter control generation 0
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.900056] GFS2: fsid=fiestaha:commfs: Joined cluster. Now mounting FS...
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.918241] GFS2: fsid=fiestaha:commfs.0: jid=0, already locked for use
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.918243] GFS2: fsid=fiestaha:commfs.0: jid=0: Looking at journal...
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.920294] GFS2: fsid=fiestaha:commfs.0: jid=0: Done
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.920358] GFS2: fsid=fiestaha:commfs.0: jid=1: Trying to acquire journal lock...
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.920556] GFS2: fsid=fiestaha:commfs.0: jid=1: Looking at journal...
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.926916] GFS2: fsid=fiestaha:commfs.0: jid=1: Done
Feb 14 11:06:12 fiesta-ha1 kernel: [70610.926968] GFS2: fsid=fiestaha:commfs.0: first mount done, others may mount
Install gfs2-utils from git:
git clone http://git.fedorahosted.org/git/gfs2-utils.git
cd gfs2-utils
./autogen.sh
./configure --prefix=$PREFIX
make
sudo make install
Prepare GFS2 Filesystems:
TODO: explain parameters of mkfs.gfs2
Create filesystem: sudo mkfs.gfs2 -p lock_dlm -j 2 -t fiestaha:commfs /dev/drbd1:
root@fiesta-ha1:~# mkfs.gfs2 -p lock_dlm -j 2 -t fiestaha:commfs /dev/drbd1
This will destroy any data on /dev/drbd1.
It appears to contain: data
Are you sure you want to proceed? [y/n]y
Device: /dev/drbd1
Block size: 4096
Device size: 1.00 GB (262127 blocks)
Filesystem size: 1.00 GB (262125 blocks)
Journals: 2
Resource groups: 4
Locking protocol: "lock_dlm"
Lock table: "fiestaha:commfs"
UUID: 1074a7a0-3498-4553-09f7-97e4b5d95def
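A short annotated sketch of the mkfs.gfs2 arguments used above (the -t value is assembled here just to show its structure; all names match this guide):

```shell
# -p lock_dlm : use the DLM for cluster-wide locking (lock_nolock would be
#               for a single-node mount)
# -j 2        : number of journals; one is needed per node that mounts the fs
# -t <C>:<F>  : locktable; C must equal cluster_name in corosync.conf and
#               F must be a name unique within the cluster
cluster_name=fiestaha
fsname=commfs
echo "mkfs.gfs2 -p lock_dlm -j 2 -t ${cluster_name}:${fsname} /dev/drbd1"
```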
Primitives:
primitive p_fscomm ocf:heartbeat:Filesystem \
    params device="/dev/drbd/by-res/ha0/0" \
    directory="/opt/fiestacomm" fstype="gfs2"
clone fscomm_clone p_fscomm
colocation fscomm_ondrbd inf: fscomm_clone ms_drbd_ha0:Master
order fscomm_after_dlm_drbd inf: ms_drbd_ha0:promote dlm_clone:start fscomm_clone:start
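The constraints above reference dlm_clone and ms_drbd_ha0, which are defined elsewhere in the CIB. As a sketch, the dlm side might look like this (the resource names are assumptions; ocf:pacemaker:controld is the standard agent shipped with pacemaker):

```
primitive p_dlm ocf:pacemaker:controld \
    op monitor interval="60" timeout="60"
clone dlm_clone p_dlm meta interleave="true"
```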