Install and configure DRDB for network filesystem replication on Debian 8

Replication of a filesystem for security reasons: if one node fails, the other node is accessible.
To replicate a filesystem to another company headquarter, so each emplyee has access to his data locally and not through a public network. But if he goes to the other headquarter he has all his data, and again he can access locally.

As you can imagine, this kind of system is often used to build filesystems for cluster environment.

We have chosen to implement the solution with DRDB. It's main porpouse is (as other systems like this) High Availability and Disaster Recovery for file systems.

We implement the solution with Debian 8, but it should work also on Ubuntu.

Prerequisites

Before we start, here are the prerequisites:

At least 2 Debian servers.
Debian is installed as a minimal installation (not necessary at all if you know what are you doing on production systems) recommended guide https://www.howtoforge.com/tutorial/debian-8-jessie-minimal-server/
At least 2 Linux disks in each server: /dev/sda for the linux installation, /dev/sdb for the DRDB installation.

ATTENTION!!!: During installation, all data on disk /dev/sdb will be destroyed, so don't work on a disk with data inside.

DRBD Installation

In our example, I will use two nodes, wich are:

192.168.152.100 mysql1.local.vm
192.168.152.110 mysql2.local.vm

On all nodes, modify the file /etc/hosts as follows:

127.0.0.1 localhost
192.168.152.100 mysql1.local.vm mysql1
192.168.152.110 mysql2.local.vm mysql2

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Then execute the following commands to install DRDB:

apt-get update
apt-get -y upgrade
apt-get install drbd-utils

Configuration Primary/Secondary - Disaster Recovery

The main configuration file is /etc/drbd.conf wich looks like the following one:

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

By convention, /etc/drbd.d/global_common.conf contains the global and common sections of the DRBD configuration, whereas the .res files contain one resource in each section.

In our example we do a minimal setup that replicates the data on the two nodes. On each node do the following modification:

We'll start editing the file /etc/drbd.d/global_common.conf modify the default line from

global {
usage-count yes;
# minor-count dialog-refresh disable-ip-verification
}
...
net {
protocol C;
# protocol timeout max-epoch-size max-buffers unplug-watermark
# connect-int ping-int sndbuf-size rcvbuf-size ko-count
# allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
# after-sb-1pri after-sb-2pri always-asbp rr-conflict
# ping-timeout data-integrity-alg tcp-cork on-congestion
# congestion-fill congestion-extents csums-alg verify-alg
# use-rle
}
...

Now we'll create the configuration file /etc/drbd.d/r0.res for our resource. Create the file on all nodes and add this inside:

resource r0 {
  on mysql1.local.vm {
    device    /dev/drbd1;
    disk      /dev/sdb;
    address   192.168.152.100:7789;
    meta-disk internal;
  }
  on mysql2.local.vm {
    device    /dev/drbd1;
    disk      /dev/sdb;
    address   192.168.152.110:7789;
    meta-disk internal;
  }
}

What we have done until now is the following:

You "opt in" to be included in DRBD’s usage statistics with usage-count parameter.
Resources are configured to use fully synchronous replication with Protocol Cunless explicitly specified otherwise.
Our cluster consists of two nodes: mysql1 and mysql2.
We have a resource arbitrarily named r0 which uses /dev/sdb as the lower-level device, and is configured with internal meta data.
The resource uses TCP port 7789 for its network connections, and binds to the IP addresses 192.168.152.100 and 192.168.152.110 respectively.

On all nodes initialize the metadata with the following command:

drbdadm create-md r0

You should see something like this:

--== Thank you for participating in the global usage survey ==--
The server's response is:

you are the 2963th user to install this version
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
success

Next, we enable the resource and initialize the first replication run, only on first node, it should start replicating:

drbdadm up r0
drbdadm primary --force r0

To check if all is working well you can check the file /proc/drbd on both nodes and you shold see something like this:

Mysql1

root@mysql1:# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:54624 nr:0 dw:0 dr:55536 al:0 bm:3 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:5188060
[>....................] sync'ed: 1.1% (5064/5116)Mfinish: 0:17:21 speed: 4,964 (4,964) K/sec

Mysql2

root@mysql2:# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
ns:0 nr:17496 dw:17496 dr:0 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:5225188
[>....................] sync'ed: 0.4% (5100/5116)Mfinish: 0:29:41 speed: 2,916 (2,916) want: 5,160 K/sec

During the build phase you can notice the UpToDate/Inconsistent, it's correct because this is the first sync of data.

After the filsystem is synced this shloud change to UpToDate/UpToDate like in the following log:

root@mysql1:/home/sysop# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:5242684 nr:0 dw:0 dr:5243596 al:0 bm:320 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Now we have a new block device, called /dev/drbd1 that we can format with our preferred filesystem type. For example, if we want to format it in ext4 and mount it on /var/www we can simply do:

root@mysql1:/home/sysop# mkfs.ext4 /dev/drbd1
mke2fs 1.42.12 (29-Aug-2014)
Creazione del file system con 1310671 4k blocchi e 327680 inode
Etichetta del file system=ab3e18c9-e8cb-42c8-977a-ab79bdb18aea
Backup del superblocco salvati nei blocchi:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Allocating group tables: fatto
Scrittura delle tavole degli inode: fatto
Creating journal (32768 blocks): fatto
Scrittura delle informazioni dei super-blocchi e dell'accounting del file system: fatto

Then we can mount our filesystem:

mkdir /var/www
mount /dev/drbd1 /var/www

The /var/www directory is now mounted through the drbd system.

In this scenario, with a Primary/Secondary configuration, we have implemented a disaster recovery system.

In this case, if you try to mount the filesystem on the second node, you'll get an error:

root@mysql2:~# mount /dev/drbd1 /var/www/
mount: /dev/drbd1 is write-protected, mounting read-only
mount: mount /dev/drbd1 on /var/www failed: Tipo di supporto errato

This is normal and expected because of our configuration.

Now we can simluate a fail on mysql1, so power it down with poweroff command.

On mysql2 we can see what heappens:

Oct 5 13:52:14 mysql2 kernel: [13458.629215] drbd r0: PingAck did not arrive in time.
Oct 5 13:52:14 mysql2 kernel: [13458.629587] drbd r0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Oct 5 13:52:14 mysql2 kernel: [13458.629919] drbd r0: asender terminated
Oct 5 13:52:14 mysql2 kernel: [13458.629921] drbd r0: Terminating drbd_a_r0
Oct 5 13:52:14 mysql2 kernel: [13458.630028] drbd r0: Connection closed
Oct 5 13:52:14 mysql2 kernel: [13458.630035] drbd r0: conn( NetworkFailure -> Unconnected )
Oct 5 13:52:14 mysql2 kernel: [13458.630035] drbd r0: receiver terminated
Oct 5 13:52:14 mysql2 kernel: [13458.630036] drbd r0: Restarting receiver thread
Oct 5 13:52:14 mysql2 kernel: [13458.630037] drbd r0: receiver (re)started
Oct 5 13:52:14 mysql2 kernel: [13458.630041] drbd r0: conn( Unconnected -> WFConnection )

The mysql2 node detects that mysql1 is dead, and if we check the /proc/drbd:

root@mysql2:~# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:5457236 dw:5457236 dr:0 al:0 bm:320 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

we can see the status changed from Secondary/primary to Secondary/unknown. So now we do the trick, promoting mysql2 as primary. On mysql2 simply run:

drbdadm primary r0

let's check again /proc/drbd and see the magic...

root@mysql2:~# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:5457236 dw:5457236 dr:912 al:0 bm:320 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

As you can see our node now is primary and can be regularly mounted.

root@mysql2:~# mount /dev/drbd1 /var/www/
root@mysql2:~# df -h
File system Dim. Usati Dispon. Uso% Montato su
/dev/sda1 9,3G 1,4G 7,5G 16% /
udev 10M 0 10M 0% /dev
tmpfs 97M 4,6M 92M 5% /run
tmpfs 241M 0 241M 0% /dev/shm
tmpfs 5,0M 4,0K 5,0M 1% /run/lock
tmpfs 241M 0 241M 0% /sys/fs/cgroup
/dev/drbd1 4,8G 10M 4,6G 1% /var/www

Now we assume we wanna take up mysql1 again, if we start now we'll come to split brain, infact you can check log and see these errors:

Oct 5 14:26:04 mysql1 kernel: [ 7.760588] drbd r0: conn( StandAlone -> Unconnected )
Oct 5 14:26:04 mysql1 kernel: [ 7.760599] drbd r0: Starting receiver thread (from drbd_w_r0 [458])
Oct 5 14:26:04 mysql1 drbdadm[435]: adjust net: r0
Oct 5 14:26:04 mysql1 drbdadm[435]: ]
Oct 5 14:26:04 mysql1 kernel: [ 7.769318] drbd r0: receiver (re)started
Oct 5 14:26:04 mysql1 kernel: [ 7.769327] drbd r0: conn( Unconnected -> WFConnection )
Oct 5 14:26:04 mysql1 /etc/init.d/mysql[485]: MySQL PID not found, pid_file detected/guessed: /var/run/mysqld/mysqld.pid
Oct 5 14:26:04 mysql1 acpid: starting up with netlink and the input layer
Oct 5 14:26:04 mysql1 acpid: 1 rule loaded
Oct 5 14:26:04 mysql1 acpid: waiting for events: event logging is off
Oct 5 14:26:05 mysql1 kernel: [ 8.270578] drbd r0: Handshake successful: Agreed network protocol version 101
Oct 5 14:26:05 mysql1 kernel: [ 8.270581] drbd r0: Agreed to support TRIM on protocol level
Oct 5 14:26:05 mysql1 kernel: [ 8.270770] drbd r0: conn( WFConnection -> WFReportParams )
Oct 5 14:26:05 mysql1 kernel: [ 8.270771] drbd r0: Starting asender thread (from drbd_r_r0 [461])
Oct 5 14:26:05 mysql1 kernel: [ 8.272594] block drbd1: drbd_sync_handshake:
Oct 5 14:26:05 mysql1 kernel: [ 8.272597] block drbd1: self 242B364F4A5B9C68:525CC995A3CFBA2B:44A1DE193A6C6701:0000000000000004 bits:64463 flags:0
Oct 5 14:26:05 mysql1 kernel: [ 8.272598] block drbd1: peer 6903F6042F95F5FF:525CC995A3CFBA2A:44A1DE193A6C6700:0000000000000004 bits:4 flags:0
Oct 5 14:26:05 mysql1 kernel: [ 8.272599] block drbd1: uuid_compare()=100 by rule 90
Oct 5 14:26:05 mysql1 kernel: [ 8.272601] block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1
Oct 5 14:26:05 mysql1 kernel: [ 8.272692] drbd r0: meta connection shut down by peer.
Oct 5 14:26:05 mysql1 kernel: [ 8.272720] drbd r0: conn( WFReportParams -> NetworkFailure )
Oct 5 14:26:05 mysql1 kernel: [ 8.272722] drbd r0: asender terminated
Oct 5 14:26:05 mysql1 kernel: [ 8.272722] drbd r0: Terminating drbd_a_r0
Oct 5 14:26:05 mysql1 kernel: [ 8.279158] block drbd1: helper command: /sbin/drbdadm initial-split-brain minor-1 exit code 0 (0x0)
Oct 5 14:26:05 mysql1 kernel: [ 8.279173] block drbd1: Split-Brain detected but unresolved, dropping connection!
Oct 5 14:26:05 mysql1 kernel: [ 8.279197] block drbd1: helper command: /sbin/drbdadm split-brain minor-1
Oct 5 14:26:05 mysql1 kernel: [ 8.286125] block drbd1: helper command: /sbin/drbdadm split-brain minor-1 exit code 0 (0x0)
Oct 5 14:26:05 mysql1 kernel: [ 8.286144] drbd r0: conn( NetworkFailure -> Disconnecting )
Oct 5 14:26:05 mysql1 kernel: [ 8.286146] drbd r0: error receiving ReportState, e: -5 l: 0!
Oct 5 14:26:05 mysql1 kernel: [ 8.287009] drbd r0: Connection closed
Oct 5 14:26:05 mysql1 kernel: [ 8.287017] drbd r0: conn( Disconnecting -> StandAlone )
Oct 5 14:26:05 mysql1 kernel: [ 8.287018] drbd r0: receiver terminated
Oct 5 14:26:05 mysql1 kernel: [ 8.287019] drbd r0: Terminating drbd_r_r0

This is because, now we have two primary nodes, that is not possibile in a Primary/Secondary configuration. So assuming fresh data id on mysql2, we have to demote mysql1 to Secondary.

So on mysql1 run:

root@mysql1:~# drbdadm secondary r0
root@mysql1:~# drbdadm connect --discard-my-data r0

On mysql2 instead, wich is the split brain survivor, we have to execute:

root@mysql2:~# drbdadm connect r0

now you can check and see that all is rebuilding correctly, but now mysql1 is secondary and mysql2 is primary.

root@mysql1:~# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
srcversion: 1A9F77B1CA5FF92235C2213
1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
ns:0 nr:28224 dw:28224 dr:0 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:229628
[=>..................] sync'ed: 11.2% (229628/257852)K
finish: 0:01:04 speed: 3,528 (3,528) want: 6,600 K/sec