Configuring raid -- md0p2 or md1?
penguinist - Aug 05, 2016 11:18 AM EDT
Hello fellow LXers. Someone here certainly has the answer to my question of the day. I'm upgrading my archival storage, which will consist of a raid5 assembly of four drives organized as two equal-sized partitions. In the past I've done this in two different ways.

Alternative 1: Partition the raw drives each with two equal-sized partitions, then set up two independent raid5 arrays on those partitions using:

    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1
    mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd[bcde]2

Alternative 2: Partition the raw drives each with one large partition, set up a single raid5 array on them using:

    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1

and then split /dev/md0 itself into two partitions (md0p1 and md0p2).

So the question is: does one of these alternatives have advantages over the other, or are they equivalent in the end? I've "googled" this question using my favorite search engine and have come up with little opinion one way or the other.
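(For readers following along, here is a minimal sketch of how either layout can be inspected once the arrays are assembled. The device names /dev/md0 and /dev/md1 are taken from the alternatives above; the mdadm.conf path varies by distribution.)

```sh
# Show all md arrays the kernel currently knows about, including sync status.
cat /proc/mdstat

# Detailed view of one array: level, member devices, chunk size, state.
mdadm --detail /dev/md0

# Block-device tree: makes it easy to see whether you ended up with
# md0p1/md0p2 (a partitioned array) or md0/md1 (two separate arrays).
lsblk

# Persist the array definitions so they assemble at boot.
# (Path is distro-dependent; Debian/Ubuntu use /etc/mdadm/mdadm.conf.)
mdadm --detail --scan >> /etc/mdadm.conf
```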
jdixon - Aug 05, 2016 12:20 PM EDT
> So the question is, does one of these alternatives have advantages over the other or are they equivalent in the end?

That's a really good question. Unless there's a speed difference involved (which you'd have to test by benchmarking both configurations), I'd think they would be equivalent. I'd be pleased to be corrected by a more knowledgeable reader. As with you, a search doesn't turn up much information.
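(For what it's worth, a minimal sketch of the kind of benchmark such a comparison calls for, assuming bonnie++ is installed and the filesystem under test is mounted at /mnt/test; the mount point and the 32G size here are illustrative, not taken from the thread.)

```sh
# Sequential write/read throughput test with bonnie++.
#   -d  directory on the filesystem under test
#   -s  total file size (commonly at least 2x RAM, to defeat the page cache)
#   -n  0 skips the small-file creation tests
#   -u  user to run as when invoked by root
bonnie++ -d /mnt/test -s 32G -n 0 -u root
```

Run the same invocation against both layouts, keeping the drives, filesystem options, and mount options identical, so any difference can be attributed to the array layout alone.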
mbaehrlxer - Aug 06, 2016 1:37 PM EDT
smaller raid partitions complete faster if you ever need to replace a disk and resync. normally, it doesn't matter, but if you need to reboot during the resync then any incomplete resync will start over. if you have multiple raid partitions, then any partition that completed its resync before the reboot will be fine, and you'll only need to start over with the ones not yet completed.

in my case i have a single btrfs partition that is spread over 4 md raid devices on two 3TB disks.

greetings, eMBee.
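(As a side note, resync progress can be watched and, within limits, sped up through the standard md interfaces. A minimal sketch follows; /dev/md0 and the speed values are illustrative.)

```sh
# Watch resync/rebuild progress for all arrays, refreshed every few seconds.
watch -n 5 cat /proc/mdstat

# Per-array sync state and progress are also exposed via sysfs.
cat /sys/block/md0/md/sync_action
cat /sys/block/md0/md/sync_completed

# The md resync rate is bounded by these kernel tunables (KB/s per device).
# Raising the minimum lets a resync finish sooner at the cost of
# foreground I/O performance while it runs.
echo 50000  > /proc/sys/dev/raid/speed_limit_min
echo 500000 > /proc/sys/dev/raid/speed_limit_max
```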
penguinist - Aug 09, 2016 10:23 AM EDT
We have some test results. First, please permit me to complain about how long it takes to run performance experiments on a 12TB array: basically you get one experiment per day when you account for the time it takes to build a raid5 array. Ok, I have that off my shoulders; now let's get to the results.

First test: Partition four 4TB drives with single partitions and build a 12TB array, then partition that array into two equal 6TB parts and format them with ext4. This gives us a /dev/md0p1 and a /dev/md0p2. For those interested in the details, here are the commands we used to accomplish this:

    parted /dev/sdb mklabel gpt mkpart primary 1M 4001GB
    parted /dev/sdc mklabel gpt mkpart primary 1M 4001GB
    parted /dev/sdd mklabel gpt mkpart primary 1M 4001GB
    parted /dev/sde mklabel gpt mkpart primary 1M 4001GB

    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1

    # now we have a big array and we proceed to partition it into two pieces
    parted /dev/md0 mklabel gpt
    parted /dev/md0 mkpart primary 3072s 11720659967s
    parted /dev/md0 mkpart primary 11720659968s 23441313791s

    mkfs.ext4 /dev/md0p1
    mkfs.ext4 /dev/md0p2

    mount /dev/md0p1 /data
    mount /dev/md0p2 /back

Running the bonnie++ performance benchmark on this gives us these results:

    bonnie++ -b -d /data -s 128G -n 0 -m after_init -u root

    Version 1.96       ------Sequential Output------ --Sequential Input- --Random-
    Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
    after_init   128G    808  96 262866  40 150010  25  3354  83 422299  38 130.1  15
    Latency             14473us     354ms     756ms   27122us     115ms     464ms

So we have these results: write at 263 MB/s, read at 422 MB/s.

Second test: Partition four 4TB drives each with two partitions and build two 6TB arrays from them, then format those arrays with ext4. This gives us a /dev/md0 and a /dev/md1. For those interested in the details, here are the commands we used to accomplish this:

    parted /dev/sdb --script mklabel gpt mkpart primary 2048s 3907018751s mkpart primary 3907018752s 7814033407s unit s p
    parted /dev/sdc --script mklabel gpt mkpart primary 2048s 3907018751s mkpart primary 3907018752s 7814033407s unit s p
    parted /dev/sdd --script mklabel gpt mkpart primary 2048s 3907018751s mkpart primary 3907018752s 7814033407s unit s p
    parted /dev/sde --script mklabel gpt mkpart primary 2048s 3907018751s mkpart primary 3907018752s 7814033407s unit s p

    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1
    mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd[bcde]2

    mkfs.ext4 /dev/md0
    mkfs.ext4 /dev/md1

    mount /dev/md0 /data
    mount /dev/md1 /back

Running the bonnie++ performance benchmark on this gives us these results:

    Version 1.96       ------Sequential Output------ --Sequential Input- --Random-
    Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
    after_init   128G   1018  95 280598  42 156776  25  4427  87 480970  35 146.1   8
    Latency             13782us     254ms     672ms   18787us     111ms     517ms

So we have these results: write at 281 MB/s, read at 481 MB/s.

Conclusions: The smaller-array configuration (pre-partition the raw drives) wins the performance race by a noticeable margin, resulting in a 6.7% improvement with writes and a 13.9% improvement with reads. Notice that in both tests the mdadm defaults were used, with no attempt to optimize or fine-tune the parameters. (I'll leave those tests to the next investigator.)
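(For the "next investigator", a minimal sketch of the kind of tuning the post leaves open: choosing an explicit chunk size and telling ext4 about the RAID geometry. The 512K chunk and the derived stride/stripe-width values below are assumptions for a fresh 4-disk RAID5 with 4K filesystem blocks, not measured recommendations.)

```sh
# Create the array with an explicit chunk size instead of the default.
mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=512 /dev/sd[bcde]1

# Align ext4 to the RAID geometry:
#   stride       = chunk size / block size         = 512K / 4K = 128
#   stripe-width = stride * data disks (4 - 1 = 3) = 384
mkfs.ext4 -E stride=128,stripe-width=384 /dev/md0

# A larger stripe cache can help RAID5 write throughput;
# memory use grows with the value, so benchmark before committing to it.
echo 8192 > /sys/block/md0/md/stripe_cache_size
```

Rerunning the same bonnie++ workload after each change would show how much of the gap between the two layouts is down to the layout itself versus the default parameters.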