mdadm
May 19, 2014 (last modified November 12, 2016)
Setup RAID1 on Existing OS
My approach to RAID is that I just want a redundant drive in case one fails and I want to use it for storage and not as an OS. So, this setup will be for creating a RAID1 array on an existing Linux OS.
1. Find Your /dev's
First, you need to figure out what your devices are:
$ lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0   70G  0 disk
├─sda1   8:1    0   50G  0 part
└─sda2   8:2    0   20G  0 part
sdb      8:16   0  500G  0 disk
sdc      8:32   0  500G  0 disk
Mine are sdb and sdc.
2. Create Partitions on Your Drives
Next, create partition tables:
$ fdisk /dev/sdb
Command (m for help): n
# then, hit enter a bunch of times
Command (m for help): w
Then, do the same for sdc.
3. Create the RAID Drive
Next, create the RAID device as /dev/md0, or whatever device name you want to use:
$ mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/sdb1 /dev/sdc1
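Creating the array kicks off an initial sync of the mirror in the background. The array is usable right away, and you can watch the progress with:
$ cat /proc/mdstat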
4. Edit the mdadm.conf
Save the array details to mdadm.conf:
$ mdadm -Es | grep md0 >> /etc/mdadm/mdadm.conf
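The line that gets appended should look something like this (the UUID and name below are just placeholders; yours will be different):
ARRAY /dev/md0 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx name=myserver:0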
5. Update initramfs
$ update-initramfs -u
6. Create the Filesystem
$ mkfs.ext4 /dev/md0
7. Add the Drive to fstab
Add the drive to fstab so it will auto-mount on boot:
/dev/md0 /mnt/bup ext4 defaults,nobootwait,noatime 0 2
Change "/mnt/bup" to whatever path you want the drive to be mounted at. If you are unfamiliar with fstab then the options above should be sufficient. Otherwise, you can look at the mount manual page to see what other options are there.
8. Mount the Drive
$ mount -a
Recovery
Force Mount After Failure
If a drive fails and, after you reboot the PC, the array won't mount, you can force it to start in its degraded state via:
$ mdadm --manage /dev/md0 --run
Ideally you should never do this; you should replace the failed drive. In practice, however, there may be a need.
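Also note that --run only starts the degraded array; you still need to mount it afterwards, e.g. using the fstab entry from step 7:
$ mount -a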
Adding a New Drive
If a drive fails, power off the machine and replace the drive. How do you find which drive has failed? Since I'm using RAID 1, I just boot with one drive at a time and check which one makes this command fail:
$ mdadm -Ds /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
After replacing the drive and booting, you will need to re-partition the new drive and add it to your array. Here is an example for a RAID1 array of sdb and sdc where sdc has failed:
$ fdisk /dev/sdc
Run through the prompts: it should be "n" for new, then "p" for primary, then "1" for the partition number, then hit "enter" a bunch of times until you are back at the "Command (m for help):" prompt. Here you can type "p" to show that the partition was created. Finally, hit "w" and "enter" to save the new partition.
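As a rough sketch, the session looks something like this (the exact prompts vary between fdisk versions):
$ fdisk /dev/sdc
Command (m for help): n
Partition type: p
Partition number (1-4, default 1): 1
# hit enter to accept the default first and last sectors
Command (m for help): p
Command (m for help): w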
Next, add the drive to the array:
$ mdadm /dev/md0 -a /dev/sdc
Finally, check the progress of re-syncing:
$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdc[2] sdb[0]
      2930135360 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.5% (17077248/2930135360) finish=257.4min speed=188574K/sec

unused devices: <none>
Testing Failure
To test your recovery process after you have set up your RAID array:
1. add some data to your RAID drive
2. poweroff the machine and unplug one of the drives
3. poweron the machine and force mount:
$ mdadm --manage /dev/md0 --run
4. add some more data to your RAID drive
5. poweroff the machine and plug back in the other drive
6. poweron the machine and sync the drives (see the sketch after this list)
7. poweroff the machine and unplug the other drive, the one that was never unplugged before
8. poweron the machine and force mount:
$ mdadm --manage /dev/md0 --run
9. check to ensure that all the data you created at step 4 was synced at step 6
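For step 6, a minimal sketch of what "sync the drives" means in practice (the device name here is just an example; check which member is actually missing first):
$ mdadm -Ds /dev/md0            # see which device is missing
$ mdadm /dev/md0 -a /dev/sdc1   # re-add the drive you plugged back in (adjust the name)
$ cat /proc/mdstat              # watch the recovery progress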
Now, I need to double check that this actually works. When I did it the first time I had a bunch of trouble and ended up running mdadm -a /dev/md0 /dev/sdc.
Monitoring
You will likely want to be notified somehow if a drive fails. mdadm has good built-in support for emailing alerts. However, you may want to check it manually, or you may not be able to send email from your server (e.g. you're behind an ISP that blocks SMTP).
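If your server can send email, a minimal sketch of the built-in alerting (the address is a placeholder) is to add a MAILADDR line to /etc/mdadm/mdadm.conf and make sure the monitor is running:
MAILADDR you@example.com
$ mdadm --monitor --scan --test --daemonise
The --test flag sends a test alert for each array at startup, which is a quick way to confirm the email actually gets through. On Debian/Ubuntu the mdadm package normally starts the monitor for you at boot.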
To check the status run:
$ mdadm -Ds /dev/md0
If everything is normal you will see:
State : clean
If a drive has failed you will see:
State : clean, degraded
It will also show which drive has failed:
    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       2       8       32        1      active sync   /dev/sdc
You can also get information from /proc/mdstat, or you can check a specific drive with "mdadm -E /dev/sdb".
From what I understand, this information may need to be refreshed periodically by actually checking the array. There is what seems to be a good answer on Stack Exchange that shows a few tips. To check the array while it's running you can run:
$ echo check > /sys/block/md0/md/sync_action
I have not tried this, but there is an additional reference by Thomas Krenn. I've also seen this:
$ /usr/share/mdadm/checkarray --cron --all --idle --quiet
There seems to be some good information here.
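If you do kick off a check that way, my understanding is that you can watch its progress in /proc/mdstat and read the result from sysfs when it finishes:
$ cat /proc/mdstat                     # shows a "check" progress line while it runs
$ cat /sys/block/md0/md/mismatch_cnt   # non-zero means inconsistent sectors were found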
The person on the Stack Exchange answer also said that they run the following command in a cron job once a month:
$ ionice -c3 tar c /dir/of/raid/filesystem > /dev/null
The author says,
It’s not a thorough check of the drive itself, but it does force the system to periodically verify that (almost) every file can be read successfully off the disk. Yes, some files are going to be read out of memory cache instead of disk. But I figure that if the file is in memory cache, then it’s successfully been read off disk recently, or is about to be written to disk, and either of those operations will also uncover drive errors.
The author goes on to say that in the three years of using a RAID array it was this command that caught a bad drive, but warns that if you have a large RAID array it will take a long time, estimating 6 hrs per terabyte.
I also have not tested this cron idea.
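For reference, the crontab entry for that would look something like this (untested here; the path is wherever your array is mounted, /mnt/bup in my case):
# 03:00 on the 1st of every month: read every file on the array and throw the output away
0 3 1 * * ionice -c3 tar c /mnt/bup > /dev/null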
Statistics
To get statistics such as drive status, re-syncing status, etc, you have a few ways:
$ mdadm -Ds
$ mdadm -D /dev/md0
$ cat /proc/mdstat
$ mdadm --detail /dev/md0
Resyncing
To resync a drive, I *think* you do:
$ umount /dev/md0
$ mdadm --stop /dev/md0
$ mdadm --assemble --run --force --update=resync /dev/md0 /dev/sdb1 /dev/sdc1
mdadm: /dev/md0 has been started with 2 drives.
But, it should re-sync automatically.
Reference: http://www.thomas-krenn.com/en/wiki/Mdadm_recovery_and_resync#Resync
Further Reading
- Thread: Setting up RAID 1... after install
- you don't need nodiratime if using noatime (link)
- Convert a single drive system to RAID
- same hard drive on the same controller, but different cylinder/head/sector
- 5 Tips To Speed Up Linux Software Raid Rebuilding And Re-syncing
- Mdadm Cheat Sheet
- Testing your software RAID: Be prepared
- Setting Up RAID using mdadm on Existing Drive
- Thread: MDADM -> which drive has failed?
- How to check mdadm raids while running?
- Articles by Zack Reed on mdadm
- #45 How do I know if my hard drive is failing?
- #40 UPS - Protect Your Files
- Mounting from a Live CD
- mdadm Recovery and Resync
- using foremost:
  http://unix.stackexchange.com/a/65112
  https://help.ubuntu.com/community/DataRecovery#Foremost
- RAID Recovery
And, here is a jumble of links I found when trying to debug that I will keep for future reference:
- http://www.linuxquestions.org/questions/linux-newbie-8/cannot-add-replacement-drive-mdadm-not-large-enough-to-join-array-4175473258/
- http://realtechtalk.com/mdadm_devsdb1_not_large_enough_to_join_array_solution-1290-articles
- http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
- http://markus.dresch.net/post/3917167813/changing-a-failed-disk-in-software-raid-1
- http://askubuntu.com/a/240069