How to create a software RAID-1 array with mdadm on Linux

Redundant Array of Independent Disks (RAID) is a storage technology that combines multiple hard disks into a single logical unit to provide fault tolerance and/or improve disk I/O performance. Depending on how data is stored across the disks (e.g., with striping, mirroring, parity, or a combination thereof), different RAID levels are defined (e.g., RAID-0, RAID-1, RAID-5, etc.). RAID can be implemented either in software or with a hardware RAID card. On modern Linux, basic software RAID functionality is available by default.

In this post, we'll discuss the software setup of a RAID-1 array (also known as a "mirroring" array), where identical data is written to the two devices that form the array. While it is possible to implement RAID-1 with partitions on a single physical hard drive (as with other RAID levels), it won't be of much use if that single hard drive fails. In fact, that's why most RAID levels normally use multiple physical drives to provide redundancy. If any single drive fails, the virtual RAID block device should continue functioning without issues, allowing us to replace the faulty drive without significant production downtime and, more importantly, with no data loss. However, RAID does not replace the need for periodic backups to external storage.

Since the actual storage capacity (size) of a RAID-1 array is the size of its smallest drive, you will normally (if not always) find two identical physical drives in a RAID-1 setup.

Installing mdadm on Linux

The tool that we are going to use to create, assemble, manage, and monitor our software RAID-1 is called mdadm (short for multiple device admin). On Linux distros such as Fedora, CentOS, RHEL or Arch Linux, mdadm is available by default. On Debian-based distros, mdadm can be installed with aptitude or apt-get.

Fedora, CentOS or RHEL

As mdadm comes pre-installed, all you have to do is start the RAID monitoring service and configure it to auto-start upon boot:

# systemctl start mdmonitor
# systemctl enable mdmonitor

For CentOS/RHEL 6, use these commands instead:

# service mdmonitor start
# chkconfig mdmonitor on

Debian, Ubuntu or Linux Mint

On Debian and its derivatives, mdadm can be installed with aptitude or apt-get:

# aptitude install mdadm

On Ubuntu, you will be asked to configure the postfix MTA for sending out email notifications (as part of RAID monitoring). You can skip this for now.

On Debian, the installation starts with an explanatory message that helps us decide whether or not we are going to install the root filesystem on a RAID array. What we need to enter on the next screen depends on this decision, so read it carefully.

Since we will not use our RAID-1 for the root filesystem, we will leave the answer blank.

When asked whether we want to start (reassemble) our array automatically during each boot, choose "Yes". Note that we will also need to add an entry to the /etc/fstab file later so that the array is properly mounted during the boot process.

Partitioning Hard Drives

Now it's time to prepare the physical devices that will be used in our array. For this setup, I have plugged in two 8 GB USB drives that have been identified as /dev/sdb and /dev/sdc from dmesg output:

# dmesg | less
[   60.014863] sd 3:0:0:0: [sdb] 15826944 512-byte logical blocks: (8.10 GB/7.54 GiB)
[   75.066466] sd 4:0:0:0: [sdc] 15826944 512-byte logical blocks: (8.10 GB/7.54 GiB)
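
Before partitioning anything, it is worth confirming that /dev/sdb and /dev/sdc really are the new drives and not disks holding data you care about. For example (both tools are standard; adjust the device names to match your system):

# lsblk -o NAME,SIZE,TYPE,MOUNTPOINT /dev/sdb /dev/sdc
# fdisk -l /dev/sdb /dev/sdc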

We will use fdisk to create a primary partition on each disk that will occupy its entire size. The following steps show how to perform this task on /dev/sdb, and assume that this drive hasn't been partitioned yet (otherwise, we can delete the existing partition(s) to start off with a clean disk):

# fdisk /dev/sdb

Press 'p' to print the current partition table.

(If one or more partitions are found, they can be deleted with the 'd' command; the 'w' command then writes the changes to disk.)

Since no partitions are found, we will create a new partition ['n'] of type primary ['p'], assign partition number ['1'] to it, and then indicate its size. You can press Enter to accept the proposed default values for the first and last sectors, or enter values of your own choosing.

Now repeat the same process for /dev/sdc.
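
If you prefer to script the partitioning instead of walking through fdisk interactively, parted can do the same thing non-interactively. This is only a sketch under the assumption that the disks can be wiped; it destroys any existing partition table on the target device:

# parted -s /dev/sdb mklabel msdos                # new empty MBR partition table
# parted -s /dev/sdb mkpart primary 0% 100%       # one primary partition spanning the disk
# parted -s /dev/sdb set 1 raid on                # mark partition 1 as RAID

Run the same three commands against /dev/sdc as well.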

If we have two drives of different sizes, say 750 GB and 1 TB for example, we should create a primary partition of 750 GB on each of them, and use the remaining space on the bigger drive for another purpose, independent of the RAID array.

Create a RAID-1 Array

Once you are done with creating the primary partition on each drive, use the following command to create a RAID-1 array:

# mdadm -Cv /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdc1

Where:

  • -Cv: creates an array and produces verbose output.
  • /dev/md0: the name of the array device.
  • -l1 (l as in "level"): indicates that this will be a RAID-1 array.
  • -n2: indicates that we will add two devices to the array, namely /dev/sdb1 and /dev/sdc1.

The above command is equivalent to:

# mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1

Alternatively, if you want to add a spare device in order to replace a faulty disk in the future, you can add '--spare-devices=1 /dev/sdd1' to the above command.
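
For reference, the full creation command with a spare would look like the sketch below (assuming /dev/sdd1 is a third partition prepared in the same way as the other two):

# mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1 --spare-devices=1 /dev/sdd1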

Answer "y" when prompted if you want to continue creating an array, then press Enter:

You can check the progress with the following command:

# cat /proc/mdstat
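
Since the initial synchronization of the mirror can take a while, it is handy to let the output refresh automatically (watch is a standard utility; the 5-second interval is arbitrary):

# watch -n 5 cat /proc/mdstat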

Another way to obtain more information about a RAID array (both while it's being assembled and after the process is finished) is:

# mdadm --query /dev/md0
# mdadm --detail /dev/md0 (or mdadm -D /dev/md0)

Of the information provided by 'mdadm -D', perhaps the most useful is the state of the array. The active state means that there is currently I/O activity happening. Other possible states are clean (all I/O activity has been completed), degraded (one of the devices is faulty or missing), resyncing (the system is recovering from an unclean shutdown such as a power outage), and recovering (a new drive has been added to the array, and data is being copied from the other drive onto it), to name the most common ones.

Formatting and Mounting a RAID Array

The next step is formatting (with ext4 in this example) the array:

# mkfs.ext4 /dev/md0

Now let's mount the array, and verify that it was mounted correctly:

# mount /dev/md0 /mnt
# mount
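
As an additional check (not strictly required), df shows the size and usage of the freshly created filesystem on the array:

# df -h /mnt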

Monitor a RAID Array

The mdadm tool comes with RAID monitoring capability built in. When mdadm is set to run as a daemon (which is the case with our RAID setup), it periodically polls existing RAID arrays, and reports on any detected events via email notification or syslog logging. Optionally, it can also be configured to invoke contingency commands (e.g., retrying or removing a disk) upon detecting any critical errors.

By default, mdadm scans all existing partitions and MD arrays, and logs any detected event to /var/log/syslog. Alternatively, you can specify the devices and RAID arrays to scan in mdadm.conf, located at /etc/mdadm/mdadm.conf (Debian-based) or /etc/mdadm.conf (Red Hat-based), in the following format. If mdadm.conf does not exist, create it.

DEVICE /dev/sd[bcde]1 /dev/sd[ab]1

ARRAY /dev/md0 devices=/dev/sdb1,/dev/sdc1
ARRAY /dev/md1 devices=/dev/sdd1,/dev/sde1
.....

# optional email address to notify events
MAILADDR your@email.com
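
Rather than typing the ARRAY lines by hand, you can have mdadm generate them from the currently running arrays; a common approach (adjust the path to your distro's mdadm.conf) is:

# mdadm --detail --scan >> /etc/mdadm/mdadm.conf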

After modifying the mdadm configuration, restart the mdadm daemon:

On Debian, Ubuntu or Linux Mint:

# service mdadm restart

On Fedora, CentOS/RHEL 7:

# systemctl restart mdmonitor

On CentOS/RHEL 6:

# service mdmonitor restart

Auto-mount a RAID Array

Now we will add an entry to /etc/fstab so that the array is mounted at /mnt automatically during boot (you can specify any other mount point):

# echo "/dev/md0 /mnt ext4 defaults 0 2" >> /etc/fstab

To verify that the mount works correctly, we now unmount the array, restart mdadm, and remount. We can see that /dev/md0 has been mounted as per the entry we just added to /etc/fstab:

# umount /mnt
# service mdadm restart (on Debian, Ubuntu or Linux Mint)
  or systemctl restart mdmonitor (on Fedora, CentOS/RHEL 7)
  or service mdmonitor restart (on CentOS/RHEL 6)
# mount -a

Now we are ready to access the RAID array via the /mnt mount point. To test the array, we'll copy the /etc/passwd file (any other file will do) into /mnt:
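
# cp /etc/passwd /mnt
# ls -l /mnt

The ls output should show the copied passwd file (along with the lost+found directory that mkfs.ext4 created).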

On Debian, we need to tell the mdadm daemon to automatically start the RAID array during boot by setting the AUTOSTART variable to true in the /etc/default/mdadm file:

AUTOSTART=true
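
On top of that, it is often suggested (treat this as a general hint rather than part of the original walkthrough) to let debconf manage these settings and to refresh the initramfs so that mdadm picks up configuration changes at early boot:

# dpkg-reconfigure mdadm
# update-initramfs -u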

Simulating Drive Failures

We will simulate a faulty drive and remove it with the following commands. Note that in a real-life scenario it is not necessary to mark a device as faulty first, as it will already be in that state in the event of a failure.

First, unmount the array:

# umount /mnt

Now, notice how the output of 'mdadm -D /dev/md0' indicates the changes after performing each command below.

# mdadm /dev/md0 --fail /dev/sdb1      # Marks /dev/sdb1 as faulty
# mdadm --remove /dev/md0 /dev/sdb1    # Removes /dev/sdb1 from the array

Afterwards, when you have a new drive for replacement, re-add it to the array:

# mdadm /dev/md0 --add /dev/sdb1

The data then immediately starts being rebuilt onto /dev/sdb1.
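
You can follow the rebuild progress with either of the following commands (the Rebuild Status line in the 'mdadm -D' output is only shown while the rebuild is running):

# cat /proc/mdstat
# mdadm -D /dev/md0 | grep -E "State|Rebuild"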

Note that the steps detailed above apply to systems with hot-swappable disks. If you do not have such technology, you will also have to stop the current array and shut down your system first in order to replace the disk:

# mdadm --stop /dev/md0
# shutdown -h now

Then install and partition the new drive, and re-assemble the array. Since only one of the original members is present at this point, the array has to be started degraded before the new drive can be added:

# mdadm --assemble --run /dev/md0 /dev/sdc1
# mdadm /dev/md0 --add /dev/sdb1

Hope this helps.

Gabriel Cánepa is a GNU/Linux sysadmin and web developer from Villa Mercedes, San Luis, Argentina. He works for a worldwide leading consumer product company and takes great pleasure in using FOSS tools to increase productivity in all areas of his daily work. When he's not typing commands or writing code or articles, he enjoys telling bedtime stories with his wife to his two little daughters and playing with them, the great pleasure of his life.

23 thoughts on “How to create a software RAID-1 array with mdadm on Linux”

  1. Thx again to Mr. Canepa for another good tutorial. As on other occasions, I want to make some observations about this very important topic:

    1. The most important thing for any mdraid user to know is that no kind of RAID is a replacement for backups (as the author also mentions).
    2. mdraid has a design BUG (my personal opinion), and the user must know about it if he does not want to have BAD events:
    - if the underlying disks have some bad data and this data is still readable (and this is not uncommon these days), mdraid has a 50% chance of reading this BAD data, even though it has redundant data on the other member disk (in my opinion, the mdraid code should read the same data from all the disk members, deliver it to the applications only if it is identical on all disks, and otherwise deliver an error to the applications);

    - so, if this can happen, over time the mdraid device can end up with some corrupt data on it (I have seen this in 2 cases).

    3. It is wise to force a re-build of any mdraid device from time to time with something like this:

    echo check >> /sys/block/mdX/md/sync_action

    4. It is also wise to check the mdraid disk members with smartctl (aptitude install smartmontools):
    smartctl -t short /dev/sdX - daily
    smartctl -t long /dev/sdX - weekly

    5. If one of the disk members has a problem (some data that is not readable on the first attempt), the whole array will be very slow on any input/output operations, so it is recommended to remove this faulty drive from the array as soon as possible.

    6. Some disks have TLER (Time Limited Error Recovery) / ERC (Error Recovery Control) / CCTL (Command Completion Time Limit): the ability for drives to limit how long they spend on internal error recovery so that the RAID layer can handle the error instead. It is good for any mdraid setup to enable this, since it improves your chances of disks surviving in RAID setups, with this:

    smartctl -l scterc,70,70 /dev/sdX

    7. Because of point no. 2, in the last 3 years I have been using an mdraid setup only for the OS (/, /boot, /var, and so on), because it is easy to detect these corruption events there (via md5sum/sha256), but not for user data. For any user data I use ZFS (mirror), because I am sure that this kind of event (e.g. what I read now is not identical to what I wrote in the past) is detected (all data is checksummed and stored on disk at write time, and on every read these checksums are verified) AND repaired on the fly (mdraid cannot do this).

    • Iulian, you make some good points - and some bad ones.

      We all agree that raid is not an alternative to backup, but it can never be stressed enough!

      md raid expects disks to either return the correct data, or to return an error message. There is no need for checksumming or reading both halves of a raid mirror - because disks /do/ either return the correct data or return an error message. There is a large amount of checksumming and ERC data on disks, giving vastly greater protection against "undetected read error" than even the sha-256 checksum option in ZFS. So if the disk returns data that it claims is valid, but is not, then there is something seriously wrong with the firmware or the hardware, and you can't trust anything in your system - it's time to dig out the backups. Checksumming at the filesystem level can make it a little easier to spot such events, but since they are rarer than hens' teeth, there is no need to worry. There are many reasons why one might want an advanced filesystem like ZFS (or btrfs, which is now a more natural choice on Linux) - detecting "undetected read errors" is not one of them.

      Sending "check" to sync_action does a check, not a re-build - and you are correct that it is useful from time to time (your distro may do this already). A "check" will read both halves of the mirror and compare them - a "re-build" will read one half and use it to overwrite the other half (which is only read if the first read fails).

    • Hello, I have a problem with auto-mount. After I reboot my server, the device I created is gone and the mount fails, so I have to manually run mdadm --create again and mount it for it to work. I can't find any information about this error.

  2. Creating a 'RAID' partition is not required if you're going to use the full disk; as a matter of fact, your RAID will perform better. Your command to initialize the RAID device thus becomes:
    mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

    • ..... and if you use the whole disk and one of the disks breaks, you will need a new disk with at least the same capacity as the old one. In such a case it is recommended to make only one partition whose size is the capacity of the whole disk minus 20 MB. This is necessary because you could get nominally identical disks whose capacities differ by a few MB.

  3. Thanks Iulian and William for your comments! As with the other topics I've written about, it is difficult to summarize all aspects of these subjects in a ~1000-word article - and that is where the comments come in handy to provide further details, tips and understanding. Thanks again!

    • @Rick,
      Not at all :).
      Just be careful and don't use the disk where your OS is installed to create a RAID array.
      Actually, as I said in the article, to implement any level of software RAID, you should not count on different partitions of the same physical disk. Even though it's possible, if the hard drive fails for some reason, there goes your RAID as well.
      So my advice would be, if you have a spare disk and want to test (don't do this in a production environment), you can partition it in 2 parts of identical size and use those 2 partitions to create your RAID1. On the other hand, if you're moving towards implementing a RAID1 in a production environment, get yourself at least 2 drives of the same size to implement it.
      If you have any further questions or would like help with this, don't hesitate to contact me at my email address: gacanepa[at]gmail(dot)com.
      Best regards,
      Gabriel

  4. I have three 1TB drives in a used server I bought to experiment with. The goal of a Raid installation would be to protect my main OS installation (Mint 17). If I establish a Raid array on Drives 2 and 3 how do I achieve the goal of protecting Drive 1?

    • Your setup requires a detailed explanation that cannot be adequately addressed in the comments section. Feel free to send me an email and I'll be more than glad to assist you.

  5. Great tutorial Gabriel. I don't like hardware RAID and this tutorial makes setting up a soft RAID-1 so easy. Thanks for your time.

    • @gerard,
      I am glad you liked this article. Thank you for your kind words. Stay tuned - I am working on an upcoming article on RAID10 now.

      • If you are writing about RAID-10, you need to talk about the flexibility of Linux raid10 and how it differs from traditional RAID-10 (which md raid also supports, of course). I recently read a blog in which the author went through various combinations of RAID levels and partitioning schemes in order to improve on the speed of RAID-1, simply because he didn't know of the existence of "raid10,f2". I'm sure you'll get it right at the start, however!

  6. Instructions were very simple and easy to follow. However, after a reboot (Ubuntu 14.04), my RAID-1 failed to mount, so I skipped mounting and it booted normally. The RAID had been assigned a new (random?) device name and was now called md127 rather than md0. How can I fix this problem?

  7. I think I just answered my own question. I found a post here (http://ubuntuforums.org/showthread.php?t=1764861). It seems mdadm is inexplicably renaming your array if it is not explicitly named in mdadm.conf. Post #6 at that link was exactly what I needed. I got the UUID from mdadm -D /dev/md127, then stopped the array, modified mdadm.conf and updated initramfs. Rebooted and array is mounting as expected and where I told it to (/home/user/RAID1).

  8. What partition type did you use? I thought you were supposed to use Linux RAID Autodetect (type fd), but your screenshots don't show any partition type being selected.
