How to make remote incremental backup of LUKS-encrypted disk/partition

Some of us have hard drives at home or on a VPS encrypted with Linux Unified Key Setup (LUKS) for security reasons, and these drives can quickly grow to tens or hundreds of GBs in size. So while we enjoy the security of our LUKS device, we may start to think about a remote backup solution. For secure off-site backup, we need something that operates at the block level of the encrypted LUKS device, and not at the unencrypted file system level. That seems to leave us transferring the entire LUKS device (say, 200GB) each time we want to make a backup, which is clearly not feasible. How can we deal with this problem?

A Solution: Bdsync

This is when a brilliant open-source tool called Bdsync (thanks to Rolf Fokkens) comes to our rescue. As the name implies, Bdsync can synchronize "block devices" over a network. For fast synchronization, Bdsync generates and compares MD5 checksums of blocks in the local and remote block devices, and syncs only the differences. What rsync does at the file system level, Bdsync does at the block device level. Naturally, it works with encrypted LUKS devices as well. Pretty neat!
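To make the block-comparison idea concrete, here is a minimal shell sketch that lists which fixed-size blocks differ between two files or devices by comparing per-block MD5 checksums. It only illustrates the principle; it is not how Bdsync is implemented. The function names (block_sum, changed_blocks) and the 4096-byte default block size are assumptions chosen for this example.

```shell
# Illustration only: find the blocks that differ between two files/devices
# by comparing per-block MD5 checksums (the idea behind Bdsync, not its code).

# block_sum FILE INDEX BLOCKSIZE -> prints the MD5 of one block
block_sum() {
    dd if="$1" bs="$3" skip="$2" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# changed_blocks SRC DST [BLOCKSIZE] -> prints one differing block index per line
changed_blocks() {
    src=$1
    dst=$2
    bs=${3:-4096}
    # wc -c reads the whole input; fine for small test images
    size=$(wc -c < "$src")
    nblocks=$(( (size + bs - 1) / bs ))
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        if [ "$(block_sum "$src" "$i" "$bs")" != "$(block_sum "$dst" "$i" "$bs")" ]; then
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Usage (a.img and b.img are hypothetical test images of the same size):
#   changed_blocks a.img b.img 4096
```

Only the blocks printed by such a comparison would need to travel over the network, which is why a backup after small changes is much cheaper than a full copy.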

Using Bdsync, the first-time backup will copy the entire LUKS block device to a remote host, so it will take a lot of time to finish. However, after that initial backup, if we create some new files on the LUKS device, the second backup will finish quickly because we need to copy only those blocks which have changed. Classic incremental backup at play!

Install Bdsync on Linux

Bdsync is not included in the standard repositories of Linux distributions. Thus you need to build it from the source. Use the following distro-specific instructions to install Bdsync and its man page on your system.

Debian, Ubuntu or Linux Mint

$ sudo apt-get install git gcc libssl-dev
$ git clone https://github.com/TargetHolding/bdsync.git
$ cd bdsync
$ make
$ sudo cp bdsync /usr/local/sbin
$ sudo mkdir -p /usr/local/man/man1
$ sudo sh -c 'gzip -c bdsync.1 > /usr/local/man/man1/bdsync.1.gz'

Fedora or CentOS/RHEL

$ sudo yum install git gcc openssl-devel
$ git clone https://github.com/TargetHolding/bdsync.git
$ cd bdsync
$ make
$ sudo cp bdsync /usr/local/sbin
$ sudo mkdir -p /usr/local/man/man1
$ sudo sh -c 'gzip -c bdsync.1 > /usr/local/man/man1/bdsync.1.gz'

Perform Off-site Incremental Backup of LUKS-Encrypted Device

I assume that you have already provisioned a LUKS-encrypted block device as a backup source (e.g., /dev/LOCDEV). I also assume that you have a remote host where the source device will be backed up (e.g., as /dev/REMDEV).

You need access to the root account on both systems, and you need to set up password-less SSH access from the local host to the remote host. Finally, you need to install Bdsync on both hosts.
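One common way to get password-less SSH access is a key pair without a passphrase. The sketch below assumes OpenSSH's ssh-keygen and ssh-copy-id are available, and that the remote sshd permits key-based root logins. The function names, the key file path, and the key comment are hypothetical.

```shell
# make_backup_key KEYFILE -> create an ed25519 key pair with an empty passphrase
make_backup_key() {
    ssh-keygen -q -t ed25519 -N '' -C 'bdsync-backup' -f "$1"
}

# install_backup_key KEYFILE REMOTE -> append the public key to the remote
# account's authorized_keys (asks for the remote password one last time)
install_backup_key() {
    ssh-copy-id -i "$1.pub" "$2"
}

# Usage (as root on the local host; the key path is an example):
#   make_backup_key /root/.ssh/id_bdsync
#   install_backup_key /root/.ssh/id_bdsync root@remote_host
#   ssh -i /root/.ssh/id_bdsync root@remote_host true   # should not prompt
```

A key without a passphrase is convenient for unattended backups, but anyone who obtains the private key file can log in to the remote host, so keep it readable by root only.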

To initiate a remote backup process on the local host, we execute the following command as root:

# bdsync "ssh root@remote_host bdsync --server" /dev/LOCDEV /dev/REMDEV | gzip > /some_local_path/DEV.bdsync.gz

Some explanation is needed here. The Bdsync client opens an SSH connection to the remote host as root, and starts Bdsync there in server mode (the --server option). As noted above, /dev/LOCDEV is our source LUKS block device on the local host, and /dev/REMDEV is the target block device on the remote host. They could be entire disks (e.g., /dev/sda) or single partitions (e.g., /dev/sda2). The output of the local Bdsync client is then piped to gzip, which creates DEV.bdsync.gz (a so-called binary patch file) on the local host. At this point /dev/REMDEV has not been changed yet; the patch file is applied to it in a later step.

The first time you run the above command, it will take a very long time, depending on your Internet/LAN speed and the size of /dev/LOCDEV. Remember that the two block devices (/dev/LOCDEV and /dev/REMDEV) must be the same size.
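The size requirement can be checked before the first run. The following is a minimal sketch, assuming the blockdev utility from util-linux is installed on both hosts. The helper names (device_size, same_size) are made up for this example, and the fallback to wc -c is only there so the helper can be tried on ordinary image files.

```shell
# device_size PATH -> size in bytes of a block device (or of a regular file)
device_size() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"
    else
        wc -c < "$1"
    fi
}

# same_size LOCAL_BYTES REMOTE_BYTES -> succeeds only if both sizes are equal
same_size() {
    [ "$1" -eq "$2" ]
}

# Usage (as root; host and device names are the placeholders used in this article):
#   local_bytes=$(device_size /dev/LOCDEV)
#   remote_bytes=$(ssh root@remote_host blockdev --getsize64 /dev/REMDEV)
#   same_size "$local_bytes" "$remote_bytes" || echo "size mismatch" >&2
```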

The next step is to copy the generated patch file from the local host to the remote host. Using scp is one possibility:

# scp /some_local_path/DEV.bdsync.gz root@remote_host:/remote_path

The final step is to execute the following command on the remote host, which will apply the patch file to /dev/REMDEV:

# gzip -d < /remote_path/DEV.bdsync.gz | bdsync --patch=/dev/REMDEV
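The three steps (create the patch, copy it, apply it) can be chained in a small wrapper. The sketch below is only one possible arrangement: the variable names, the run and backup_once helpers, and the dry-run default are assumptions made for this example, and the host, device, and path values are the placeholders used in this article. It does not handle paths containing spaces.

```shell
# Hypothetical wrapper around the three backup steps.
# With DRY_RUN=1 (the default here) it only prints the commands for review.
REMOTE=${REMOTE:-root@remote_host}
LOCDEV=${LOCDEV:-/dev/LOCDEV}
REMDEV=${REMDEV:-/dev/REMDEV}
LOCAL_PATCH=${LOCAL_PATCH:-/some_local_path/DEV.bdsync.gz}
REMOTE_DIR=${REMOTE_DIR:-/remote_path}
DRY_RUN=${DRY_RUN:-1}

# run COMMAND_STRING -> print the command (dry run) or execute it with sh
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$1"
    else
        sh -c "$1"
    fi
}

# backup_once -> create the patch, copy it over, apply it; stops at the first
# step that reports failure (a pipeline reports the status of its last command)
backup_once() {
    remote_patch="$REMOTE_DIR/$(basename "$LOCAL_PATCH")"
    run "bdsync \"ssh $REMOTE bdsync --server\" $LOCDEV $REMDEV | gzip > $LOCAL_PATCH" &&
    run "scp $LOCAL_PATCH $REMOTE:$REMOTE_DIR" &&
    run "ssh $REMOTE \"gzip -d < $remote_patch | bdsync --patch=$REMDEV\""
}

# Usage:
#   backup_once              # review the three commands
#   DRY_RUN=0; backup_once   # actually run them (as root)
```

Printing the commands first is a deliberate choice: a wrong device name in the last step would overwrite the wrong disk on the remote host.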

I recommend doing some tests with small partitions (without any important data) before deploying Bdsync with real data. After you fully understand how the entire setup works, you can start backing up real data.
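One simple test is to compare a checksum of the source and the destination after a patch has been applied. The sketch below assumes sha256sum (GNU coreutils) is available on both hosts and that the source device is not modified between the backup and the check. The helper names are hypothetical.

```shell
# device_digest PATH -> SHA-256 of a whole device or file
device_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# digests_match A B -> succeeds only if both digests are non-empty and equal
digests_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (as root; host and device names are the placeholders used in this article):
#   local_digest=$(device_digest /dev/LOCDEV)
#   remote_digest=$(ssh root@remote_host "sha256sum /dev/REMDEV" | cut -d' ' -f1)
#   digests_match "$local_digest" "$remote_digest" && echo "backup verified"
```

Each side reads its whole device, so the check is slow on large disks, but only the two short digests cross the network.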

Conclusion

In conclusion, we showed how to use Bdsync to perform incremental backups of LUKS devices. As with rsync, only a fraction of the data, not the entire LUKS device, needs to be pushed to the off-site backup host at each backup, which saves bandwidth and backup time. Rest assured that all data transfer is secured by SSH or SCP, on top of the fact that the device itself is encrypted by LUKS. It is also possible to improve this setup by using a dedicated user (instead of root) who can run bdsync. We can also use bdsync for any block device, such as LVM volumes or RAID disks, and can easily set up Bdsync to back up local disks onto USB drives as well. As you can see, the possibilities are many!

Feel free to share your thoughts.


Iulian Murgulet

Iulian Murgulet is a Linux administrator from Romania. He taught Linux in the past, and worked at NETC@RDS project of EU as a technical expert. Currently he works for the County Health Insurance House in Brasov (situated near the "Dracula's Castle"). In his free-time he likes to read and/or watch anything about history and biology. He also likes to know good people from other countries.

10 thoughts on “How to make remote incremental backup of LUKS-encrypted disk/partition”

  1. Another possibility is to export the encrypted block device using iSCSI. Mount it locally using LUKS and do a 'local' rsync. The disadvantage is that you probably need to be root on both sides to use iSCSI.

    • Yes it is possible, but you will need to add an additional layer. So for sure the performance will be less than optimal. And not to mention that you will need to adjust the block-size for all these layers (LUKS/iSCSI). And what if you lose the link between the source and remote host?

  2. Pretty cool util!

    I've been experimenting with "encfs --reverse" as a backup option as well though, since it can be used with any standard backup utilities like rsync.

    • Yes, encfs with --reverse option should work! This is another idea which proves that for one problem there could be several solutions. Each solution will have its own advantages and disadvantages.

  3. Thank you very much, Murgulet Iulian. Excellent and very comprehensive tutorial. I am thinking about implementing this solution in a KVM virtual environment, as a regular incremental backup of the LUKS-encrypted virtual disks (/dev/vda, /dev/vdb, etc.) of VMs. The idea is to use an NFS share on LAN storage to collect the archived .bdsync.gz files, instead of a local path. Is it possible to use only one system (local_host) to complete the first action in your guide, instead of local_host and remote_host? In that case it would be possible to make the automation task a little easier and complete all three steps independently on each VM.

    • First I must thank you for your kind words. I do not think it is a very good tutorial. Any kind words must go to Rolf Fokkens, and not to me. Now about your question: I understand that you have a server with some KVM guests on it. I also guess (your description is not very clear to me) that you want to run the first step in a KVM guest and save the patch file on an NFS server and not on localhost. If my guess is correct, the answer is yes. It makes no difference whether the path is on localhost or on a network resource (NFS in this case). You must make sure that the LUKS block devices are not in use while you create the patch file. I do not have experience with KVM, but if I had to use your setup, I would try to use zfs as a backend volume manager for these LUKS devices, but this is another story!

    • Hello Claudio,

      Thanks a lot for your opinion!

      I did not try "blocsync.py", but at first look I guess it is slow compared with bdsync, because bdsync is coded in C. Another reason is the fact that bdsync has an interesting option:

      "--twopass
      Makes bdsync first match checksums using large blocks (64 * BLOCKSIZE) and then match checksums using small blocks (BLOCKSIZE). This may reduce systemcall overhead and network traffic when the "binary patchfile" has limited size."

      This option will help a lot if you have a small number of changed blocks.

  4. bdsync is unreliable, and I have had several instances where data was outright garbage.
    At least it was easy to spot because the whole partition was trash - but with that, I can not guarantee consistency at all. Who knows if all the blocks are matched correctly? I cannot vouch for it, and hence ceased using it almost immediately.

    # nice ionice bdsync "ssh srv2 bdsync --server" /dev/sda2 /dev/sda2 | nice ionice gzip > test.bdsync.gz

    # Copied the file over.
    # On the target server:

    # gzip -d test.bdsync.gz | bdsync --patch=/dev/sda2
    do_patch: EOF(stdin)

    Sizes identical, hashes identical, /dev/sda2 on the target (not in use, ofc) garbage.
    Some partitions work, others do not. For a backup, this is pure gamble.

    • Sorry for my late reply Lyo Mi ... ;)

      I do not know what to say .... anyway, your result makes me sad! I use this setup (even now), and it works for me. I can only guess that it could be a problem in your environment. Maybe you have some network problems (some network packets could be corrupted after they go on the wire/wifi), or bad RAM/PSU (hardware related).
      It is true that in my case I use a zfs block device (on both ends, for source and for destination), and I am sure that in my case the data is error-free. Anyway, I would try to do the same thing, but instead of a remote host, try the same host, with the same source (use a USB disk connected to the source host as the destination, and let ssh connect to localhost).

      Thx for your opinion!
