Fast & frequent incremental ZFS backups with zrep

Recently, I replaced one of the Windows fileservers at $dayjob with a Debian-based Samba server. For the data storage, I chose ZFS on Linux. I have been running ZFS on Linux on my backup servers for a while now, and it seems that with the latest release (0.6.4.1, dated April 23 2015), most, if not all, of the stability problems I had with earlier versions are gone.

ZFS backups

ZFS has a few features that make it really easy to back up efficiently and fast:

  1. Cheap snapshots.
  2. ZFS send / receive.

ZFS snapshots are cheap in that they don’t cost any performance. ZFS is a copy-on-write filesystem, which means that a block that needs changing is never overwritten in place: the new data is written to a new block, and only after that write succeeds is the reference in the filesystem updated to point to the new block. The old block is then freed, unless it is part of a snapshot, in which case the data is simply left intact. You can make many snapshots of a single ZFS filesystem (theoretically 2^64 in a storage pool) without any cost other than the disk space they consume, which is equal to the growth of your data (or rather: the accumulated size of all changes) since the oldest snapshot.

ZFS allows you to take a snapshot and send it to another location as a byte stream with the zfs send command. The byte stream is sent to standard output, so you can do with it what you like: redirect it to a file, or pipe it through another process, for example ssh. On the other side of the pipe, the zfs receive command can take the byte stream and rebuild the ZFS snapshot. zfs send can also send incremental changes. If you have multiple snapshots, you can specify two snapshots and zfs send can send all snapshots in between as a single byte stream.

So basically, creating a fast incremental backup of a ZFS filesystem consists of the following steps:

  1. Create a new snapshot of the filesystem.
  2. Determine the last snapshot that was sent to the backup server.
  3. Send all snapshots, from the snapshot found in step 2 up to the new snapshot created in step 1, to the backup server, using SSH:
zfs send -I <old snapshot> <new snapshot> | ssh <backupserver> zfs receive <filesystem>
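
Step 2 is the part that is easiest to get wrong by hand, so here is a minimal sketch of it in plain shell. It assumes you have already collected the snapshot lists of both machines, oldest first (for example with zfs list -H -t snapshot -o name -s creation), and that a snapshot keeps the same name after the @ on both sides. The function names, dataset names and snapshot names are made up for illustration; the last function only prints the pipeline from step 3 instead of running it.

```shell
#!/bin/sh
# Sketch only: helper names, datasets and snapshot names are hypothetical.

# Print the newest local snapshot that also exists on the backup server.
# $1: local snapshot names, one per line, oldest first
# $2: snapshot names on the backup server, one per line
latest_common_snapshot() {
    common=""
    while IFS= read -r snap; do
        [ -n "$snap" ] || continue
        while IFS= read -r remote; do
            # Compare only the part after '@'; the dataset may be named
            # differently on the backup server.
            if [ "${snap#*@}" = "${remote#*@}" ]; then
                common=$snap
            fi
        done <<EOF
$2
EOF
    done <<EOF
$1
EOF
    [ -n "$common" ] && printf '%s\n' "$common"
}

# Print (do not run) the incremental send pipeline.
# $1 old snapshot, $2 new snapshot, $3 backup host, $4 target filesystem
send_command() {
    printf 'zfs send -I %s %s | ssh %s zfs receive %s\n' "$1" "$2" "$3" "$4"
}

local_snaps='tank/data@2015-05-01_0900
tank/data@2015-05-01_1000
tank/data@2015-05-01_1100'
remote_snaps='backup/data@2015-05-01_0900
backup/data@2015-05-01_1000'

old=$(latest_common_snapshot "$local_snaps" "$remote_snaps")
send_command "$old" 'tank/data@2015-05-01_1100' backupserver backup/data
```

With the example lists, the newest snapshot known to both sides is the 10:00 one, so the printed pipeline sends everything from there up to the 11:00 snapshot. If the two sides have no snapshot in common, the function prints nothing and an incremental send is not possible.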

Of course, on the backup server you can leverage some of the other great features of ZFS: compression and deduplication.

Enter zrep.

Zrep

Zrep is a shell script (written in Ksh) that was originally designed as a solution for asynchronous (but continuous) replication of file systems for the purpose of high availability (using a push mechanism). It was later expanded with the possibility to create backups of a filesystem using a pull mechanism, meaning the replication is initiated from the backup server and no SSH access is needed to the backup server, as it would be with push-replication.

Zrep is quite simple to use and it has good documentation, although setting it up for use as a backup solution took me a few attempts to get right. I won’t go into the gory details here, but I’ll describe my setup.

It basically works like this:

  • Zrep needs to be installed on both sides.
  • The root user on the backup server needs to be able to ssh to the fileserver as root. This has security implications, see below.
  • A cron job on the backup server periodically calls zrep refresh. Currently, I run two backups hourly during office hours and another two during the night.
  • Zrep sets up an SSH connection to the file server and, after some sanity checking and proper locking, calls zfs send on the file server, piping the output through zfs receive:
ssh <fileserver> zfs send -I <old snapshot> <new snapshot> | zfs receive <filesystem>
  • Snapshots on the fileserver need not be kept for a long time, so we remove all but the last few snapshots in an hourly cron job (see below).
  • Snapshots on the backup server are expired and removed according to a certain retention schedule (see below).

SSH access and security

Since all ZFS operations, like making snapshots and using send / receive, require root privileges (at least on Linux by default; other OSs like Solaris are more flexible in this, and even Linux may allow you to chown/chmod /dev/zfs to delegate these privileges; see this issue on GitHub for more information), zrep must also run as root on both ends. This means that root needs SSH access to the fileserver, which could be a huge security problem. What I usually do to mitigate this as much as possible is:

  1. SSH access is firewalled and only allowed from IPs that need to have access. Between datacenters, I use VPNs, and externally routable IP addresses generally do not have SSH access.
  2. PasswordAuthentication no
  3. PermitRootLogin forced-commands-only
  4. Use an SSH keypair specific to this application, and configure an entry for the fileserver in /root/.ssh/config on the backup server, using this key.
  5. Use a wrapper script for zrep and specify this as a forced command in root’s authorized_keys.
  6. Use a list of ‘from’ IPs (containing only your backup server(s)) for this specific key in root’s authorized_keys to restrict access even beyond the firewall.

The wrapper script can check the command it gets from the client and only exec the original command if nothing smells fishy.
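
As an illustration, such a wrapper could look roughly like the sketch below. This is not my production script: the wrapper path, the example authorized_keys line and especially the allow-list are assumptions. Log what your zrep version really sends over SSH before you tighten the patterns, or you will lock out your own backups.

```shell
#!/bin/sh
# Hypothetical forced-command wrapper, e.g. /usr/local/sbin/zrep-wrapper.
# Referenced from root's authorized_keys on the fileserver, for example:
#   from="192.0.2.10",command="/usr/local/sbin/zrep-wrapper" ssh-rsa AAAA... zrep
# ASSUMPTION: the client only ever asks for "zrep ..." or "zfs send ...".

is_allowed_command() {
    # Anything left after deleting the harmless characters is suspicious
    # (semicolons, pipes, backticks, quotes, globs, ...).
    rest=$(printf '%s' "$1" | tr -d 'A-Za-z0-9 /_.:@=-')
    [ -z "$rest" ] || return 1
    case $1 in
        'zrep '*|'zfs send '*) return 0 ;;
        *) return 1 ;;
    esac
}

# sshd puts the command requested by the client in SSH_ORIGINAL_COMMAND.
# If it is empty (an interactive login attempt), fall through and do nothing.
if [ -n "${SSH_ORIGINAL_COMMAND:-}" ]; then
    if is_allowed_command "$SSH_ORIGINAL_COMMAND"; then
        # Unquoted on purpose: split into words, but never handed to a shell.
        exec $SSH_ORIGINAL_COMMAND
    fi
    logger -t zrep-wrapper "rejected: $SSH_ORIGINAL_COMMAND"
    echo "command rejected" >&2
    exit 1
fi
```

The check is deliberately strict: it first rejects every character that has a special meaning to a shell, and only then looks at the command name. Running the request with exec and plain word splitting, rather than through sh -c, means that even a pattern you got wrong cannot be turned into a second command.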

I guess it would also be possible to use sudo, instead of granting root SSH access, but I haven’t tested this. If you would like to try it out: zrep allows for specifying the path to the zrep executable on the remote end using an environment variable (ZREP_PATH), so maybe it’s as easy as calling:

ZREP_PATH="sudo zrep" zrep refresh <pool/fs>

Since zrep on the backup server would still run as root, you would need to configure the user account to use in the SSH connection in /root/.ssh/config. For example:

Host myfileserver
    User zrep
    IdentityFile /root/.ssh/id_rsa_zrep

And of course you would need to configure sudo on the fileserver to allow the user of choice to execute zrep (or the wrapper script) as the root user without specifying a password. If you give it a try, let me know whether it works.

Backup retention / expiration

When using the ‘refresh’ command, zrep does not automatically expire old snapshots like it does when using the more standard ‘sync’ replication features, so we have to trigger snapshot expiration by hand. Zrep keeps track of the snapshots that it makes and ships to the backup server by recording the timestamps in ZFS custom properties. It also provides an expire command that lets you clean up old snapshots, but it is very rudimentary. It only allows you to specify a number of snapshots to keep (5 by default, but this value is changeable for local and remote snapshots independently), and if you call zrep expire, it just deletes all but the last 5 snapshots. This is fine for the fileserver itself, so we run an hourly cron job that just tells zrep to expire all but the last 5 snapshots. The -L flag tells zrep to leave any remote (replicated) snapshots alone:

17 *    * * *   root    /usr/local/bin/zrep expire -L >/dev/null

On the backup server, however, I would like a more sophisticated retention schedule. We make snapshots frequently and even though there is no real technical benefit to cleaning them up, I don’t really want to keep all of them around. I’d like to retain backups according to a more traditional schedule, like:

  • every snapshot for the past 48 hours
  • one backup daily for 15 days
  • one weekly backup for 4 weeks
  • one backup monthly for 6 months
  • a yearly snapshot for a couple of years

For this, we use a Python script called zrep-expire (available on GitHub). Zrep-expire looks at the creation time of a snapshot and checks it against an expiration rule. If, according to the rule, the snapshot is expired, it destroys the snapshot. The crontab entry looks like this:

55 6 * * * /usr/local/bin/zrep-expire -c /etc/zfs/zrep-expire.conf
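
To make the schedule above a bit more concrete, here is a small shell sketch that maps the age of a snapshot to the tier it falls in. To be clear: this is not how zrep-expire is implemented or configured. It is only an illustration, and it assumes that every limit is counted from now and that a month is 30 days. The function name is made up.

```shell
#!/bin/sh
# Illustration only: this is NOT zrep-expire's code or config format.
# Maps a snapshot's age in seconds to a tier of the retention schedule.
# ASSUMPTIONS: all limits are counted from "now"; a month is 30 days.
retention_tier() {
    age=$1
    hour=3600
    day=$((24 * hour))
    if   [ "$age" -le $((48 * hour)) ]; then echo all      # keep everything
    elif [ "$age" -le $((15 * day)) ];  then echo daily    # keep one per day
    elif [ "$age" -le $((28 * day)) ];  then echo weekly   # keep one per week
    elif [ "$age" -le $((180 * day)) ]; then echo monthly  # keep one per month
    else                                     echo yearly   # keep one per year
    fi
}

# On a real system the age could be derived from the creation property:
#   created=$(zfs get -Hp -o value creation pool/fs@snapshot)
#   retention_tier $(( $(date +%s) - created ))
retention_tier $((3 * 86400))    # a snapshot that is three days old: daily
```

Within a tier, an expiry script then still has to pick which snapshot of each day, week or month survives (typically the first or the last one) before destroying the rest.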

Listing and restoring backups

A list of all snapshots with the most interesting properties can be viewed with:

/sbin/zfs list -t snapshot -o name,zrep:sent,creation,refer,used

ZFS on Linux, like Oracle Solaris ZFS, exposes snapshots in a .zfs/snapshot directory in the root of the filesystem. Please note that the .zfs directory is hidden and you will not see it even with ls -a. Each snapshot has an entry with its name in .zfs/snapshot, and the root user can cd into those subdirectories and copy files from there to their original location, or anywhere s/he wants. Remember that snapshots are read-only, so you cannot change any data there.
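
As a small example, restoring a single file boils down to building the right path and copying from it. The sketch below assumes a filesystem tank/data mounted on /tank/data; the helper name, snapshot name and file name are all hypothetical.

```shell
#!/bin/sh
# Sketch: build the path of a file inside a snapshot.
# $1 mountpoint of the filesystem
# $2 full snapshot name (filesystem@snapshot)
# $3 path of the file, relative to the root of the filesystem
snapshot_file_path() {
    # The directory under .zfs/snapshot is named after the part behind '@'.
    printf '%s/.zfs/snapshot/%s/%s\n' "${1%/}" "${2#*@}" "$3"
}

src=$(snapshot_file_path /tank/data 'tank/data@example_snap' 'projects/report.odt')
echo "$src"

# Copy it back next to the live file instead of overwriting it:
#   cp -a "$src" /tank/data/projects/report.odt.restored
```

Restoring to a new name first, and comparing before replacing the live file, keeps you from overwriting a newer version by accident.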

Thanks to Philip Brown, the author of Zrep, for some useful feedback on this post.