We’ve all heard the reasons for backing up our data regularly: accidental deletion of files (rm -rf *), corrupted files from crashed applications, the dreaded hard disk failure; the list goes on. Nevertheless, on average, only 25 per cent of computer users perform routine backups of their data, as shown by a recent Harris Interactive survey. So why do the remaining 75 per cent put off this important task? Well, manual backups are often an ad hoc measure, unreliable, and time-consuming. Automating an otherwise tedious backup process is key to producing routine and reliable backups. With that in mind, we’ll take a look at rsnapshot, a handy backup utility based on rsync, a well-known open source tool.
rsnapshot was written by Nathan Rosenquist as a replacement for a patchwork of complex shell scripts he had crafted to do rsync backups. Any change to the backup scheme meant manually editing the scripts and making sure no bugs were introduced. rsnapshot was a great improvement over this process: it was easy to configure, portable across different operating systems, supported remote backups, and best of all, automated the entire backup process.
rsnapshot enables users to keep multiple backups of their data, from local or remote systems, readily accessible. Each backup is a complete snapshot of the data at a specific point in time. rsnapshot minimizes disk space usage by combining rsync with hard links (multiple directory entries in the file system that share a single copy of the data). Thus, the total amount of disk space used is the space for one full backup, plus the space for the files that changed between snapshots.
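To make the hard-link idea concrete, here is a minimal sketch using only standard shell tools. It does not use rsnapshot itself; the scratch directory and file names are made up for illustration, and the directory names merely imitate snapshot names.

```shell
# Hypothetical scratch directory; nothing here touches real backups.
demo=$(mktemp -d)
mkdir "$demo/hourly.1" "$demo/hourly.0"

# An "old" snapshot holds one file.
echo "report v1" > "$demo/hourly.1/report.txt"

# The unchanged file is hard-linked into the new snapshot instead of copied.
ln "$demo/hourly.1/report.txt" "$demo/hourly.0/report.txt"

# Both names show the same inode number, so the data exists once on disk.
inode_old=$(ls -i "$demo/hourly.1/report.txt" | awk '{print $1}')
inode_new=$(ls -i "$demo/hourly.0/report.txt" | awk '{print $1}')
echo "hourly.1 inode: $inode_old"
echo "hourly.0 inode: $inode_new"

# Clean up the scratch directory.
rm -rf "$demo"
```

Two directory entries, one inode: that is why an unchanged file costs almost nothing in each additional snapshot.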
Since rsnapshot is written entirely in Perl, it's a snap to install on most modern versions of Linux or BSD. In fact, rsnapshot comes pre-installed on Debian, Gentoo, FreeBSD, OpenBSD, and NetBSD. Users with other distributions can compile and install rsnapshot by downloading the latest version from www.rsnapshot.org.
Install rsnapshot
To get started, I will download and install rsnapshot (v1.2.1) on my Fedora Core 4 system (mango). If you are using a distribution that already has rsnapshot installed, just skip to the next section.
To install rsnapshot you will need to have both perl (v5.004+) and rsync available on your system. Although not required, it helps to have OpenSSH, BSD logger, GNU cp, and GNU du available as well. If you have perl and rsync on your system, follow the simple instructions below to install rsnapshot.
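Before running the install commands, a quick check of the prerequisites can save some head-scratching. The helper below is a sketch, not part of rsnapshot; the function name missing_tools is made up for this article.

```shell
# Print the name of every listed tool that is not on the command search path.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Required tools named in the text above.
missing_tools perl rsync

# Optional tools; the command names are the usual ones for OpenSSH,
# logger, cp, and du.
missing_tools ssh logger cp du
```

No output means every tool was found. Note that this only checks that the commands exist, not that cp and du are the GNU versions.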
$ wget -q http://www.rsnapshot.org/downloads/rsnapshot-1.2.1.tar.gz
$ tar xzf rsnapshot-1.2.1.tar.gz
$ cd rsnapshot-1.2.1
$ ./configure --prefix=/usr/local --sysconfdir=/etc
The --sysconfdir=/etc parameter above tells rsnapshot to look for its configuration file (rsnapshot.conf) in /etc. Installing rsnapshot requires root privileges.
$ make install
Make sure rsnapshot is available in your command search path.
$ whereis rsnapshot
rsnapshot: /usr/local/bin/rsnapshot
Configure rsnapshot
For the purposes of this article, we will use rsnapshot to back up data from one Linux system (kiwi) to another (mango). rsnapshot will run on mango, which will also host the backup archives. Both systems should have rsync and ssh installed.
All configuration parameters of rsnapshot are controlled via the rsnapshot.conf file. Before we set up rsnapshot, we'll copy the default configuration file /etc/rsnapshot.conf.default and save it as /etc/rsnapshot.conf. This way we can revert to a clean configuration if we mangle our config file.
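The copy itself is a one-liner, but a small guard avoids overwriting a configuration you have already edited. This is a sketch; install_default_conf is a made-up helper name, and the default paths are the ones used in this article.

```shell
# Copy the sample configuration into place without clobbering an existing file.
# Both arguments are optional; the defaults are the paths used in this article.
install_default_conf() {
    src=${1:-/etc/rsnapshot.conf.default}
    dst=${2:-/etc/rsnapshot.conf}
    if [ -e "$dst" ]; then
        echo "$dst already exists, leaving it alone" >&2
        return 1
    fi
    cp -p "$src" "$dst"
}
```

Run as root on mango, a plain install_default_conf with no arguments performs the copy described above.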
Now, let’s edit rsnapshot.conf on mango to set up our backup system. Most of the parameter defaults do not need modification, so we’ll just focus on those that do.
Where will backups be stored?
The snapshot_root parameter in the SNAPSHOT ROOT DIRECTORY section specifies the directory where rsnapshot will place backup snapshots as they are created. Make sure you select a disk partition with adequate free space to hold your backups.
# Note: Use TABS (not spaces) to separate
# the configuration directive and the value.
# If specifying a directory, put a
# slash at the end.
snapshot_root /usr2/snapshots/
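Because spaces and tabs look identical in most editors, it can help to scan the file for directive lines that contain no tab at all. The following is a rough sketch; lines_without_tabs is a made-up helper, and it only flags lines with zero tabs, so it will not catch every formatting mistake.

```shell
# List configuration lines that contain no tab character at all.
# Comment lines and blank lines are skipped.
lines_without_tabs() {
    awk '!/^[ \t]*#/ && NF > 0 && index($0, "\t") == 0 {
        print FILENAME ":" FNR ": " $0
    }' "$1"
}
```

For example, lines_without_tabs /etc/rsnapshot.conf prints each suspicious line with its line number, and prints nothing if every directive contains a tab.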
If you plan on using a USB/FireWire hard disk for storing backups, then the no_create_root parameter should be set to 1. This tells rsnapshot not to create the snapshot root directory if it doesn’t already exist, so a backup cannot silently land on the bare mount point when the disk is not attached.
Which external programs will rsnapshot use?
Next, the EXTERNAL PROGRAM DEPENDENCIES section contains parameters to specify paths for optional external tools that rsnapshot depends on to provide certain features. Be sure to uncomment the lines starting with cmd_cp, cmd_ssh, and cmd_du by removing the hash (#) mark at the beginning of the line.
# use GNU cp
cmd_cp /bin/cp
# use ssh for secure remote backups
cmd_ssh /usr/bin/ssh
# use GNU du to check disk space usage
cmd_du /usr/bin/du
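The paths above are where these tools live on my system; yours may differ. As a sketch, the made-up helper below reads the uncommented cmd_ lines from a configuration file and reports any path that is not an executable file.

```shell
# Report every cmd_* path in the given configuration file that is not executable.
# Commented-out lines are ignored because they do not start with "cmd_".
check_cmd_paths() {
    awk '/^cmd_/ { print $1, $2 }' "$1" |
    while read -r name path; do
        [ -x "$path" ] || echo "$name: $path is missing or not executable"
    done
}
```

Running check_cmd_paths /etc/rsnapshot.conf should print nothing when every configured tool is where the file says it is.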
How often will backups happen?
The configuration parameters in the BACKUP INTERVALS section determine how often rsnapshot will perform backups and how many snapshots will be kept. The keyword interval is followed by an alphanumeric label, followed by a number, signifying how many intervals to keep.
In our backup system, we want to take a snapshot of kiwi every 3 hours, so that's 8 snapshots per day. Each time rsnapshot hourly is executed, it will create a new snapshot, rotate the old ones, and retain the 8 most recent (hourly.0 - hourly.7) snapshots. We also want to take a daily snapshot, and keep a week's (7 days) worth of snapshots.
#interval minutes 6
interval hourly 8
interval daily 7
#interval weekly 4
The order of the interval definitions is very important. The first interval line must represent the smallest unit of time, with each subsequent line representing a larger interval. If you were to add a weekly interval, it would appear after the daily interval. Similarly, a minutes interval would appear before hourly.
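The arithmetic behind these numbers is simple enough to check in the shell. The variable names below are made up for illustration; the values come straight from the schedule described in this article.

```shell
hours_between_runs=3      # "rsnapshot hourly" runs every 3 hours
hourly_kept=8             # interval hourly 8
daily_kept=7              # interval daily 7

runs_per_day=$((24 / hours_between_runs))
hourly_span_hours=$((hourly_kept * hours_between_runs))

echo "hourly runs per day: $runs_per_day"
echo "hourly snapshots cover: $hourly_span_hours hours"
echo "daily snapshots cover: $daily_kept days"
```

If you change how often the hourly interval runs, adjust the number of snapshots kept so the hourly set still spans a full day before the daily snapshots take over.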
What is included or excluded from the backup?
Most of the parameters in the GLOBAL OPTIONS section can be left at their default values. However, there are two parameters, include and exclude, that you can use to include or exclude files from the backup. Both parameters get passed directly to rsync, so take a look at the --include and --exclude options in the rsync man page for a thorough explanation of how to construct match patterns. If you prefer listing all your include/exclude patterns in separate files, specify them using the include_file and exclude_file parameters.
Here are some simple examples to get you started.
# exclude anything starting with a dot character (.)
exclude .*
# exclude anything ending with a tilde character (~)
exclude *~
# include .ssh directory
include /home/nsharma/.ssh/
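To get a feel for what the two exclude patterns catch, here is a rough approximation using the shell's own pattern matching. This is only a sketch: rsync applies its own matching rules (described in its man page), and the is_excluded helper and sample file names are made up. It also ignores the include rule.

```shell
# Rough approximation: test a single file name against the two exclude patterns.
is_excluded() {
    case "$1" in
        .*|*~) return 0 ;;   # dot files and names ending in a tilde
        *)     return 1 ;;
    esac
}

for name in .bashrc notes.txt notes.txt~; do
    if is_excluded "$name"; then
        echo "$name: excluded"
    else
        echo "$name: kept"
    fi
done
```

Keep in mind that the first example pattern also matches .ssh, which is why the include line for that directory is needed.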
What should be backed up?
The BACKUP POINTS / SCRIPTS section tells rsnapshot what is to be backed up and where the backup snapshot is stored. This part is very important, so pay attention. We will use rsync over ssh to back up two directories and a file from the system named kiwi, and store the snapshots in a directory named kiwi_backups. The hostname kiwi must resolve to an IP address, either via DNS or the /etc/hosts file.
# two directories (/home/nsharma, /my_articles)
backup root@kiwi:/home/nsharma/ kiwi_backups/
backup root@kiwi:/my_articles/ kiwi_backups/
# one file
backup root@kiwi:/etc/passwd kiwi_backups/
The configuration above will only work if we can log in (without manually entering passwords) to kiwi as root via ssh. The easiest way to set up access is by creating "passphraseless" keys with ssh-keygen, and here’s how to do it.
Setting up "passphraseless" keys
Log in as root on mango
Use the ssh-keygen program to create a public/private key pair of type DSA (Digital Signature Algorithm)
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
0d:f0:ea:bc:b8:0d:69:c6:6d:e0:59:c2:ee:31:4d:90 root@mango.private.dom
Transfer the public key from mango to kiwi using scp
$ scp .ssh/id_dsa.pub root@kiwi.private.dom:mango.pub
root@kiwi.private.dom's password:
id_dsa.pub 100% 619 0.6KB/s 00:00
Log in as root on kiwi
Install mango's public key
$ cat mango.pub >> /root/.ssh/authorized_keys
Delete the mango.pub file from kiwi
$ rm -f mango.pub
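The install step on kiwi assumes /root/.ssh already exists there. A slightly more defensive variant is sketched below; install_pubkey is a made-up helper, and the 700/600 permissions are the conventional restrictive settings for this directory and file rather than something stated above.

```shell
# Append a public key to an authorized_keys file, creating the .ssh
# directory first and applying conventional restrictive permissions.
# Arguments: public key file, then (optionally) the .ssh directory.
install_pubkey() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    cat "$pubkey_file" >> "$ssh_dir/authorized_keys" || return 1
    chmod 600 "$ssh_dir/authorized_keys"
}
```

On kiwi, logged in as root, install_pubkey mango.pub would take the place of the cat command shown above.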
We should now be able to log in to kiwi as root from mango without being prompted for a password.
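Since cron will be running rsnapshot unattended, it is worth confirming that the login really works without any prompt. The sketch below uses ssh's BatchMode option, which makes ssh fail rather than ask for a password; the function name and the 10-second timeout are arbitrary choices for this example.

```shell
# Return success only if ssh can log in without prompting for anything.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_login_unattended() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

From mango, try can_login_unattended root@kiwi && echo "key login works". If nothing is printed, the key setup needs another look.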
If you’re uncomfortable with the idea of "passphraseless" keys, then take a look at the ssh-agent man page and a utility called keychain available at www.gentoo.org/proj/en/keychain/index.xml.
Testing our configuration
Before we run rsnapshot for the first time, we should make sure the syntax of our configuration file is correct, and execute a dry run of each interval we have defined.
Checking for correct syntax
$ rsnapshot configtest
rsnapshot will either show you the errors, or a Syntax OK message if there are no errors.
Dry run for each interval
# test run for 'hourly' interval
$ rsnapshot -t hourly
# test run for 'daily' interval
$ rsnapshot -t daily
The output from each command will show you exactly what rsnapshot will do for the specified intervals.
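The two checks can be strung together so that the dry runs only happen if the syntax check passes. This is a sketch; preflight is a made-up wrapper that uses only the configtest and -t invocations shown above.

```shell
# Run the syntax check, then a dry run for each interval named on the
# command line. Stops at the first failure.
preflight() {
    rsnapshot configtest || return 1
    for interval in "$@"; do
        echo "== dry run: $interval =="
        rsnapshot -t "$interval" || return 1
    done
}
```

For the configuration in this article, the call would be preflight hourly daily.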
Automating the backup process
Our next and final step is to automate the execution of rsnapshot on mango. We'll add two entries to the cron scheduler to request execution of rsnapshot every 3 hours on the hour, for the hourly interval, and every night at 11:00 pm, for the daily interval. Logged in as root on mango, we’ll invoke the crontab program with the edit (-e) option. The crontab program invokes the default editor, as specified by the VISUAL or EDITOR shell environment variables.
$ crontab -e
Now, we add the following entries and save and close the file.
0 */3 * * * /usr/local/bin/rsnapshot hourly
0 23 * * * /usr/local/bin/rsnapshot daily
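If the */3 notation is unfamiliar, the short sketch below lists the hours it matches and confirms that the 11:00 pm daily run does not fall on the same hour as an hourly run. It is plain shell arithmetic and does not read the crontab.

```shell
# Hours matched by the "*/3" hour field: every third hour starting at 0.
hourly_runs=""
hour=0
while [ "$hour" -le 23 ]; do
    hourly_runs="$hourly_runs $hour"
    hour=$((hour + 3))
done
hourly_runs=${hourly_runs# }
echo "hourly runs at: $hourly_runs"

# The daily entry uses hour 23.
case " $hourly_runs " in
    *" 23 "*) echo "daily run collides with an hourly run" ;;
    *)        echo "daily run at 23 does not collide with an hourly run" ;;
esac
```

The last hourly run of the day is at 21:00, two hours before the daily run.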
That’s it! We now have a fully automated backup system which creates hourly and daily snapshots of our data. For detailed documentation about rsnapshot, check out the rsnapshot man page and the rsnapshot website at www.rsnapshot.org.
Conclusion
Knowing what data to preserve and how to recover it in an emergency is critical to having a solid backup plan. Using the right tools to implement that backup plan is just as important. Take control of your backups with rsnapshot!
Before we finish, here’s an actual run of rsnapshot against the hourly interval.
$ rsnapshot -v hourly
echo 19462 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /usr2/snapshots/hourly.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/home/nsharma/ \
/usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/my_articles/ \
/usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
--include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
--rsh=/usr/bin/ssh root@kiwi:/etc/passwd \
/usr2/snapshots/hourly.0/kiwi_backups/
touch /usr2/snapshots/hourly.0/
rm -f /var/run/rsnapshot.pid
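Because rsync is called with --relative in the run above, each source path is recreated in full underneath the destination directory. The sketch below builds the resulting location of a backed-up file, which is handy when you need to find something to restore; snapshot_path is a made-up helper, not an rsnapshot command.

```shell
# Build the path of a backed-up file inside a snapshot.
# Arguments: snapshot root, snapshot name, destination dir, original absolute path.
snapshot_path() {
    root=${1%/}    # drop a trailing slash, if any
    dest=${3%/}
    echo "$root/$2/$dest$4"
}

snapshot_path /usr2/snapshots/ hourly.0 kiwi_backups/ /etc/passwd
```

With the configuration from this article, the most recent copy of kiwi's /etc/passwd is therefore found at /usr2/snapshots/hourly.0/kiwi_backups/etc/passwd on mango.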