Automated Webserver Backup

Often the first step in working on a website is downloading it to your computer so you can work on a local copy.  Websites can be fairly large (especially if they include video files), and these downloads can take some time.  What if the website's files were already on your computer?  Problems could be fixed much faster, and if something goes wrong with the webserver you have a backup that can be used to restore the website.  The focus of this article is the files on the webserver.  The database is a separate topic, although the approach discussed here could be extended to it without much difficulty.

The approach here uses rdiff-backup over an encrypted SSH connection, so that each run downloads only the files that have changed.  Furthermore, rdiff-backup keeps a history, so files can be recovered as they were at any earlier backup.  rdiff-backup is available on Linux, Mac and Windows, although the Windows file system does not distinguish between lower case and capital letters, which causes some difficulties when backing up a Linux webserver.  On Windows, running Linux in a virtual machine is suggested.  Also, many shared hosting servers do not allow root access or even shell access, which makes automated backups very difficult.

rdiff-backup should be installed on both your local computer and your webserver.  Create a pair of keys:

[user@local]$ sudo su -

[root@local]# ssh-keygen -t rsa

When prompted, save the key to /root/.ssh/id_rsa_webserver_backup.  Do not enter a passphrase, because the backup has to run unattended.  Next, the public key must be appended to the authorized_keys file on the webserver.  Use the following commands:

[root@local]# cd /root/.ssh

[root@local]# cat id_rsa_webserver_backup.pub | ssh root@example.com 'sh -c "cat - >>~/.ssh/authorized_keys;chmod 600 ~/.ssh/authorized_keys;"';

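where example.com is the domain name of your webserver.  Some Linux distributions have a shorthand command (ssh-copy-id -i id_rsa_webserver_backup.pub root@example.com) for this step; the -i option is needed because the key is not stored under the default file name.  Now test the key to make sure you can log in without a password: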

[root@local]# ssh -i id_rsa_webserver_backup root@example.com

If you have done everything correctly, you should be able to log into the webserver without a password.  Note that this is more secure than using a password, assuming your local computer is physically secure.  Just carrying out these steps has already provided a useful way to access your webserver.  Note that the root account is sometimes blocked from direct login for security reasons.  In that case you will have to use an ordinary user account, which may not have full access to the files that need to be backed up.

Now we will create an SSH configuration entry (a host alias) so that this is even easier.

[root@local]# vi /root/.ssh/config

host webserver-backup
        hostname example.com
        user root
        identityfile /root/.ssh/id_rsa_webserver_backup
        compression yes
        protocol 2

[root@local]# chmod 600 /root/.ssh/config

Now test the host alias:

[root@local]# ssh webserver-backup

You should be able to log into the webserver without a password, using the key and the settings from the configuration file.
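
For a scripted check, the login test can be made non-interactive.  The following is a minimal sketch, assuming the webserver-backup alias defined above; the function name check_key_login is made up for illustration.  BatchMode=yes makes ssh fail instead of prompting for a password, so a non-zero exit status means the key was not accepted.  The check is only meaningful at this stage, before the key is restricted to rdiff-backup in the next step.

```shell
#!/bin/sh
# check_key_login HOST
# Hypothetical helper: succeeds only if HOST accepts key-based login
# without any password prompt.
check_key_login() {
    # BatchMode=yes disables password prompts; ConnectTimeout avoids
    # hanging forever on an unreachable host.  "true" is run remotely
    # and simply exits with status 0.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true; then
        echo "key login to $1 works"
    else
        echo "key login to $1 failed" >&2
        return 1
    fi
}
```

Usage: check_key_login webserver-backup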

Now we add a bit of security.  Log into the webserver.

[user@webserver]$ sudo su -

[root@webserver]# vi /root/.ssh/authorized_keys

Find the line containing the public key that was added earlier and prepend the following options to it, so that the line reads:

command="rdiff-backup --server --restrict-read-only /",no-port-forwarding,no-X11-forwarding ssh-rsa ...

This restricts the use of the keys to rdiff-backup and only allows reading files.  The "/" allows access to the entire file system.  If you want to only allow access to part of the file system, replace this with the path.
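
Editing authorized_keys by hand is easy to get wrong when the file holds several keys.  As an alternative, here is a rough sketch that adds the same options with awk.  It assumes the backup key is an ssh-rsa key whose trailing comment is a single word such as root@local (an assumption; check the end of the line in your own file), and the function name restrict_key is made up for illustration.  Lines that already start with options are left alone, so running it twice does no harm.

```shell
#!/bin/sh
# restrict_key FILE COMMENT
# Hypothetical helper: prefix the ssh-rsa line in FILE whose last field
# is COMMENT with the forced-command options used in this article.
restrict_key() {
    file=$1
    comment=$2
    opts='command="rdiff-backup --server --restrict-read-only /",no-port-forwarding,no-X11-forwarding'
    tmp=$(mktemp) || return 1
    # Only lines that begin with the key type are rewritten, so a line
    # that already carries options is not prefixed a second time.
    awk -v opts="$opts" -v c="$comment" '
        $1 == "ssh-rsa" && $NF == c { print opts " " $0; next }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps FILE's permissions
    rm -f "$tmp"
}
```

Usage, on the webserver: restrict_key /root/.ssh/authorized_keys root@local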

Now test a simple backup:

[root@local]# rdiff-backup webserver-backup::/tmp test-backup

You should not have to supply a password.  We have enough functionality that we could create a cron job right now but let's customize things a bit more.  Create a script:

[root@local]# vi /root/rdiff-backup.sh

#!/bin/sh
export HOME=/root
rdiff-backup --print-statistics \
        --include-globbing-filelist /etc/backup-source.conf \
        webserver-backup::/ /var/backups/webserver

In this case we are specifying /var/backups/webserver as the location where the backups reside on the local computer.  We are also naming a configuration file that lists which directories are backed up.  Make the script executable.

[root@local]# chmod 700 /root/rdiff-backup.sh
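
If the script is going to run unattended, a slightly more defensive variant may be worth considering.  The sketch below is one possible shape, not a required part of the setup: it wraps the same rdiff-backup call in a function, uses a lock directory so two runs cannot overlap, and reports a non-zero exit status.  The lock path and the variable and function names are assumptions chosen for illustration.

```shell
#!/bin/sh
# Defensive variant of the backup script (sketch).
# All names below are illustrative; each can be overridden from the
# environment.
export HOME=/root
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}
FILELIST=${FILELIST:-/etc/backup-source.conf}
DEST=${DEST:-/var/backups/webserver}
LOCKDIR=${LOCKDIR:-/var/run/webserver-backup.lock}   # hypothetical path

run_backup() {
    # mkdir either creates the directory or fails, atomically, so it
    # doubles as a simple lock against overlapping runs.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "backup already running (lock: $LOCKDIR)" >&2
        return 1
    fi
    "$RDIFF_BACKUP" --print-statistics \
        --include-globbing-filelist "$FILELIST" \
        webserver-backup::/ "$DEST"
    status=$?
    rmdir "$LOCKDIR"
    if [ "$status" -ne 0 ]; then
        echo "rdiff-backup exited with status $status" >&2
    fi
    return "$status"
}
```

In the actual script the last line would simply be run_backup.  Cron normally mails anything written to standard error, so a failed run should not go unnoticed.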

Now, create the configuration file:

[root@local]# vi /etc/backup-source.conf

- /bin
- /boot
- /dev
- /lib
- /media
- /mnt
- /opt
- /proc
- /sbin
- /srv
- /sys
- /tmp
- /usr
- /var/cache
- /var/crash
- /var/lib
- /var/local
- /var/lock
- /var/opt
- /var/run
- /var/temp

The lines starting with a dash indicate that those paths should be excluded.  A path with no preceding symbol indicates that the path should be included.
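
The order of the lines matters when an include and an exclude overlap, because rdiff-backup applies the first line that matches a file.  The sketch below writes a small sample list to a scratch file (the file name and the paths in it are made up for illustration, not part of the configuration above) to show the pattern: the more specific include comes before the broader exclude, and a "*" matches within a single directory level.

```shell
#!/bin/sh
# Write a sample filelist to a scratch location so that the real
# /etc/backup-source.conf is left untouched.  The paths are examples only.
SAMPLE_LIST=${SAMPLE_LIST:-/tmp/backup-source.example.conf}
cat > "$SAMPLE_LIST" <<'EOF'
+ /usr/local
- /usr
- /home/*/.cache
EOF
# With this list /usr/local is backed up even though the rest of /usr
# is excluded, because the include line is matched first.
cat "$SAMPLE_LIST"
```

Reversing the first two lines would exclude /usr/local as well, since "- /usr" would then match first.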

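To allow cron to run the script daily, move the script into /etc/cron.daily.  Note that on Debian-based systems run-parts skips file names containing a dot, so there the script may need to be renamed without the .sh extension.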

[root@local]# mv /root/rdiff-backup.sh /etc/cron.daily/

To test the script:

[root@local]# /etc/cron.daily/rdiff-backup.sh

Now you should have a local copy of the files on the webserver in /var/backups/webserver.  If you want the latest files from the webserver, simply copy them from your local backup to your local website root, install a copy of the database and fix some permissions.  You should then have a copy of the website running on your local computer.  If you want an older version of the webserver files, you will need to use rdiff-backup in restore mode to retrieve it from your local backup.
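
As a rough illustration of restore mode, the sketch below wraps two rdiff-backup invocations in small functions, using the same classic command-line style as the rest of this article.  The function names, the var/www sub-path and the target directory are assumptions for illustration; substitute the paths that exist in your own backup.

```shell
#!/bin/sh
# Sketch of restoring older versions from the local backup.
# Names and paths here are illustrative and can be overridden.
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}
REPO=${REPO:-/var/backups/webserver}

# Show which points in time are available in the backup.
list_versions() {
    "$RDIFF_BACKUP" --list-increments "$REPO"
}

# restore_as_of TIME SUBPATH TARGET
# TIME can be an interval such as 3D (three days ago) or a date such
# as 2010-01-31.  SUBPATH is relative to the backup directory and
# TARGET is the directory the restored copy is written to.
restore_as_of() {
    "$RDIFF_BACKUP" --restore-as-of "$1" "$REPO/$2" "$3"
}
```

Usage: restore_as_of 3D var/www /tmp/www-3-days-ago would write the site files as they were three days ago into /tmp/www-3-days-ago, leaving the backup itself unchanged.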
