Oct 152012

Backups are funny things … everyone says that they’re important, but actually it is restores that are important. Backups merely make restores possible. Because restores are so infrequent (and frankly backups so boring) there are far too many of us who do not spend enough time making sure everything is backed up properly.

This is not a blog entry saying how backups should be performed in the technical sense … I’m not going to suggest you use rsync to an offsite cloud storage provider, or how frequently the backups should be made. But rather a random wandering around the problems of backups.

Before I forget, I will be using a shortcut: When I say “to tape”, it merely means copying the backup to whatever medium you use. I’ve merely been doing backups long enough that “to tape” just flows naturally. In fact my current backup media is an external hard disk attached to an off-site server.

What Do You Want To Restore?

Before you can decide what you need to backup, you need to decide what you might want to restore. And what types of data you might want to restore. You can split up restores into three broad categories – the system data, the user data, and any databases there might be. Each has slightly different requirements – you might well back them all up in the same way, but it may also be better to back them up in three different ways.

Restoring Systems Data

If you backup the operating system of your computer using the same mechanism intended for backing up user data, then in a disaster situation you will be faced with the interesting situation that the data you need to make your computer functional again will be sat on a tape, an external disk, or (even worse) sitting on a patch of this cloud stuff that is currently so trendy.

Whilst this is perhaps not the ideal situation to be in, it is also not the end of the world. At least if you are not pushed for time to get the computer back up and running again.

There are basically two options here – either a dedicated computer imaging solution that clones your operating system disk in some way, or to use the original installation DVDs as a restoration method. The later may be the cheap option, but it does work to the extent of at least getting you going again. And of course lets you into those other backups you have made.

The decision on which to go for comes down to time – how long would it be acceptable for you to be without your data? Bearing in mind that the restore can only start after you have the hardware to restore it to.

Restoring Databases

When it comes to restoring from backups, databases can be a touch fragile. However it is worth pointing out that by this I mean real databases such as Oracle, PostgreSQL, or MySQL rather than database applications such as Microsoft Access. If you just copy the files making up a database to a backup tape, then the result can probably be restored but you may well end up with a corrupt database, and it may be missing some data.

The classic method of backing up a database is to shut it down, and then copy the files to tape. That is a pretty safe way of doing things, but if you are trying to run a 24×7 service, then it is rather inconvenient. That is not to say it is not still a good method – simply accept the need to shutdown the database once a week and concentrate on methods to minimise that downtime (filesystem snapshots work brilliantly here).

There are also database specific methods of generating a backup whilst the database is running. The lowest common denominator here is the equivalent of an export – the database generates a whole bunch of SQL commands which when run re-create the database. These methods can be used in combination with the old shutdown and copy to tape mechanisms to double your chances of getting a good backup.

And indeed allow you to minimise the disruption by only performing a shut down and copy to tape less frequently than every night.

Of course you probably do not have a database at home that you need to keep running 24×7, but some people will. But even if you don’t care how often the database gets taken down to perform a backup, you still should spend some time making sure that your database is backed up properly. It is too late to check when you are trying to perform a restore.

Oh! And if you do shut down your database to back it up, please remember to start it again afterwards!

Restoring Ordinary Files

Backing up ordinary files is definitely on the most boring side of backups. But for most of us, they are the most important backups we perform; as more and more of our important data becomes digital rather than physical, we need to be sure that our digital data is safe. And safe does not mean just safe from the odd hard disk failure, but from disasters such as house fires!

Or from foolish decisions to delete the wrong files, etc. We tend to assume that restores are performed after disasters such as the aforementioned hard disk crash, but in practice – at least in an organisation with a team responsible for performing restores – files are restored almost continuously and far more frequently than hard disk crashes.

You can choose to backup everything – which means you can be sure that you have everything you need to restore in an emergency, although it can be a lot slower as you will be backing up a lot of “junk”, or you can be very selective in what you backup which makes things a lot quicker, but there is always the danger that something important will be lost.

Or you could do both! It is perfectly sensible to backup only the most important files every day – perhaps to DropBox – and then do a full backup once a week.

One thing to look for is something along the lines of Apple’s Time Machine; there are approximate equivalents for Windows and Linux, and the key advantage that all of them has, is the ability to perform differential backups which means that only the changed files are copied. My own backup script ran last night and ‘refreshed’ a backup of nearly 500Gbytes in about 7 minutes (and that was to a very remote server).

And use those backups! Checking whether the backups have worked or not is another tedious job, but using yesterday´s files is far less tedious.

A Few Misconceptions

  1. RAID isn’t a backup method. You can mirror your hard disks (I do), but that merely reduces the probability of a hard disk crash causing you to reach for the backup tapes. That is not to say that it is worthless, but you still need to perform backups.
  2. Tape isn’t dead. It may well be too expensive for home use, but tape is still a perfectly sensible way of keeping backups. It can be “enhanced” in various ways such as snapshots to give the impression of backups being performed very quickly, or a disk buffer to keep the most recent backup online.
  3. Cloud backup solutions are cool, but not without issues. For a start you have to worry about the legal aspects (especially if you are a business), such as whether the backups are stored within an acceptable jurisdiction. In addition, what happens if the cloud storage provider goes out of business for some reason ? There are quite a few people who could tell you the problems of using certain cloud storage providers who have for one reason or another ceased operation.
  4. A backup on the same disk as the source files may well be a poorly considered option. After all, it will not help you if the hard disk goes “bang”. But it could be quite a good supplementary option to another method. Similarly an external disk is all very well, but will not help you much in the case of a house fire.
  5. Whatever backup method you choose is subject to failure. The external hard disk that fails just when you need it, the encrypted cloud backup where you’ve forgotten the passphrase because it’s held in a password store on the disk you’re trying to recover, etc. Having multiple backup destinations is worth considering especially when so many cloud storage providers are giving so much space away for free.