Data Expedition, Inc.

Seth Noble


Tip History
Mar 2012  Minor Updates
Jan 2011  Major Update! Reformatted
Sep 2005  Reformatted
Feb 2004  Reformatted
Feb 1997  Original Article

Backups

This article is focused on OS X, but generally applies to all operating systems.

Introduction

I have been a fervent believer in backups since I accidentally wiped out three months' worth of work on an Apple BASIC database application I had been writing back in middle school.  Over the years I have worked with a lot of different computer systems and had to do disaster recovery on several occasions.  I've come to very much appreciate backup methods that are accurate and reliable, especially when it comes to recovery time.

So here's my advice on backups.  It has changed dramatically over the years as software and hardware have evolved.  The biggest change is that external hard drives are now cheap, easy, and large.

The first question is what hardware do you intend to use for your backups? In all cases, your backup media should be portable enough to be stored as far away from your computer as practical, in a cool, dry, dark, fire-proof place, and not near anything electric or magnetic.

The second question will be what software to use.

Or, you can just skip straight to my recommendations.


Hardware

The main issue here is price versus reliability.  Remember, it doesn't do any good to make a backup if you can't recover the data when you need it.

Whatever hardware you use, redundancy is key.  You should always have TWO backups and always keep one of them as far away from your computer as possible.  That's because whenever your backup hardware or media is connected to the computer, it is vulnerable to all the same disasters that might cause you to need a backup.  Likewise, if your system has any problems at the time of the backup (like a corrupted file that you haven't noticed yet), your backup will have those problems too.  The best policy is to alternate between at least two different media.  For example, change hard-drives or media every month or so, thereby ensuring that at least one copy is always safe.
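To make the rotation concrete, here is a minimal Python sketch of a monthly alternation between two drives.  The drive labels and the `drive_for` function are hypothetical names used only for illustration.  Any scheme works as long as one copy is always disconnected and stored away from the computer.

```python
from datetime import date


def drive_for(day, drives=("Backup A", "Backup B")):
    """Return the drive that should be connected during the month containing `day`.

    The labels are placeholders.  Counting months from year zero keeps the
    alternation going across the December/January boundary.
    """
    months = day.year * 12 + (day.month - 1)
    return drives[months % len(drives)]


if __name__ == "__main__":
    for month in (11, 12):
        print(date(2012, month, 1), drive_for(date(2012, month, 1)))
    print(date(2013, 1, 1), drive_for(date(2013, 1, 1)))
```

Whichever drive the function does not name for the current month is the one that should be sitting on a shelf somewhere else.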

• External Hard Drives

This is the most cost-effective solution for backing up a single system, or even a small number of systems.  For not much money you can get an external drive that is larger than your system's internal drive.

For a single system, connect the drive directly to the computer.  Direct attached storage is much faster, more reliable, and will avoid disrupting your network connection during backups.  eSATA or FireWire will give you the fastest backups, for an extra price.  But if your software will be doing live backups (while you are using the computer for other things), you may want to stick with USB.  In addition to being cheaper, USB will slow down the backup so that it does not interfere as much with your use of the main hard drive.

Network attached storage is useful when backing up multiple systems to a single drive, if your software allows that.  But try to avoid self-contained NAS appliances: you do not want the backup drive tied to a particular vendor and you want to be able to rotate between at least two drives.

PROs:
  • Very fast.
  • Inexpensive.
  • No other hardware required.
  • Moderate shelf life (several years if properly cared for).
CONs:
  • Slightly fragile.
  • Vulnerable to usual drive failure issues.

• DVD-R / CD-R

Most systems have a DVD-R or CD-R drive, and the media is moderately priced, so this is an easy way to ensure that critical data is backed up.  But the limited storage capacity and limited shelf life mean that this is no substitute for a comprehensive backup strategy.

PROs:
  • Moderate cost.
  • Hardware already available in most systems.
  • Can keep old disks for redundancy.
  • Better than nothing for critical files.
CONs:
  • Only a few gigabytes for DVDs, less for CDs.
  • Deciding what you REALLY want to back up takes time.
  • Media has limited shelf life (months to a few years).
  • Slow.

• Flash Drives

As flash drives become larger and less expensive, they are becoming a viable option for backups.  But their high cost per gigabyte and limited storage make them useful only for backing up critical files, much like DVD-Rs or CD-Rs.  It is also not yet clear how long flash memory can be safely stored, and there are few if any options for recovering corrupted flash memory.

PROs:
  • Small.
  • Portable.
  • Better than nothing for critical files.
CONs:
  • Expensive per gigabyte.
  • Limited capacity.
  • Deciding what you REALLY want to back up takes time.
  • Unknown shelf life.

• Zip, Jaz, DVD-RAM, ORB, etc.

Do they even make these anymore?  Many removable media formats have fallen aside due to extremely poor reliability.  When in doubt, assume any removable media not otherwise mentioned here is simply too unreliable to use for backups.

• Tape Drive

Tape drives are the oldest storage technology still in (somewhat) common use and still represent the best option for large scale backups.  Before external hard-drives became so inexpensive, this was my preferred method for backing up even single systems.  Today, the cost per gigabyte of tape hardware and media is much too high to be a valid option for small or even medium scale users.  Tape today is mainly used with automated tape library systems where there is a requirement for long term archiving of the data.

The type of tape system REALLY, REALLY matters.  Cheap tape drives and media are worse than useless as they just give you a false sense of security.  For example, the drive head often becomes misaligned over time, so that no other drive can read the tapes it has written!

Current reputable systems are Mammoth (Exabyte 8mm), DLT (Quantum), LTO (HP, IBM, Seagate), and AIT (Sony).  For maximum media reliability and recovery, I recommend VXA (Exabyte, formerly Ecrix).  (VXA drives can recover data from some amazingly abused or neglected tapes.)  For best results, you need four sets of tapes: two for monthly full backups, and two for automated daily incremental backups.  Don't forget to clean your drive heads as often as the manufacturer recommends and buy new tapes every year or two.

PROs:
  • Phenomenal shelf life (decades if properly cared for).
  • Automated library systems for extreme capacity.
  • Easy to grow.
CONs:
  • Very high initial cost.
  • Expensive media.
  • Slow.
  • Requires dedicated backup software.
  • Initial setup can be difficult.

• Online / Cloud

As network speeds have improved, the idea of backing up your system to online storage has become increasingly viable.  The main advantage and disadvantage are the same: someone else takes care of your data.  Speed and reliability are the biggest challenges, particularly since you will likely be locked into a single vendor.  Security should also be a concern.  Online storage makes the most sense when backups need to be made or accessed from multiple locations.  But for most small and medium scenarios, the slow speed and general risks make this option unattractive.

PROs:
  • No hardware hassle.
  • Available from multiple/any locations.
  • Moderate to low cost.
  • Easy to grow.
CONs:
  • Slow.
  • Locked into the provider and/or their software.
  • Cannot control reliability.
  • Cannot control security.

Software

For Mac OS X, the choice of software has been radically simplified since the introduction of Time Machine: Just Use Time Machine.  For everyone else, or if Time Machine doesn't fit your needs, I'll include some information about other choices.

The main challenges in selecting backup software are making sure that it will work with your chosen hardware/media, and making sure that its method of recovery fits your needs.

Not long ago, the accuracy of backups was a big issue, especially for Mac OS X.  These days, few applications use the legacy HFS metadata that used to cause problems, while nearly all backup software now supports it.  Ironically, Windows now has some of these metadata problems since the introduction of multiple file streams in NTFS.  Very few Windows applications make use of NTFS streams and even fewer backup applications support them.  I'm not aware of any that do, but it is an issue to keep in mind.

The chart below compares features of several Mac-centric backup options.  I first made this chart back in 2006 when there were no good solutions for Mac OS X.  Many of the issues are no longer important, due largely to the abundance of cheap external hard-drive storage.  Note that this chart is only considering system backups: if all you want to do is back up your own documents, then almost anything that claims to do backups will work.

The features compared are:

  • Live Backup: Able to backup the drive of a running system
  • Bootable Backup: Backup drive is able to boot
  • Bootable Restore: Restored system disk is able to boot
  • File Selection: Only copy files with specified attributes
  • Error Tolerance: Backs up what it can, even if some files have problems
  • Overall Reliable: Runs without crashing or freezing
  • Easy to Use: Obvious how to make it work
  • Fast: Makes efficient use of hardware
  • Compression: Able to compress backups
  • Tape Drives: Writes directly to tape
  • Folder Dates: Preserves folder date stamps
  • Locked Files: Preserves Locked/uchg attribute
  • Pipes: Backs up "p" files
  • Sockets: Backs up "s" files

Feature           Retrospect  Time Machine                       Disk Utility              SuperDuper  ditto / Carbon Copy Cloner
Live Backup       YES         YES                                NO                        YES         YES
Bootable Backup   NO          NO                                 YES                       YES         YES
Bootable Restore  YES         YES                                YES                       YES         YES
File Selection    YES         Limited                            NO                        Limited     Limited
Error Tolerance   YES         NO                                 ?                         NO          YES
Overall Reliable  NO          YES                                Easily Confused*          YES         YES
Easy to Use       NO          YES                                Backup: Yes; Restore: No  YES         YES
Fast              NO          YES                                YES                       YES         YES
Compression       YES         NO                                 YES                       YES         YES
Tape Drives       YES         NO                                 NO                        NO          NO
Folder Dates      YES         Full Restore: Yes; Individual: No  YES                       YES         Sometimes
Locked Files      NO          YES                                YES*                      NO          NO
Pipes             YES         Unknown                            YES*                      NO          NO
Sockets           NO          Unknown                            YES*                      YES         NO

* Disk Utility becomes easily confused, requiring that all targeted disks be unmounted and the program quit-and-restarted.  Making an image is very easy, and all the attributes are saved.  But restoring requires that you first "scan image for restore", then restore using "Erase Destination".

All of the utilities above preserve resource forks, permissions, user/group IDs, and all the other stuff that is mandatory for accuracy.  But the only way to make a perfect backup is to unmount the drive and make a disk image using Apple's Disk Utility.  This is fine for removable disks.  It is not bad for a system that can be shut down into target disk mode and imaged from a different Mac, or rebooted from a system disk.  But for a stand-alone machine or a server that needs to keep running, that just doesn't work.  Time Machine is, by far, the best system for creating live backups of a working system.

• Time Machine Tips

Since Mac OS X 10.5, this has been Apple's preferred method of backing up a system.  The phenomenal ease of use combined with inexpensive external hard-drives makes this a no-brainer for Mac owners.  You may have needs which could benefit from supplementing Time Machine with other options, but if you have a Mac, you should just get a couple of big USB drives, plug them in, and let it do its thing.

Time Machine works by first copying all the files on a volume to a special directory on the backup drive.  It will automatically skip certain system cache and temporary folders and you can tell it to skip others.  Do you have folders which usually contain large amounts of temporary data, such as intermediate video encodings, or do you use applications like Photoshop which create a lot of temporary files?  If so, make sure to click "Options" and exclude those folders.

After the initial backup, which can take a LONG time, only changed files are copied to the disk.  Existing files are preserved by using a Unix filesystem feature called a hard link.  This allows a file to simultaneously exist in multiple directories.  The result, as Time Machine runs over time, is that multiple folders are created, each with a complete snapshot of the entire drive.  You can actually browse these files without Time Machine, which can be very handy if you just need to take a quick peek or want to search for something without going through Time Machine's hefty interface.
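
The hard-link idea can be illustrated with a short Python sketch.  This is not Time Machine's actual implementation, just the general technique under simple assumptions: the `snapshot` and `demo` functions are hypothetical names, files are compared by content, and the scratch filesystem supports hard links.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source, previous, target):
    """Copy `source` into `target`, hard-linking files unchanged since `previous`."""
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            new = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: a second name for the same data
            else:
                shutil.copy2(src, new)  # new or changed: store a fresh copy


def demo():
    """Take two snapshots in a scratch directory; report which files share storage."""
    root = tempfile.mkdtemp()
    try:
        src = os.path.join(root, "source")
        os.makedirs(src)
        for name, text in (("a.txt", "stable"), ("b.txt", "draft 1")):
            with open(os.path.join(src, name), "w") as f:
                f.write(text)
        snap1 = os.path.join(root, "snap1")
        snap2 = os.path.join(root, "snap2")
        snapshot(src, None, snap1)
        with open(os.path.join(src, "b.txt"), "w") as f:
            f.write("draft 2")          # only b.txt changes between snapshots
        snapshot(src, snap1, snap2)
        return {
            name: os.path.samefile(os.path.join(snap1, name),
                                   os.path.join(snap2, name))
            for name in ("a.txt", "b.txt")
        }
    finally:
        shutil.rmtree(root)
```

Calling `demo()` reports, for each file, whether the two snapshots point at the same stored data.  The unchanged file is shared between both snapshot folders, while the edited file gets its own copy in the second one, yet each folder still looks like a complete copy of the source.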

I could write a book about Time Machine's quirks: it has a lot of them and it can be maddening to use at times, but here are a few tips to help you avoid most problems:

  • Decide now whether you will use direct or network attached drives.  Network Time Machine backups use a different format than direct attached backups and you cannot convert between them.  Network backups are convenient if you have several computers and don't want to juggle multiple external drives for each, but they are slow.
  • To use a network volume for Time Machine backups, simply mount that volume via AppleShare and then select it as the destination in Time Machine.  This works very well with AirPort Base Stations.
  • I recommend against using Time Capsule: a hard-drive built into an AirPort Base Station.  You are better off getting a plain AirPort Base Station and connecting external USB drives.  That way you can rotate, upgrade, and move them as needed.
  • Make sure the drives are significantly bigger than the data you are backing up.  There is no compression with Time Machine and any change to a file (even just changing its name) will cause a new copy to be written to the backup drive.  More space on the backup drives means Time Machine will keep older archives for longer.
  • For multiple systems, make sure they all get backed up regularly.  When a backup disk fills up, Time Machine will try to recover space by deleting the oldest backups... but ONLY from the volume it is currently backing up.  So if one system is hogging all the space, others may get short-changed.  This also means that if you get rid of a volume that you had been backing up, you will need to manually remove its backup image to reclaim its space.
  • For direct attached drives, use USB instead of FireWire or eSATA.  In addition to being cheaper and more compatible, it will slow down the live backups and reduce their interference with your use of the main drive.
  • Time Machine performs backups hourly or manually, nothing in between.  For direct attached USB storage, the hourly backups should not noticeably interfere with your use of the computer.  But if you are working with very large files or are using network attached backup disks, it may cause unacceptable slow-downs.  Third-party utilities like Time Machine Editor can be used to set custom backup schedules.  However, custom scheduled backups sometimes won't run if the computer does not happen to be on at the scheduled time, so make sure to check the Time Machine menu item and do manual backups as needed.
  • If you are using remote desktop, screen sharing, or VNC then you may have trouble with Time Machine's recovery interface.  If you find that you can't see any of the graphics, select a file and type Command-I.  Then drag the info window around the screen.  This will force the elements to refresh and give you a chance to interact with them.
  • If Time Machine encounters errors, the backup or restore operation simply stops.  Fortunately, this doesn't happen often and it's pretty in-your-face with the error messages.  But if you are running Time Machine on a remote or unattended system, you will need to check on it often to make sure it is still working.
  • If Time Machine reports file errors and is repeatedly unable to complete a backup, use the Console utility to look in the system.log and identify which files are causing problems.  Try renaming the files or the folders containing them and run the backup again.  This will force Time Machine to recopy the files instead of trying to link to the old ones, which can resolve some sticky problems.
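
For the remote or unattended case in the tips above, one rough way to notice that backups have stopped is to look at the snapshot folder names on the backup drive.  The sketch below assumes those folders are named with timestamps like 2012-03-14-153012.  The function names and the 24-hour threshold are hypothetical choices for illustration, and the list of names would come from listing the machine's folder on your own backup drive.

```python
from datetime import datetime, timedelta


def latest_snapshot(names):
    """Return the newest timestamp among snapshot folder names, or None.

    Names that do not parse as YYYY-MM-DD-HHMMSS (for example a "Latest"
    link or an unfinished ".inProgress" folder) are ignored.
    """
    stamps = []
    for name in names:
        try:
            stamps.append(datetime.strptime(name, "%Y-%m-%d-%H%M%S"))
        except ValueError:
            continue
    return max(stamps) if stamps else None


def is_stale(names, now, max_age=timedelta(hours=24)):
    """True if there is no completed snapshot newer than `max_age` before `now`."""
    newest = latest_snapshot(names)
    return newest is None or now - newest > max_age


if __name__ == "__main__":
    folders = ["2012-03-01-090000", "2012-03-02-101500", "Latest"]
    print(latest_snapshot(folders))
    print(is_stale(folders, now=datetime(2012, 3, 4, 12, 0, 0)))
```

A check like this could be run from a scheduled job and send a warning when `is_stale` returns true, so a stalled backup on a server does not go unnoticed for weeks.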

The biggest disadvantage to Time Machine is that the backup instances are not bootable.  If you want to create a bootable image, I recommend SuperDuper or Carbon Copy Cloner in addition to Time Machine.


Recommendations

For a home user who would not otherwise do backups at all and has no money:
  • Decide which files are most important to you and drag them to new CD-Rs or DVD-Rs every month or so.
For the average home user whose data is worth spending a few bucks to preserve:
  • Use Time Machine and rotate two USB hard-drives, if you have a Mac.
  • If you are stuck with Windows, try Retrospect.
What I use at work:
  • Time Machine to direct attached USB hard-drives, rotated monthly.
  • Time Machine Editor limits server backups to three times per day.
What I use at home:
  • Time Machine to an AirPort Base Station with USB hard-drives, rotated monthly.
  • Time Machine Editor limits server backups to once per day.