Have you ever wondered what goes on with CrashPlan behind the scenes? In this article, we'll take a closer look at exactly what happens when CrashPlan is backing up.
Let's start with an example scenario where CrashPlan is using its default settings and is backing up your user home directory.
CrashPlan constantly watches for new and changed files within your home directory with what we call the real-time file watcher. It adds new and changed files to a TO DO list.
Let's say that you created a document called “Letter to Grandma.doc”. The real-time file watcher sees that you've created this document and adds it to the TO DO list for backup.
This is what happens when CrashPlan starts backing up Letter to Grandma.doc:
This process repeats for the next block within the file until CrashPlan has analyzed and backed up the entire file. In this way, only unique information is backed up, which saves bandwidth and storage, and makes restoring faster.
Tech Note: Data is securely encrypted throughout this process.
Tech Note: Data de-duplication occurs on each computer. If you have the same file on two different computers, the file will be backed up twice - once for each computer.
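The block-by-block check described above can be illustrated with a minimal sketch. It is not CrashPlan's actual implementation: the fixed block size, the use of SHA-256, and the names `split_blocks`, `back_up`, and `archive_index` are all assumptions made for illustration, and encryption is left out entirely.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for illustration only; not CrashPlan's real value


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def back_up(data: bytes, known_hashes: set[str]) -> list[bytes]:
    """Return only the blocks that have not been seen before, and remember them."""
    to_send = []
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in known_hashes:  # unique block: it needs to be sent
            known_hashes.add(digest)
            to_send.append(block)
    return to_send


# One index per source computer, mirroring the per-computer de-duplication note.
archive_index: set[str] = set()

first = back_up(b"AAAABBBBAAAACCCC", archive_index)   # initial backup
second = back_up(b"AAAABBBBAAAADDDD", archive_index)  # same file after an edit
```

In this sketch the repeated `AAAA` block is sent only once, and after the edit only the one new block is sent. A second computer would start with its own empty index, so it would send every block again.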
As you're working on your letter and making changes, CrashPlan's real-time file watcher sees that the file has changed and CrashPlan puts the file back into the TO DO list. If your letter is 1 MB, 1 MB is added to the TO DO list. Only the changes are actually sent to the destination, however, not the entire file. The changes are backed up while you work, creating a new version of “Letter to Grandma.doc”.
In this example, you've added a paragraph (highlighted in red):
Tech Note: A new version of the file is backed up every 15 minutes. This interval is controlled by the New Version setting. We recommend keeping this at the default of 15 minutes in most cases.
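As a rough illustration of how a version interval like this could gate new versions, the sketch below checks whether enough time has passed since the last stored version. The function name `due_for_new_version` and the exact comparison are assumptions; only the 15-minute default comes from this article.

```python
from datetime import datetime, timedelta

NEW_VERSION_INTERVAL = timedelta(minutes=15)  # default described in this article


def due_for_new_version(last_version_at: datetime, now: datetime,
                        interval: timedelta = NEW_VERSION_INTERVAL) -> bool:
    """True when at least one full interval has passed since the last version."""
    return now - last_version_at >= interval


last = datetime(2024, 1, 1, 9, 0)
early = due_for_new_version(last, datetime(2024, 1, 1, 9, 10))    # 10 minutes later
on_time = due_for_new_version(last, datetime(2024, 1, 1, 9, 15))  # 15 minutes later
```

Passing a longer `interval` models a backup set whose New Version setting has been raised.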
There are actually two ways that CrashPlan learns about new files or changes to your existing files:
With two methods of identifying file changes, your files are doubly protected - CrashPlan checks for changes twice to make sure your files are backed up.
The real-time file watcher works directly with the tools built into your computer's operating system, which means it is fairly lightweight and can easily work in the background without you noticing.
The scan requires somewhat more system resources than the real-time file watcher. To minimize any impact on your computer, the scan runs at 3 am every day by default, when most people are unlikely to be working at the computer.
Tech Note: On Mac, CrashPlan detects deleted files during the scheduled scan. On other platforms, deleted files are detected in real time.
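To make the idea of a scheduled scan concrete, here is a minimal sketch that compares two snapshots of a folder to find added, changed, and deleted files. It is only an illustration: comparing file size and modification time is an assumption, and `snapshot` and `diff` are hypothetical helper names, not part of CrashPlan.

```python
import os
import tempfile


def snapshot(root: str) -> dict[str, tuple[int, int]]:
    """Map each file path under root to (size, modification time in ns)."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime_ns)
    return state


def diff(old: dict, new: dict) -> tuple[set, set, set]:
    """Return (added, changed, deleted) paths between two snapshots."""
    added = set(new.keys() - old.keys())
    deleted = set(old.keys() - new.keys())
    changed = {p for p in new.keys() & old.keys() if new[p] != old[p]}
    return added, changed, deleted


def names(paths: set) -> set:
    """Reduce full paths to bare file names for readability."""
    return {os.path.basename(p) for p in paths}


with tempfile.TemporaryDirectory() as root:
    letter = os.path.join(root, "Letter to Grandma.doc")
    notes = os.path.join(root, "notes.txt")
    for path in (letter, notes):
        with open(path, "w") as f:
            f.write("draft")
    before = snapshot(root)            # state seen by the previous scan

    with open(letter, "a") as f:       # edit an existing file
        f.write(" plus a new paragraph")
    os.remove(notes)                   # delete a file
    with open(os.path.join(root, "recipe.txt"), "w") as f:
        f.write("flour")               # create a new file

    added, changed, deleted = diff(before, snapshot(root))
    added_names, changed_names, deleted_names = names(added), names(changed), names(deleted)
```

A scan like this has to visit every file, which is why it costs more than a watcher that is notified by the operating system.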
Of course, you probably have more than one file on your computer that you'd like backed up. CrashPlan can't back up all files simultaneously, so how does it decide what to work on first? We designed CrashPlan to back up the newest and most recently changed files first. This ensures that the most recent versions of your files - what you're working on right now - are backed up as soon as possible. The priority order looks like this:
Whenever a file is added or modified, CrashPlan adds it to its TO DO list. Changes are backed up at the frequency set for new versions, which by default is every 15 minutes. You can change this default under Settings > Backup.
Tech Note: If you have very large files that change frequently (such as multiple-GB virtual machine disks) and it seems like backup never completes, try creating a backup set for those large files with a longer New Version interval. This gives CrashPlan more time to back up other files before it needs to back up the changes within the very large file or files.
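The newest-first ordering described in this section can be sketched as a simple sort on modification time. This models only the "most recently changed first" rule stated above; any further rules are not modeled, and `PendingFile` and `backup_order` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PendingFile:
    path: str
    modified_at: float  # seconds since the epoch


def backup_order(todo: list[PendingFile]) -> list[PendingFile]:
    """Order the TO DO list so the most recently modified file comes first."""
    return sorted(todo, key=lambda f: f.modified_at, reverse=True)


todo = [
    PendingFile("old_photos.zip", 1_000.0),
    PendingFile("Letter to Grandma.doc", 3_000.0),
    PendingFile("budget.xls", 2_000.0),
]
order = [f.path for f in backup_order(todo)]
```

Here the letter you just edited is backed up before the older spreadsheet and archive.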
We always recommend backing up to multiple destinations for the fastest backup and restore and for the best protection. But exactly how does multi-destination backup work?
CrashPlan prioritizes backup activity to ensure all your selected files are completely backed up at one destination before starting backup to another, redundant destination. To accomplish this, CrashPlan backs up to destinations which it determines should complete fastest:
For example, if you are backing up to a local folder and to a friend's computer, CrashPlan completes backup to the local folder before backing up over the Internet to your friend's computer. Once backup to a destination completes (or if that destination becomes unavailable for any reason), CrashPlan backs up to the next destination.
Tech Note: You don't have to wait for the entire backup to complete to restore a file. As soon as a file is backed up and appears on the Restore tab, it is available for you to restore.
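The destination ordering described above can be sketched as picking, at each step, the available and incomplete destination with the shortest estimated time to finish. How CrashPlan actually estimates completion time is not stated here, so `est_seconds_remaining` is a made-up input, and `Destination` and `next_destination` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Destination:
    name: str
    est_seconds_remaining: float  # hypothetical estimate of time to finish
    available: bool = True
    complete: bool = False


def next_destination(destinations: list[Destination]) -> Optional[Destination]:
    """Pick the available, incomplete destination expected to finish soonest."""
    candidates = [d for d in destinations if d.available and not d.complete]
    return min(candidates, key=lambda d: d.est_seconds_remaining, default=None)


local = Destination("Local folder", est_seconds_remaining=600)
friend = Destination("Friend's computer", est_seconds_remaining=7200)
destinations = [friend, local]

first = next_destination(destinations)   # local folder should finish soonest
local.complete = True
second = next_destination(destinations)  # move on to the friend's computer
friend.available = False
third = next_destination(destinations)   # nothing left to back up to right now
```

Marking a destination as unavailable has the same effect as completing it: the selection simply moves on to the next candidate.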
If you choose to enable Backup Sets, you can specify backup priority. The goal is still to back up all your files to at least one destination first. Then, CrashPlan works on redundancy to back up your files to additional destinations. As long as one destination in the set is complete, CrashPlan moves on to back up other, less complete sets. When you have Backup Sets enabled, CrashPlan follows these rules:
The backup process always works the same, whether you have one computer or ten computers in your account.
CrashPlan treats each computer within your account separately and each computer's backup is stored separately at each destination. Each source computer stores its backup in its own folder. The folder name is the source computer's ID.
In this example, John and Michelle are backing up to Chris. This is what it looks like on Chris' computer:
Storing each computer's backup in its own folder enables seeding, which is the process of backing up locally and then physically transporting the backup drive to an offsite destination, such as a friend across town or to our data centers. Seeding is a great way to kick-start a large backup to an offsite destination and saves on bandwidth because the backup does not go over the Internet.
Occasionally, CrashPlan's data de-duplication needs to re-scan your files to see what's already been backed up. When this happens, it looks like CrashPlan is backing up all your files from the beginning, but it is actually reviewing each block to see what's been backed up already. How can you tell that this is the behavior you're seeing?