Many programs exist for creating backups of your important data in different ways. Common methods include backing up an entire partition of your hard drive to some image format, or simply copying files, whether as incremental, differential or full backups.
I am not interested in making a complete backup of my entire drives, mostly because it would require too much space, so that option is out. I want to create backups of a specific set of files and folders from my hard drives.
I also want rotating backups. Back in the day, when tapes were used for backups, it was common to use e.g. one tape for Mondays, another for Tuesdays, the next for Wednesdays and so on, until reaching Monday again. Then the tape from last Monday would be reused, creating a rotation of seven backups.
The advantage is that you can retrieve the data as it was before any modification made during the last seven days. This matters because you may not notice corrupt or missing data as soon as the problem appears. And of course you could make backups at a different frequency, whatever suits your needs.
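As a rough sketch of the rotation idea, the slot to overwrite can be derived from the date. The function name `slot_for` and the choice of Python are only for illustration; any scheme that cycles through a fixed number of slots would do.

```python
from datetime import date

SLOTS = 7  # one slot per day of the week, as in the tape example


def slot_for(day: date, slots: int = SLOTS) -> int:
    """Return the index of the backup slot to overwrite on the given day.

    Using the day's ordinal number modulo the slot count means the same
    slot comes around again exactly `slots` days later.
    """
    return day.toordinal() % slots


if __name__ == "__main__":
    print("Today's backup goes into slot", slot_for(date.today()))
```

With seven slots, each weekday always maps to the same slot. A different value for `slots` gives a longer or shorter history.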
To make a backup, there are essentially three methods available: full, incremental and differential.
The full backup is exactly what it sounds like: copying all the files and folders from one drive to another. The drawback is obvious: space requirements. If we want a rotation of seven snapshots, we need seven times the space of our original drive. It will also take a long time to complete the backup, since every file needs to be copied, whether it has changed since the last backup or not.
Incremental backups start by making one full backup. For every backup after that, only the files that have changed since the last backup are copied. This gives you optimal speed and space requirements. However, restoring files from this scheme is a hassle. To restore a backup completely you need to copy files from the last full backup and from every incremental backup made since then.
Differential backups work like incremental backups, but they copy all files that have changed since the last full backup. This means you only need to access the last differential backup in addition to the last full backup to make a complete restore. However, space requirements are higher than for incremental backups, since each differential backup again copies files that earlier differential backups already contain.
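The difference between the three methods comes down to which reference point a file's last change is compared against. Here is a minimal sketch, assuming that changes are detected by comparing modification times (real backup tools may use other mechanisms); the function name and the sample timestamps are made up for illustration.

```python
def files_to_copy(mtimes, method, last_full, last_backup):
    """Return the sorted names of the files a backup run would copy.

    mtimes      -- dict mapping file name to modification time
    method      -- "full", "incremental" or "differential"
    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if method == "full":
        return sorted(mtimes)  # everything, changed or not
    if method == "incremental":
        cutoff = last_backup   # changed since the previous backup
    elif method == "differential":
        cutoff = last_full     # changed since the last full backup
    else:
        raise ValueError(f"unknown method: {method}")
    return sorted(name for name, t in mtimes.items() if t > cutoff)


if __name__ == "__main__":
    # Hypothetical timeline: full backup at t=10, an incremental at t=20.
    mtimes = {"a.txt": 5, "b.txt": 12, "c.txt": 25}
    for method in ("full", "incremental", "differential"):
        print(method, files_to_copy(mtimes, method, last_full=10, last_backup=20))
```

In this example the incremental run copies only `c.txt`, while the differential run copies both `b.txt` and `c.txt`, which is why it needs more space but makes restoring simpler.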
The Desired Functionality
The functionality I want from a backup system is to have access to the entire filesystem as it looked at the time of backup, from one location, in addition to having the rotation scheme described above. In essence, I want it to look like I have made X number of full backups, but I want the space requirements of the incremental backups.
I also want to store my backups on another hard drive. It is probably the cheapest way to make backups today, considering the very low prices of hard drives.
I found a great solution to this a while back in the following post: http://www.mikerubel.org/computers/rsync_snapshots/, using the Unix tool rsync and hard links.
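To show why hard links give full-looking snapshots at incremental cost, here is a minimal sketch in plain Python. It does not use rsync, and it is not the script from the linked post. The directory names `snapshot.0`, `snapshot.1` and so on, the function names, and the rule "same size and same modification time means unchanged" are all assumptions made for this example.

```python
import os
import shutil
from pathlib import Path


def unchanged(src: Path, old: Path) -> bool:
    """Treat a file as unchanged if size and modification time both match."""
    if not old.is_file():
        return False
    a, b = src.stat(), old.stat()
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def take_snapshot(source: Path, backup_root: Path, keep: int = 7) -> Path:
    """Rotate the existing snapshots and create a fresh snapshot.0."""
    backup_root.mkdir(parents=True, exist_ok=True)

    # Drop the oldest snapshot, then shift the others up by one.
    oldest = backup_root / f"snapshot.{keep - 1}"
    if oldest.exists():
        shutil.rmtree(oldest)
    for i in range(keep - 2, -1, -1):
        current = backup_root / f"snapshot.{i}"
        if current.exists():
            current.rename(backup_root / f"snapshot.{i + 1}")

    previous = backup_root / "snapshot.1"
    newest = backup_root / "snapshot.0"
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (newest / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            src = Path(dirpath) / name
            old = previous / rel / name
            dst = newest / rel / name
            if unchanged(src, old):
                os.link(old, dst)       # no extra space: same data on disk
            else:
                shutil.copy2(src, dst)  # new or changed: real copy
    return newest
```

Every snapshot directory then looks like a complete copy of the source, but a file that has not changed occupies disk space only once. Deleting an old snapshot does not affect the others, since the data stays on disk until the last hard link to it is removed. Hard links only work within one file system, so all snapshots must live on the same drive.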
Originally I wrote a Perl script and used it on a Linux machine to run rsync to pull files from an SMB share on my Windows machine. However, this was not a very good solution, because I would have to set up a share for each folder I wanted to back up, and maintaining that became too much of a hassle.
The next step was the more sensible one: to use rsync on the Windows machine (it is available as a command-line application for Windows as well), and have it push the files onto the Linux machine using the rsync daemon on Linux.
Of course, I could just as well use rsync locally to copy files from one hard drive to another, e.g. an external one, and keep all the functionality with hard links and so on.
However, I noticed that rsync tends to mess up file permissions and owners on NTFS file systems at times, and I have not been able to find a solution to this.
What is worse, whenever I forget to close e.g. Outlook before making the backup, it fails to copy the outlook.pst file, because the file is open. The same thing happens with a number of other files.
The Windows Volume Shadow Copy Service
The solution to this last problem has nothing to do with rsync, but rather with Windows itself. Enter the Windows Volume Shadow Copy Service, or VSS for short (not to be confused with Visual SourceSafe). It allows you to create a snapshot of a hard drive as it looked at an instant in time, and from that snapshot you can copy all the files you want. A very good article describing a method for using this is http://mutable.net/blog/archive/2006/11/13/an-intelligent-backup-system-for-windows-part-1.aspx.
That article provides pretty much the exact solution I want. But being a programmer, of course I want to write my own: a nice application with a GUI that is simple to use, provides powerful filtering capabilities (similar to those of rsync), uses VSS when possible, and so on.
Writing such an application in .NET would be nice. But as mentioned in several posts on the web, VSS is not readily available in the .NET framework. And, even worse, normal P/Invoke interop doesn’t seem to work either. There is a long discussion about this here: http://www.pluralsight.com/community/blogs/craig/archive/2006/09/20/38362.aspx
One solution would be to use a command line interface to the vshadow.exe sample application, but that is not really a good solution for creating a robust application.
So I decided to create a custom wrapper for VSS in managed C++, exposing its interface to .NET. Actually I tried this once before and failed, but this time I seem to have been successful. I will post the code here shortly. It is not very thoroughly tested yet, but everything I’ve tested so far seems to work. And it exposes pretty much the entire VSS API! So bookmark this site, and/or leave a comment if you find this interesting!