This blog is highly personal, makes no attempt at being politically correct, will occasionally offend your sensibilities, and certainly does not represent the opinions of the people I work with or for.
Aion - Backup Evolved

The last time I talked about backups, it was just after I had made an improvement to the Macintosh's Time Machine. Today, I am going to present aion, a procedure I designed last summer (well before my improvement of Time Machine) but that, for some reason, I wasn't using, and that I will be using from now on...

First, let me remind you of the specifications Time Machine had to meet:

  1. Provide dated snapshots of Galaxy.
  2. Use content-addressable storage.
  3. Make snapshots readily available as a regular file tree (no unpacking of the data required to consume it, not even through a virtual file system).

The little problem I ran into using time machine (Pascal version) is that, despite using space efficiently (hard linking to a datablock repository), it had the following two problems (only the second one really matters):

  • It does consume space at each snapshot (since the directory nodes themselves -- lots of them -- do take up space on the drive).
  • More importantly, backing up Venus (my primary backup drive) onto Venus Prime (the backup of Venus) does take time, and in fact takes more and more time. Indeed, I cannot just copy over the newly created folders; I need to use rsync to make sure that the hard links are preserved (otherwise I would just blow up space on the target drive, defeating the whole purpose of hard links). But, from the point of view of rsync, despite the inodes being few (relatively speaking), there are still lots of files to process, an increasing number of them: a few hundred thousand more files every time I make a new snapshot.

Aion meets condition 1, meets condition 2 (and, as we will see, uses space even better than time machine), but breaks condition 3. Breaking condition 3 is, in fact, fine for me since we are talking about archives, and not my true backup; the latter being your regular rsync from the laptop to the drive (at another location than where time machine keeps its data). On the other hand, aion doesn't need to use hard links.

Aion manipulates three kinds of objects:

  1. Datablocks, up to 1 MB. They are stored the exact same way time machine stores them: datablocks stored as files named after their hash (sha1).
  2. File objects: they are JSON objects of the following grammar:
        "aion-type"      => "file",
        "version"        => 1,
        "name"           => (string),
        "size"           => (integer),
        "hash"           => (string),
        "contents-parts" => (string)(s)
    The only non-obvious part is "contents-parts", an array of strings (this is what "(string)(s)" means...) that are nothing else than hashes of 1 MB portions of the file. When you want to rebuild a file from a file object, you take the list of hashes from this array, extract the datablocks, put them together, and name the resulting file after the "name" key. The "size" key is the size of the complete file, and the "hash" key is the hash of the complete file (this allows you to check that you have correctly rebuilt the file from its fragments). Note that once the file object is created, it is serialised and stored in the datablock repository along with any other file binary data. This last fact is also true for the next object...
  3. Directory objects: they have the form
        "aion-type" => "directory",
        "version"   => 1,
        "name"      => (string),
        "contents"  => (aion-object's hash)(s)
    Here again, the non-obvious part is the "contents" key, which contains the hashes of the contents of the directory. Those hashes are hashes of file objects or directory objects.
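To make these shapes concrete, here is a minimal sketch of how a file gets cut into datablocks and committed. This is not aion's actual code: the language (Ruby), the flat repository layout, and the `commit_datablock` helper are all assumptions on my part.

```ruby
require 'digest'
require 'json'

BLOCK_SIZE = 1024 * 1024  # datablocks are at most 1 MB

# Store a blob in the repository under its sha1 name; return the reference.
# Committing the same data twice is a no-op, which is what makes the
# storage content addressable.
def commit_datablock(repo, data)
  ref = 'sha1-' + Digest::SHA1.hexdigest(data)
  path = File.join(repo, ref)
  File.binwrite(path, data) unless File.exist?(path)
  ref
end

# Cut a file into 1 MB datablocks, commit them, then commit the
# serialised file object itself; return the file object's hash.
def build_and_commit_file_object(repo, filepath)
  parts = []
  File.open(filepath, 'rb') do |f|
    while (chunk = f.read(BLOCK_SIZE))
      parts << commit_datablock(repo, chunk)
    end
  end
  object = {
    'aion-type'      => 'file',
    'version'        => 1,
    'name'           => File.basename(filepath),
    'size'           => File.size(filepath),
    'hash'           => 'sha1-' + Digest::SHA1.file(filepath).hexdigest,
    'contents-parts' => parts
  }
  commit_datablock(repo, JSON.generate(object))
end
```

Note how the file object ends up in the repository exactly like any chunk of binary data: just another blob named after its own hash.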

Here is the piece of code which builds directory objects (I know, recursion is so beautiful...)

process_location calls build_and_commit_file_object or build_and_commit_directory_object depending on whether the location is a file or a directory.
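In the spirit of that description, a self-contained Ruby sketch of `process_location` and `build_and_commit_directory_object` could look like this. Everything other than those two function names and `build_and_commit_file_object` — the explicit `repo` argument, the `commit_datablock` helper, the sorted traversal — is my assumption, not the original code.

```ruby
require 'digest'
require 'json'

BLOCK_SIZE = 1024 * 1024  # datablocks are at most 1 MB

# Store a blob under its sha1 name; committing twice is a no-op.
def commit_datablock(repo, data)
  ref = 'sha1-' + Digest::SHA1.hexdigest(data)
  path = File.join(repo, ref)
  File.binwrite(path, data) unless File.exist?(path)
  ref
end

# Cut a file into datablocks, commit them, commit the file object.
def build_and_commit_file_object(repo, filepath)
  parts = []
  File.open(filepath, 'rb') do |f|
    while (chunk = f.read(BLOCK_SIZE))
      parts << commit_datablock(repo, chunk)
    end
  end
  commit_datablock(repo, JSON.generate(
    'aion-type'      => 'file',
    'version'        => 1,
    'name'           => File.basename(filepath),
    'size'           => File.size(filepath),
    'hash'           => 'sha1-' + Digest::SHA1.file(filepath).hexdigest,
    'contents-parts' => parts))
end

# Build a directory object by recursing over the directory's entries,
# committing each of them first, then committing the serialised
# directory object itself; return its hash.
def build_and_commit_directory_object(repo, dirpath)
  contents = Dir.children(dirpath).sort.map do |entry|
    process_location(repo, File.join(dirpath, entry))
  end
  commit_datablock(repo, JSON.generate(
    'aion-type' => 'directory',
    'version'   => 1,
    'name'      => File.basename(dirpath),
    'contents'  => contents))
end

# Dispatch on the location's type, as described above.
def process_location(repo, location)
  if File.directory?(location)
    build_and_commit_directory_object(repo, location)
  else
    build_and_commit_file_object(repo, location)
  end
end
```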

At this point, something should light up in your mind :-) Yes, you are right! Starting from the root of the folder I want to archive (namely Galaxy/), I recursively compute directory objects and file objects (that refer to each other); and they are stored along with the files' binary data.

Something else should light up in your mind: if I do the same operation twice, nothing changes in the repository.

At this point, here is the last question: how do I refer to a snapshot? Easy. Every time a snapshot has been made (file objects and directory objects computed from the leaves/files up, and the root object computed and stored), I get a hash (the hash of the root object). I then store this under the current date. For instance, I have a file called snapshot-20140309212319, which contains the string sha1-c516c5a1992282b68086d648fa5ead171d3e7758, and that is all I need to rebuild the snapshot I made yesterday (on 9th March at 21:23).
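That last step is almost trivially small; a sketch of it (the `record_snapshot` helper and the snapshot directory are my own naming; the timestamp format follows the filename above):

```ruby
# Record a snapshot: write the root object's hash into a file named
# after the current timestamp, e.g. snapshot-20140309212319.
def record_snapshot(snapshot_dir, root_hash)
  name = 'snapshot-' + Time.now.strftime('%Y%m%d%H%M%S')
  File.write(File.join(snapshot_dir, name), root_hash)
  name
end
```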

Here is the root (directory) object referenced by sha1-c516c5a1992282b68086d648fa5ead171d3e7758
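Schematically, it has the directory-object shape described above (the six content hashes are placeholders here, not the real values):

```json
{
  "aion-type": "directory",
  "version": 1,
  "name": "Galaxy",
  "contents": [
    "sha1-<hash of entry 1>",
    "sha1-<hash of entry 2>",
    "sha1-<hash of entry 3>",
    "sha1-<hash of entry 4>",
    "sha1-<hash of entry 5>",
    "sha1-<hash of entry 6>"
  ]
}
```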

And now you know that Galaxy is a directory that has 6 elements, but you do not know (yet) whether they are files or folders.

If two snapshots are identical, then they use the exact same data and no data is created during the making of the second snapshot. But if I change a file and make another snapshot, the file object will change as well as the directory objects along the path from the file to the root, but nothing else will change.

While we are at it: if you cold store your snapshot hashes and later on rebuild a snapshot, the fact that the rebuild completes without error (no missing files and no datablocks that fail to match their own hash/name) also proves to you that what you just rebuilt is exactly what you thought you had stored (unless somebody managed to make cheap sha1 collisions). Git has the same property, actually... Also note that there is a version of aion where the datablocks are encrypted (you submit an encryption key when you start a snapshot), but I don't really use that feature. (When it comes to my archives I am more sensitive about integrity than encryption.)
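A rebuild-with-verification along those lines might look like this (a sketch only; `rebuild_file` is a hypothetical helper, not aion's code, and it assumes the flat repository layout used above):

```ruby
require 'digest'
require 'json'

# Rebuild a file from its file object, verifying every datablock against
# its own name and the whole result against the object's "hash" and
# "size" keys. Returns the path of the rebuilt file; raises if anything
# is missing or corrupt.
def rebuild_file(repo, file_object_ref, dest_dir)
  object = JSON.parse(File.binread(File.join(repo, file_object_ref)))
  raise 'not a file object' unless object['aion-type'] == 'file'
  dest = File.join(dest_dir, object['name'])
  File.open(dest, 'wb') do |out|
    object['contents-parts'].each do |ref|
      data = File.binread(File.join(repo, ref))  # raises if the block is missing
      raise "corrupt datablock #{ref}" unless 'sha1-' + Digest::SHA1.hexdigest(data) == ref
      out.write(data)
    end
  end
  raise 'hash mismatch' unless 'sha1-' + Digest::SHA1.file(dest).hexdigest == object['hash']
  raise 'size mismatch' unless File.size(dest) == object['size']
  dest
end
```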

Last but not least: when I start deleting snapshots, which really means deleting files with 40-character strings inside them, how do I garbage collect? Easy: start from the snapshots, recursively look for file objects, and record the hashes of the datablocks that appear inside them. Then go through the datablock repository and delete the ones you no longer need.
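That is a classic mark-and-sweep, and could be sketched as follows (hypothetical helper names, assuming the flat repository layout described above):

```ruby
require 'digest'
require 'json'

# Mark phase: starting from the live snapshot hashes, walk directory and
# file objects recursively, collecting every reachable reference
# (objects and datablocks alike).
def reachable_refs(repo, roots)
  live = {}
  pending = roots.dup
  until pending.empty?
    ref = pending.pop
    next if live[ref]
    live[ref] = true
    data = File.binread(File.join(repo, ref))
    object = JSON.parse(data) rescue nil
    next unless object.is_a?(Hash)  # plain datablock: nothing to follow
    case object['aion-type']
    when 'directory' then pending.concat(object['contents'])
    when 'file'      then pending.concat(object['contents-parts'])
    end
  end
  live
end

# Sweep phase: delete every block in the repository that no live
# snapshot can reach.
def garbage_collect(repo, snapshot_hashes)
  live = reachable_refs(repo, snapshot_hashes)
  Dir.children(repo).each do |name|
    File.delete(File.join(repo, name)) unless live[name]
  end
end
```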
