HGM - distributed maildirs


What?

HGM allows you (me?) to synchronise maildir collections, using git. I use it to synchronise mailboxes on a home machine and my laptop, allowing me to read/write/delete/move email on either machine. I plan to add a third host Real Soon.

Who?

Ben Clifford

Where?

Source code is stored in git. There is a web interface.

Stability

I use it on my main mailbox, with some nasty shell wrappers that I will not release to the world. You probably shouldn't use it.

How?

git is used to store maildirs and to ship the messages between git repositories (which may live on different hosts / storage devices), in much the same way that it is used on source trees.

hgm is a porcelain around git adding maildir-specific handling - a maildir-aware merge strategy and (in the shell script wrappers at the moment) code to keep the checked-out maildirs in sync with the git index without having to add/remove messages manually.
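The "keep the checked-out maildirs in sync with the git index" step can be pictured as a comparison between the paths git already tracks and the message files currently on disk. The following is a minimal Python sketch of that idea only; it is not the unreleased shell wrapper. The function names are hypothetical, and skipping `.git` and `tmp` directories is an assumption made here (maildir `tmp/` holds deliveries that are still in progress).

```python
import os


def list_messages(maildir_root):
    """Return the relative paths of all files under maildir_root.

    Assumption: directories named .git or tmp are never tracked.
    """
    found = set()
    for dirpath, dirnames, filenames in os.walk(maildir_root):
        # Prune directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in (".git", "tmp")]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), maildir_root)
            found.add(rel.replace(os.sep, "/"))
    return found


def plan_index_update(indexed, on_disk):
    """Work out which paths need adding to, or removing from, the index.

    indexed: set of paths currently in the git index
    on_disk: set of paths currently in the checked-out maildir
    """
    to_add = sorted(on_disk - indexed)
    to_remove = sorted(indexed - on_disk)
    return to_add, to_remove
```

With stock git, `git ls-files` lists the indexed paths and `git add --all` stages additions and removals in one go, so a real wrapper may not need to compute the two lists itself.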

limitations

folder depth

race

nasty shell script wrapper

Other people

Other people have had this idea before. Read the thread.

And here are some scripts that have a similar objective


move-merge / maildir-merge for git

A merge optimised for maildir-like scenarios, where files are often moved, new ones added, old ones deleted, but the contents are rarely/never changed. This kind of merge should be able to work pretty much just in the index - we should never be generating new content with this merge method. Instead of being content-based, diffs should work the other way round: for each content hash, see what has happened to the filename of that content (i.e. has it moved, been deleted, or been added) and merge those changes to give the merged result.
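To make the "other way round" idea concrete, here is a minimal Python sketch of a three-way merge keyed by content hash rather than by filename. It is an illustration under stated assumptions, not HGM's implementation: trees are modelled as plain dicts of path to content hash, each content is assumed to appear at most once per tree, and the names `invert` and `move_merge` are hypothetical.

```python
from collections import defaultdict


def invert(tree):
    """Turn {path: content_hash} into {content_hash: path}.

    Assumption for this sketch: a content appears at most once per tree.
    """
    by_hash = {}
    for path, content_hash in tree.items():
        if content_hash in by_hash:
            raise ValueError(f"content {content_hash} appears at more than one path")
        by_hash[content_hash] = path
    return by_hash


def move_merge(base, ours, theirs):
    """Three-way merge of filenames, keyed by content hash.

    Returns (merged_tree, conflicts). No file content is ever created;
    only the name attached to each existing content is decided.
    """
    b, o, t = invert(base), invert(ours), invert(theirs)
    wanted = {}      # content_hash -> path chosen by the merge
    conflicts = []

    for h in sorted(set(b) | set(o) | set(t)):
        # A path of None means "this content is absent from that tree".
        base_path, our_path, their_path = b.get(h), o.get(h), t.get(h)
        if our_path == their_path:
            chosen = our_path          # both sides agree
        elif our_path == base_path:
            chosen = their_path        # only theirs moved/added/deleted it
        elif their_path == base_path:
            chosen = our_path          # only ours moved/added/deleted it
        else:
            # Both sides changed the name, and in different ways.
            conflicts.append(("diverged", h, our_path, their_path))
            continue
        if chosen is not None:
            wanted[h] = chosen

    # Second kind of conflict: different contents claiming the same name.
    claims = defaultdict(list)
    for h, path in wanted.items():
        claims[path].append(h)

    merged = {}
    for path, hashes in claims.items():
        if len(hashes) == 1:
            merged[path] = hashes[0]
        else:
            conflicts.append(("name-clash", path, hashes))
    return merged, conflicts
```

For example, if one side marks a message as seen (a rename) while the other deletes a different message and receives a new one, the sketch merges all three changes without looking inside any file. Anything it cannot decide is reported in `conflicts` rather than guessed at.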

The kinds of conflict that will arise are, I think, firstly when a given piece of content has been renamed to multiple names, and secondly when two different pieces of content want to use the same name. In the former case, maildir-merge can add special-case resolving to deal with some of the cases (e.g. combining flags to generate a new name, which might be a name that does not exist in either of the source trees). It's probably OK for now to bail out if we can't deal with any of the other cases easily, and insist that there be a manual fixup of the branches to be merged to make them mergeable with this method.
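A flag-combining resolver for the first kind of conflict might look roughly like the Python sketch below. It assumes the usual maildir filename layout of `unique:2,FLAGS` with the flag letters kept in ASCII order; the function names are hypothetical, and the resolver deliberately bails out (returns None) whenever the two names differ in anything other than flags.

```python
import posixpath


def split_maildir_name(path):
    """Split 'dir/unique:2,FLAGS' into (dir, unique, set_of_flag_letters)."""
    directory, filename = posixpath.split(path)
    unique, _sep, info = filename.partition(":2,")
    return directory, unique, set(info)


def combine_flags(our_path, their_path):
    """Resolve a double rename of one message by taking the union of flags.

    Returns the combined path, or None if the names differ in more than
    their flags (different folder or different unique part), in which
    case the caller should fall back to a manual fixup.
    """
    our_dir, our_unique, our_flags = split_maildir_name(our_path)
    their_dir, their_unique, their_flags = split_maildir_name(their_path)
    if our_dir != their_dir or our_unique != their_unique:
        return None
    flags = "".join(sorted(our_flags | their_flags))
    return posixpath.join(our_dir, f"{our_unique}:2,{flags}")
```

If one side marked a message seen (`S`) and the other marked it replied (`R`), the result carries both flags, which is a name present in neither source tree. Note that a plain union never consults the base tree, so a flag removed on one side would come back; a fuller resolver would need the base name as well.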

maildir-merge should be implemented on top of move-merge, with the added value being a set of conflict-resolvers that are aware of maildir-style filepaths.

This is how I'm using maildirs at the moment. The purpose is to get a decent distributed mailbox that can be modified 'simultaneously' in different locations without a network connection. Specifically, I have a laptop and an always-net-connected server; usually mail is delivered to the server, but I read mail on a combination of the laptop and the server, depending on my situation. I'd like this to be able to scale up to allow ~10 machines to participate in the pool (for example, one or more work PCs, a home PC, potentially multiple delivery servers). No one particular host should be the master; this means that any host can die and everything else can carry on with little or no reconfiguration.