snaprotate - delete rsync-based snapshots

about

snaprotate is a tool and a configuration language for deleting old directory trees made with rsync.

Every few hours, a script makes a new backup of my systems using rsync, but I don't want to keep every backup forever.

I could have downloaded some tool to handle this for me, but it seemed more fun to build one myself. The result is snaprotate.

The language

My main goal was to make a combinator language for specifying expiry policies. This is embedded within Haskell rather than being a fresh language written from scratch. Here is an example:

#!/usr/bin/env snaprotate

policy = recent <||> weekly <||> monthly

recent = (keepLastWithinDuration twodays) <?> "keep last 48h"
weekly = keepOnePerWeekLast4Weeks <?> "one per week, last month"
monthly = keepOnePerMonth <?> "one per month, forever"

twodays = 2 <*> day
week = 7 <*> day

keepOnePerMonth = keepOneEvery month

keepOnePerWeekLast4Weeks = keepOnePerWeek <&&> keepLast4Weeks

keepOnePerWeek = keepOneEvery week
keepLast4Weeks = keepLastWithinDuration (4 <*> week)

The only required part is the policy definition. This is the policy that snaprotate will evaluate. It can be defined in terms of other definitions within the file (such as keepOnePerMonth above). Policies define which snapshots will be kept, based on their timestamp. Any snapshot which is not kept by the policy will be deleted.

The above policy uses the following snaprotate-specific functions and operators:

keepLastWithinDuration duration
Keeps all snapshots that were made within the specified duration before the time of execution.
keepOnePerTimeFormat format
Puts snapshots into buckets defined by the supplied format string (which is documented in Data.Time.Format) and keeps the oldest snapshot in each bucket.
policyA <||> policyB
Keeps a snapshot if either policyA or policyB keeps that snapshot.
policyA <&&> policyB
Keeps a snapshot if both policyA and policyB keep that snapshot.
policy <?> description
Adds a description to a policy to give more informative output. This does not change which snapshots are kept. Using <?> is optional: without it, an automatically generated explanation is given instead.
day
A duration of one day.
month
A duration of one month.
n <*> duration
A duration n times as long as the supplied duration.
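
To make the combinator idea concrete, here is a minimal sketch of how such operators could be modelled in plain Haskell. It is not snaprotate's actual implementation: the Snapshot and Policy types, the reason strings, the example data and the choice of "%Y-%m" as a bucket format are all assumptions made for illustration, and keepOneEvery is not modelled at all. It needs only the time and containers packages that ship with ghc.

```haskell
import Prelude hiding ((<*>))
import qualified Data.Map as Map
import Data.Time
  ( NominalDiffTime, UTCTime (..), defaultTimeLocale, diffUTCTime
  , formatTime, fromGregorian )

-- Hypothetical model: a snapshot is a name plus the time it was taken.
data Snapshot = Snapshot { snapName :: String, snapTime :: UTCTime }
  deriving (Eq, Show)

-- Hypothetical model: given the current time and every snapshot, a policy
-- returns the snapshots it keeps, each paired with a reason.
type Policy = UTCTime -> [Snapshot] -> [(Snapshot, String)]

-- A duration of one day, in seconds.
day :: NominalDiffTime
day = 86400

-- n <*> duration: a duration n times as long.
(<*>) :: Int -> NominalDiffTime -> NominalDiffTime
n <*> d = fromIntegral n * d

-- Keep every snapshot no older than d at the time of execution.
keepLastWithinDuration :: NominalDiffTime -> Policy
keepLastWithinDuration d now snaps =
  [ (s, "within " ++ show d) | s <- snaps, diffUTCTime now (snapTime s) <= d ]

-- Bucket snapshots by a Data.Time.Format string; keep the oldest per bucket.
keepOnePerTimeFormat :: String -> Policy
keepOnePerTimeFormat fmt _now snaps =
  [ (s, "one per " ++ fmt) | s <- Map.elems oldest ]
  where
    key s = formatTime defaultTimeLocale fmt (snapTime s)
    older a b = if snapTime a <= snapTime b then a else b
    oldest = Map.fromListWith older [ (key s, s) | s <- snaps ]

-- Keep a snapshot if either policy keeps it (the left reason wins).
(<||>) :: Policy -> Policy -> Policy
(p <||> q) now snaps =
  kp ++ [ k | k@(s, _) <- q now snaps, s `notElem` map fst kp ]
  where kp = p now snaps

-- Keep a snapshot only if both policies keep it.
(<&&>) :: Policy -> Policy -> Policy
(p <&&> q) now snaps =
  [ (s, rp ++ " and " ++ rq)
  | (s, rp) <- p now snaps, Just rq <- [lookup s (q now snaps)] ]

-- Replace the generated reasons with a description; the kept set is unchanged.
(<?>) :: Policy -> String -> Policy
(p <?> desc) now snaps = [ (s, desc) | (s, _) <- p now snaps ]

-- Example policy: the last two days, plus the oldest snapshot of each month.
examplePolicy :: Policy
examplePolicy =
  (keepLastWithinDuration (2 <*> day) <?> "keep last 48h")
    <||> (keepOnePerTimeFormat "%Y-%m" <?> "one per month, forever")

-- Made-up sample data: four snapshots taken at midnight UTC in 2010.
exampleSnapshots :: [Snapshot]
exampleSnapshots =
  [ Snapshot name (UTCTime (fromGregorian 2010 m d) 0)
  | (name, m, d) <- [("a", 4, 1), ("b", 4, 20), ("c", 5, 2), ("d", 5, 6)] ]

main :: IO ()
main = mapM_ report (examplePolicy now exampleSnapshots)
  where
    now = UTCTime (fromGregorian 2010 5 7) 0
    report (s, why) = putStrLn ("# keep: " ++ snapName s ++ ": " ++ why)
```

Modelling a policy as a function that returns kept snapshots together with reasons is one way to let <?> change the explanation without touching the kept set. Prelude's own <*> has to be hidden here because the sketch reuses that operator name for durations.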

execution

After putting the snaprotate directory onto $PATH and running make, snaprotate policies can be run directly from the directory where the rsync snapshots live:

$ cd /snap
$ ls
[...]
home-2010-04-16-0709+0000  home-2010-04-29-0709+0000  home-2010-05-06-2309+0000
home-2010-04-17-0709+0000  home-2010-04-30-0709+0000  home-2010-05-07-0709+0000
[...]
$ ~/snapconfig/dildano.snap -b home
snaprotate, Copyright 2010 Ben Clifford benc@hawaga.org.uk
# keep: home-2010-05-05-1509+0000: keep last 48h
# keep: home-2010-05-05-2309+0000: keep last 48h
# keep: home-2010-05-03-0709+0000: one per day, last month
# keep: home-2010-05-04-0709+0000: one per day, last month
# keep: home-2010-05-05-0709+0000: one per day, last month
# keep: home-2010-03-29-2309+0000: one per month, forever
# keep: home-2010-04-01-0709+0000: one per month, forever
# keep: home-2010-05-01-1509+0000: one per month, forever
rm -rf home-2010-05-04-2309+0000
This gives a series of shell comments explaining why each kept snapshot was kept, followed by one or more rm commands to delete expired snapshots. Note that snaprotate doesn't actually run the commands itself; to do that, pipe the output into sh.
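
Since policies work from snapshot timestamps, the directory names have to be turned into times somewhere. The following is a rough sketch of how that could be done with Data.Time, together with output shaped like the transcript above. The name layout (base, then -YYYY-MM-DD-HHMM+ZZZZ) is inferred from the listing, the base argument is assumed to correspond to what is passed with -b, and snapshotTime and render are hypothetical names, not functions from snaprotate itself.

```haskell
import Data.List (stripPrefix)
import Data.Time (UTCTime, defaultTimeLocale, parseTimeM)

-- Parse the timestamp out of a snapshot directory name.
-- Assumed layout: <base>-YYYY-MM-DD-HHMM+ZZZZ, e.g. home-2010-04-16-0709+0000.
-- Returns Nothing for names with another base or a malformed timestamp.
snapshotTime :: String -> String -> Maybe UTCTime
snapshotTime base name = do
  stamp <- stripPrefix (base ++ "-") name
  parseTimeM False defaultTimeLocale "%Y-%m-%d-%H%M%z" stamp

-- Render kept snapshots as shell comments and expired ones as rm commands.
-- Names are emitted unquoted, as in the transcript, so this assumes they
-- contain no spaces or shell metacharacters.
render :: [(String, String)] -> [String] -> [String]
render kept expired =
  [ "# keep: " ++ name ++ ": " ++ why | (name, why) <- kept ]
    ++ [ "rm -rf " ++ name | name <- expired ]

main :: IO ()
main = do
  print (snapshotTime "home" "home-2010-04-16-0709+0000")
  mapM_ putStrLn
    (render [("home-2010-05-05-2309+0000", "keep last 48h")]
            ["home-2010-05-04-2309+0000"])
```

Because only the rm lines are real commands and everything else is a comment, output of this shape can be read as a dry run first and only piped into sh once it looks right.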

download

prerequisites: ghc (install it with your package manager, for example: apt-get / yum / port install ghc)

Then:

git clone git://github.com/benclifford/snaprotate.git

Licence: BSD-like

github

snaprotate has a github project: http://github.com/benclifford/snaprotate

feedback

I welcome any feedback (positive or negative or even just to say that you actually downloaded and used it): benc@hawaga.org.uk