Storing multiple, human-readable versions of BagIt bags

I’ve written some more about Wellcome Collection’s new storage service – this time, one of the implementation details.

Our unit of storage is a “bag”, stored in the BagIt packaging format. A bag is a collection of related files: for example, all the digitised images from a single book. We want to be able to update bags after they’re first stored, and to keep a history of every distinct version of a bag. The BagIt specification doesn’t support versioning of bags, so we had to come up with our own design.

My latest post on our development blog explains how we version bags in a way that keeps them human-readable and easy to understand. Even if you’re not using BagIt, versioning is an evergreen question in data management, and you might find some ideas that apply to your use case.