Notes on A Plumber’s Guide to Git

On Tuesday, I ran my workshop A Plumber’s Guide to Git for the Cambridge Python User Group. I’ve also run it at PyCon UK and in my workplace. On all three occasions, it’s been very popular and I’ve heard people find it useful – on Tuesday, we actually ran out of space in the room! In an attempt to make it more accessible, I’ve written up the entire workshop and posted it on my site.

The aim of the workshop is to understand the underlying concepts of Git. We learn exactly what happens in a typical Git workflow (add, branch, commit, and so on). We look inside the .git directory, and explain what Git stores internally to make those workflows happen.
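As a small taste of what Git stores internally, here's a minimal sketch of how Git names a blob object: it takes the SHA-1 of a short header followed by the file contents. The function name git_blob_hash is made up for this example, and the sketch assumes a repository using the default SHA-1 object format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the object ID Git would assign to a blob with these contents.

    Assumes the default SHA-1 object format.
    """
    # Git hashes "blob <size in bytes>\0" followed by the raw contents.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Compare with: printf 'hello' | git hash-object --stdin
    print(git_blob_hash(b"hello"))
```

The printed ID should match what git hash-object reports for the same bytes, which is a handy way to check your understanding against a real repository.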

If that sounds interesting to you, start reading the introduction, and continue from there. I’ve uploaded everything I use in the workshop – my notes, whiteboard sketches, and exercises. Typically it takes about two hours to complete. Enjoy!

The Hypothesis continuous release process

About a year ago, David built a powerful continuous release system for the hypothesis-python repo. If you push to master with a release note, our CI bumps the version, updates the changelog, tags a new version on GitHub and uploads a new release to PyPI. With no manual intervention.
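To give a feel for the "bumps the version" step, here's a rough sketch of how a release note could drive a version bump. This is not the Hypothesis tooling: the function names and the "RELEASE_TYPE:" first-line convention are assumptions made for this example.

```python
def parse_release_type(release_note: str) -> str:
    """Read the release type from the first line of a release note.

    Assumes (for this sketch) a first line such as "RELEASE_TYPE: patch".
    """
    prefix = "RELEASE_TYPE:"
    lines = release_note.splitlines()
    if not lines or not lines[0].startswith(prefix):
        raise ValueError("release note must start with a RELEASE_TYPE line")
    return lines[0][len(prefix):].strip()


def bump_version(version: str, release_type: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for the given release type."""
    major, minor, patch = (int(part) for part in version.split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown release type: {release_type!r}")


if __name__ == "__main__":
    note = "RELEASE_TYPE: minor\n\nThis release adds a new feature."
    print(bump_version("1.4.2", parse_release_type(note)))
```

In a real pipeline, CI would run something like this only on master, then write the changelog entry, tag the commit, and upload the release.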

This sort of change permanently ruins you. I’m now so used to continuous deployment, I get annoyed when I work on projects that don’t have it (and have thus copied it to several other repos).

If you’re interested, I wrote about the process on the Hypothesis blog – how it works, why we do it, and why we find it so useful.

Keep an overnight bag in the office

For the last few days, there’s been a lot of bad weather across the UK. The Beast from the East has caused snow, subzero temperatures, and problems on the transport network. Many trains have been delayed or cancelled, and hundreds of drivers have been trapped in their cars. The Met Office have even issued rare red snow warnings, meaning possible loss of life.

In such bad weather, the best thing to do is to avoid travelling. If you usually work in an office, it’s better to work from home or skip work entirely – or if you decide to go in, leave plenty of extra travel time. Sometimes you can’t avoid going to work, so here’s a bit of advice:

Keep an overnight bag in your workplace.

It doesn’t have to be large, but enough that you could sleep in or near your workplace if you really had to. For me, the limit is “could I walk home quickly (less than 15 minutes) in foul weather” – if not, I assume I might get stuck at the office one day, and prepare accordingly.

Is that fun? No. But in weeks like this, it may be safer than trying to make the journey home.

The sort of things you might want:

Some of these are things I carry anyway – for example, I usually have paracetamol in my bag – but I like to have a ring-fenced supply just in case. Sometimes, you can venture out for supplies – but depending on the weather and where you work, that may not always be possible or safe. In both cases, it’s much easier to have it pre-packed.

I’ve had a bag like this for years; so far I’ve only used it once. During Storm Doris last year, all the train lines from London to Cambridge were knocked out, and I slept on the floor of my office. (It helps that we have showers!) And although I’ve never used it since, it’s been reassuring to know that it’s there if I need it.

This is the sort of thing you hope you never have to use, but when bad conditions strike, it’s really nice to have it handy.

A working from home experiment

For the last year, I’ve been commuting from Cambridge to London. The office is near King’s Cross station, so on a fast train with no delays, it’s about 90 minutes each way – or about 15 hours of commuting a week. Turns out, that’s quite a lot!

About a month ago, I got home on Thursday evening, and I was exhausted. I’d had severe train delays that week, and I was crying tired. I worked from home on the Friday, and I finally realised that the amount of commuting I was doing was unsustainable for my long-term health. Something had to change.

With agreement from everyone at work, and encouragement from David and Camilla on Twitter, I’ve started working from home for one day a week. I have a nice home office, and my team is set up to support remote development. I’m very lucky – from a practical standpoint, it’s easy for me to work from home.

What are the benefits?

There are several reasons I want to try this.

I’ll spend less time commuting. Just the free time I get back from an extra day at home each week is significant. Over the rest of this year, I’ll get back five days of time. That’s time to do other things – cook a nice dinner, spend time with friends, work on a personal project – that I’m often too tired to do after three hours spent on trains.

It breaks up my week. I’m working from home on either Tuesdays or Thursdays, so I never have to spend more than three consecutive days on a commute. On the off days, I can get some extra sleep, and still start work at the same time.

It’s not as disruptive as moving to a new flat. I’m often asked “Why don’t you move to and/or closer to London?”, which would be another way to reduce my commute. But I’m happy in my current flat, and moving would be a big upheaval compared to a bit of remote working. Maybe I’ll be forced to move to London eventually, but I want to try this first.

I want to try regular remote working. All of my jobs so far have been on-site jobs in offices, but a lot of companies work entirely remotely, with no office. The two lifestyles are very different, and I’ve never tried the latter. This is a way to dip my toe in the water without total immersion, so I can decide if I want to go fully remote in a future job.

Not being in an open office. Like a lot of places, Wellcome has an open office, which I know some people find harder to work in. I don’t usually feel like an open office is a problem for me, so the quiet working space is less of a benefit – but it’s another reason to consider working from home.

What are the risks?

Before I started, I knew this would make some things more difficult. These are the challenges I was most worried about upfront.

The impact on my mental health. When I’ve worked from home before, it’s often been at short notice – for example, when the trains to London aren’t running. These days consistently end with me feeling miserable. I think it’s a combination of the unpleasant surprise, and the loneliness of unexpectedly spending the day on my own.

I’m hoping that if I plan it in advance, I can avoid some of that. Being at home will no longer be an unpleasant surprise, and I can make plans to see people for lunch or in the evening.

Communication with the rest of the team. The rest of my team (currently) spend five days a week in the office, and we have a lot of in-person conversations. It’s easy to keep up with what we’re all doing. When I’m working from home, I’ll lose that visibility and the casual conversations. We’ll need to make more use of tools like email, Slack, and GitHub to stay in touch.

I don’t end up taking the day at home. Right now, I don’t have a fixed day to work from home; I’m choosing it on a week-by-week basis. Because it’s not a regular fixture in my calendar, there’s a chance I may end up skipping it, which defeats the point – I need to be disciplined about booking in that day, and following through with it.

So far

I’ve been doing this for a couple of weeks, and already I feel much better. I’m less tired at the end of the week, and I have less dread about commuting on the days when I do. If you can work from home for a day or so every week, I’d recommend giving it a try.

As with all big changes, this probably came three to six months after I really needed it, but I’m glad I’m doing it now.

Getting every message in an SQS queue

At work, we make heavy use of Amazon SQS message queues. We have a series of small applications which communicate via SQS. Each application reads a message from a queue, does a bit of processing, then pushes it to the next queue. This is a classic microservices pattern.

Three applications, communicating via two message queues.

Sometimes an application fails to process a message correctly, in which case SQS can send the message to a separate dead-letter queue (DLQ). (Our Terraform module for SQS queues automatically creates and configures a DLQ for all our queues.) Sending faulty messages to a DLQ allows you to see them all in one go, rather than trying to spot the failures in your logs.

Unfortunately, the AWS Console doesn’t make it very easy to go through the contents of a queue. You can see one message at a time, but this makes it hard to spot patterns or debug a large number of failures. It would be easier to have the entire queue in a local file, so we can analyse it or process every message at once. I’ve written a Python function to do just that, and in this post, I’ll walk through how it works.
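As a rough idea of the shape such a function can take, here's a minimal sketch (not the function from the post). It takes an SQS client, such as the one returned by boto3.client("sqs"), and uses the receive_message and delete_message calls. The function name is my own, and two behaviours are assumptions of this sketch: it treats an empty response as "the queue is drained", and it deletes each message once the caller moves on to the next one.

```python
def get_all_messages(sqs_client, queue_url):
    """Yield every message on an SQS queue, deleting each one after use.

    `sqs_client` is expected to behave like boto3.client("sqs").
    """
    while True:
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
            WaitTimeSeconds=1,       # long polling, to reduce empty responses
        )

        # The "Messages" key is absent when nothing was received.
        messages = response.get("Messages", [])
        if not messages:
            return

        for message in messages:
            yield message
            # Runs only when the caller asks for the next message, so a
            # message is deleted after the caller has finished with it.
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
```

Because it's a generator, you can write each message body to a local file as it arrives rather than holding the whole queue in memory. If you'd rather leave the messages on the queue, drop the delete_message call, but then the same messages can come back once their visibility timeout expires.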

Listing keys in an S3 bucket with Python, redux

A few months ago, I wrote about some code for listing keys in an S3 bucket. I’ve been running variants of that code in production since then, and found a pair of mistakes in the original version.


Since that post has been fairly popular, I thought it was worth writing a short update. In this post, I’ll walk through the changes I’ve made in the newer versions of the code.
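For context, here's a minimal sketch of paginated key listing. It isn't the code from either post, and the function name is made up. It takes an S3 client, such as boto3.client("s3"), and relies on the list_objects_v2 call along with its ContinuationToken parameter and the Contents, IsTruncated and NextContinuationToken response fields.

```python
def list_s3_keys(s3_client, bucket, prefix=""):
    """Yield every key in an S3 bucket that starts with `prefix`.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    kwargs = {"Bucket": bucket, "Prefix": prefix}

    while True:
        response = s3_client.list_objects_v2(**kwargs)

        # "Contents" is missing entirely when no keys match, so use .get().
        for obj in response.get("Contents", []):
            yield obj["Key"]

        # Each response holds a limited page of keys; keep going while S3
        # says there are more, passing back the token it gave us.
        if not response.get("IsTruncated"):
            return
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

Writing it as a generator means you can start processing keys before the whole bucket has been listed, which matters for buckets with millions of objects.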

IP and DNS addresses for documentation

If you’re writing documentation that includes IP addresses, you may want to check out RFC 5737 and RFC 3849, which specify IPv4 and IPv6 addresses for use in documentation.

These addresses are “reserved”, meaning they should never be used for anything else – not on the public Internet, nor within internal networks. That means you can use them in examples, and they should never conflict or be confused with real systems.

Here’s RFC 5737 for IPv4:

The blocks 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2), and 203.0.113.0/24 (TEST-NET-3) are provided for use in documentation.

and RFC 3849 for IPv6:

The prefix allocated for documentation purposes is 2001:DB8::/32.

In a similar vein, RFC 2606 provides a number of TLDs and domain names for use in documentation – for example, .test and example.com. Again, the idea is that these are reserved for documentation, and will never start resolving to a real system at an unknown point in the future.

Of course, you can use any IP address or DNS name in your docs, but if the exact values are unimportant, you may want to consider using these reserved blocks. They’re good placeholder values, because they can’t be mixed up with anything else.
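If you want to check your docs automatically, here's a minimal sketch using Python's standard ipaddress module. It tests whether an address falls inside the documentation blocks from RFC 5737 (192.0.2.0/24, 198.51.100.0/24, 203.0.113.0/24) or RFC 3849 (2001:db8::/32). The function and constant names are my own.

```python
import ipaddress

# Documentation-only blocks from RFC 5737 (IPv4) and RFC 3849 (IPv6).
DOCUMENTATION_NETWORKS = [
    ipaddress.ip_network(block)
    for block in (
        "192.0.2.0/24",     # TEST-NET-1
        "198.51.100.0/24",  # TEST-NET-2
        "203.0.113.0/24",   # TEST-NET-3
        "2001:db8::/32",    # IPv6 documentation prefix
    )
]


def is_documentation_address(address: str) -> bool:
    """Return True if `address` is inside a reserved documentation block."""
    ip = ipaddress.ip_address(address)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixing both kinds of block in one list is safe.
    return any(ip in network for network in DOCUMENTATION_NETWORKS)


if __name__ == "__main__":
    for candidate in ("192.0.2.10", "2001:db8::1", "8.8.8.8"):
        print(candidate, is_documentation_address(candidate))
```

You could run something like this over the addresses found in a docs folder, and flag any that sit outside the reserved blocks.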

These RFCs have come up several times in the Write The Docs Slack, which is why I decided to create a more permanent signpost. If you care about technical writing, you may want to join the Slack, where this sort of thing is often discussed – sign up through the WTD website.

Your repo should be easy to build, and how

Whenever I look at a new repository, I have a simple smell test: how long does it take me to clone, build and get the code running?

Here, I’m usually counting the steps I have to do, the commands I have to run. The clock time is less important (although fast builds are still nice!). Ideally, there’s a single command which takes me from a fresh checkout to a complete build — and without me having to fiddle with too many dependencies first.

Once I have a working build, I can start fiddling with code and find my own way around. Getting that first build is key.

Making it easy to do a clean build has many benefits.

The obvious one is time saved — I run one command, then I can walk away while the computer does all the slow bits. Downloading dependencies, compiling code, setting up the local environment, that sort of thing. It might take a while to finish, but I don’t need to supervise it while that happens. I can spend that time doing something more useful.

It’s also more reliable. Remembering “make build” is easy. Remembering eight calls to different shell scripts, and their associated arguments, is much harder. If the build is simple, there’s less to get wrong, and it’s more likely I’ll get it right first time. Automating the build process makes it faster and more reliable.

And finally, first impressions count! Being able to start working quickly is a pleasant experience. If writing and testing my first patch is easy, I’m more likely to do it a second time. And a third. And so on. This is particularly important in open source repos where patches often come from people giving up their time for free.

In the last year, I’ve spent a lot of time simplifying my build processes, both in my work and my personal repos. Most of my current repos now have a single-step build. It’s not perfect, but I’m very pleased with the results.

In this post, I’ll explain my typical setup, and how I use Make and Docker to get fast and reliable builds.

Armed police officers don’t make me feel safer

Content warning: discussion of guns, police violence, and images of armed police.

I live in the UK. We have fairly strict laws around owning guns, and you’re unlikely to encounter a gun except on an army base or in a work of fiction.

As a result, I’m fairly sceptical of guns, and I find the sight of them unnerving. I was on holiday in Berlin recently, where most of the police officers on the street were carrying handguns, and I was jumpy whenever I passed them. The same applies when I’m in airports. I feel like this is a healthy reaction.

On occasion, I do encounter armed police officers in the UK, and it’s never pleasant.

I have no idea whether armed police actually make me safer — I’m not a crime expert, and I don’t know what gun crime statistics are in the UK. But seeing a gun is so unusual, that when I see armed police I feel less safe. Being shot at is not something I worry about in day-to-day life, except when I see an armed police officer. (Although I do know that Police Federation surveys continually show resistance to routine arming by police officers, who I’d expect to know.)

And I’m a cis white man, a group that isn’t usually profiled or targeted by police. Other people probably find it much scarier.

A month or so ago, I was waiting for a train at King’s Cross, when two officers carrying large semiautomatic guns suddenly appeared behind me. They walked straight past me, but I was briefly terrified. Another officer came up to me to explain that this was an “awareness campaign” to “make me feel safer”, even though it had the opposite effect. I politely explained this to the officer, who didn’t want to see my point of view.

On that occasion, I got on my train to Cambridge, and left the guns behind. Unfortunately, they’ve followed me home.

Pruning old Git branches

Here’s a quick tip for Git users: if you want to delete every local branch that’s already been merged into master, you can run this command:

$ git branch --merged master | egrep -v "(^\*|master|dev)" | xargs git branch --delete

A quick breakdown:

I originally got the command from a Stack Overflow answer, although I tweaked it when I read the documentation, to more closely match my use case.

If you want to see what branches this will delete without committing to it, run everything before the second pipe — not the xargs bit at the end.
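To make the filtering step concrete, here's a small Python sketch that applies the same pattern as the egrep -v stage to the output of git branch --merged master. The function name and the sample branch names are made up for this example.

```python
import re

# Same pattern as the egrep stage: the current branch (a line starting
# with "*"), or any line containing "master" or "dev".
PROTECTED = re.compile(r"(^\*|master|dev)")


def branches_to_delete(git_branch_output: str) -> list:
    """Return the branch names that would be passed on to xargs."""
    branches = []
    for line in git_branch_output.splitlines():
        if not line.strip():
            continue
        # egrep -v drops every line that matches the pattern.
        if PROTECTED.search(line):
            continue
        branches.append(line.strip())
    return branches


if __name__ == "__main__":
    sample = "* fix-typo\n  dev\n  master\n  new-feature\n  developer-docs\n"
    print(branches_to_delete(sample))
```

Note that the pattern matches "master" and "dev" anywhere in the name, so a branch like developer-docs is also kept. That's a safe default, but worth knowing if a branch you expected to be deleted is still around.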

The other command I often use is this one:

$ git fetch origin --prune

If a branch has been deleted in the origin remote, this deletes your remote-tracking reference to it (such as origin/new-feature). Local branches aren’t touched, even if they were tracking the deleted branch.

For example: suppose you had a branch called new-feature. You push the branch to GitHub, open a pull request, and later the branch gets merged and deleted through the GitHub web interface. When you do your next fetch with --prune, it’ll clean up the stale origin/new-feature reference. Your local new-feature branch stays until you delete it yourself, for example with the first command above.

Git branches are very cheap — usually a single file that references a commit hash — so deleting branches won’t save disk space or improve performance. I like to keep my repos neat and tidy, and not have a long branch list to scroll through, which is why I do this. If a long branch list doesn’t bother you, then you can ignore these commands.
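To see the "single file" part for yourself, here's a minimal sketch that reads a branch's commit hash straight from the .git directory. The function name is made up, and the sketch only handles the loose-file case: Git can also store branches in .git/packed-refs, which this ignores.

```python
from pathlib import Path


def read_branch_commit(repo_path, branch):
    """Return the commit hash a local branch points at, or None.

    Only looks at the loose ref file .git/refs/heads/<branch>; refs that
    Git has moved into .git/packed-refs are not handled by this sketch.
    """
    ref_file = Path(repo_path) / ".git" / "refs" / "heads" / branch
    if not ref_file.is_file():
        return None
    # The file holds the commit hash followed by a newline.
    return ref_file.read_text().strip()


if __name__ == "__main__":
    print(read_branch_commit(".", "master"))
```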