Whenever I look at a new repository, I have a simple smell test: how long does it take me to clone, build and get the code running?
Here, I’m usually counting the steps I have to do, the commands I have to run. The clock time is less important (although fast builds are still nice!). Ideally, there’s a single command which takes me from a fresh checkout to a complete build — and without me having to fiddle with too many dependencies first.
Once I have a working build, I can start fiddling with code and find my own way around. Getting that first build is key.
Making it easy to do a clean build has many benefits.
The obvious one is time saved — I run one command, then I can walk away while the computer does all the slow bits. Downloading dependencies, compiling code, setting up the local environment, that sort of thing. It might take a while to finish, but I don’t need to supervise it while that happens. I can spend that time doing something more useful.
It’s also more reliable. Remembering “make build” is easy. Remembering eight calls to different shell scripts, and their associated arguments, is much harder. If the build is simple, there’s less to get wrong, and it’s more likely I’ll get it right first time. Automating the build process makes it faster and more reliable.
And finally, first impressions count! Being able to start working quickly is a pleasant experience. If writing and testing my first patch is easy, I’m more likely to do it a second time. And a third. And so on. This is particularly important in open source repos where patches often come from people giving up their time for free.
In the last year, I’ve spent a lot of time simplifying my build processes, both in my work and my personal repos. Most of my current repos now have a single-step build. It’s not perfect, but I’m very pleased with the results.
In this post, I’ll explain my typical setup, and how I use Make and Docker to get fast and reliable builds.
Content warning: discussion of guns, police violence, and images of armed police.
I live in the UK. We have fairly strict laws around owning guns, and you’re unlikely to encounter a gun except on an army base or in a work of fiction.
As a result, I’m fairly sceptical of guns, and I find the sight of them unnerving. I was on holiday in Berlin recently, where most of the police officers on the street were carrying handguns, and I was jumpy whenever I passed them. The same applies when I’m in airports. I feel like this is a healthy reaction.
On occasion, I do encounter armed police officers in the UK, and it’s never pleasant.
I have no idea whether armed police actually make me safer — I’m not a crime expert, and I don’t know what gun crime statistics are in the UK. But seeing a gun is so unusual, that when I see armed police I feel less safe. Being shot at is not something I worry about in day-to-day life, except when I see an armed police officer. (Although I do know that Police Federation surveys continually show resistance to routine arming by police officers, who I’d expect to know.)
And I’m a cis white man, a group that isn’t usually profiled or targeted by police. Other people probably find it much scarier.
A month or so ago, I was waiting for a train at King’s Cross, when two officers carrying large semiautomatic guns suddenly appeared behind me. They walked straight past me, but I was briefly terrified. Another officer came up to me to explain that this was an “awareness campaign” to “make me feel safer”, even though it had the opposite effect. I politely explained this to the officer, who didn’t want to see my point of view.
On that occasion, I got on my train to Cambridge, and left the guns behind. Unfortunately, they’ve followed me home.
git branch --merged master gives us a list of branches “whose tips are reachable from the specified commit” — any branch whose final commit has been merged into master. If your main branch has a different name, use that instead of master.
That gets piped to egrep -v, which excludes any lines which match the pattern. In this case, the pattern filters out branches whose name ends in master or dev. You should adapt this for any long-lived branches in your repo.
Finally, any branches which remain are passed via xargs to git branch --delete, which deletes the branch.
I originally got the command from a Stack Overflow answer, although I tweaked it when I read the documentation, to more closely match my use case.
If you want to see what branches this will delete without committing to it, run everything before the second pipe — not the xargs bit at the end.
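Putting the three steps together gives something like the sketch below. The function names (keep_short_lived, merged_branches, delete_merged_branches) are made up for illustration, and the exact pattern is an assumption based on the description above; grep -E -v behaves the same as egrep -v.

```shell
# Drop long-lived branches from a list of branch names read on stdin.
# The pattern is an assumption: it excludes any line ending in "master" or "dev".
keep_short_lived() {
    grep -E -v '(master|dev)$'
}

# Dry run: print the branches that have been merged into the main branch.
# The main branch defaults to "master"; pass another name as the first argument.
merged_branches() {
    git branch --merged "${1:-master}" | keep_short_lived
}

# Delete every branch that the dry run would print.
delete_merged_branches() {
    merged_branches "$@" | xargs git branch --delete
}
```

Run merged_branches first to check the list, then delete_merged_branches once you’re happy with it. It’s safest to do this with your main branch checked out: git branch marks the current branch with a *, and that line is only filtered out when it ends in master or dev.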
The other command I often use is this one:
$ git fetch origin --prune
If a branch has been deleted in the origin remote, Git also deletes your remote-tracking reference to it (origin/<branch>). Your local branch isn’t touched, although git branch -vv will mark its upstream as “gone”.
For example: suppose you had a branch called new-feature. You push the branch to GitHub, open a pull request, and later the branch gets merged and deleted through the GitHub web interface. When you do your next fetch with --prune, it’ll clean up the stale reference origin/new-feature. The local branch new-feature stays until you delete it yourself, for example with git branch --delete.
Git branches are very cheap — usually a single file that references a commit hash — so deleting branches won’t save disk space or improve performance. I like to keep my repos neat and tidy, and not have a long branch list to scroll through, which is why I do this. If a long branch list doesn’t bother you, then you can ignore these commands.
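To see how small a branch is, you can read the file directly. This is a rough sketch: branch_commit is a hypothetical helper, and it assumes the branch is stored as a “loose” ref under .git/refs/heads. Git sometimes packs refs into .git/packed-refs instead, in which case the file won’t exist.

```shell
# Print the commit hash a branch points at by reading its ref file.
#   $1: branch name
#   $2: path to the Git directory (assumed to be ".git" if omitted)
# Only works for loose refs; packed refs live in .git/packed-refs instead.
branch_commit() {
    cat "${2:-.git}/refs/heads/$1"
}
```

For a loose ref, branch_commit new-feature should print the same hash as git rev-parse new-feature: the whole branch is one commit hash and a newline.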
At work, we use Amazon CloudWatch for logging in our applications. All our logs are sent to CloudWatch, and you can browse them in the AWS Console. The web console is fine for one-off use, but if I want to do in-depth analysis of the log, nothing beats a massive log file. I’m very used to tools like grep, awk and tr, and I’m more productive using those than trying to wrangle a web interface.
So I set out to write a Python script to download all of my CloudWatch logs into a single file. The AWS SDKs give you access to CloudWatch logs, so this seems like it should be possible. There are other tools for doing this (for example, I found awslogs after I was done) — but sometimes it can be instructive to reinvent something from scratch.
In this post, I’ll explain how I wrote this script, starting from nothing and showing how I build it up. It’s also a nice chance to illustrate several libraries I use a lot (boto3, docopt and maya). If you just want the code, skip to the end of the post.
Today, the National Museum of Computing (TNMoC) is celebrating the five-year anniversary of their reboot of the Harwell-Dekatron computer, also known as WITCH.
The Harwell-Dekatron was originally built in Harwell in the 1950s, as part of the British nuclear program. It passed through a number of hands, before finally being decommissioned in 1973. Then it went into storage, until it was recovered by TNMoC in 2009. It moved to the museum, was restored by volunteers, rebooted in 2012, and it continues to run there today. The original news story about the reboot has more detail about the machine’s history, and how it ended up at the museum.
This computer isn’t just a static exhibit, but a working display. If you visit the museum, you’ll often see (and hear!) it running. The WITCH has 828 Dekatron tubes, each a gas-filled counting tube that can hold a digit from 0 to 9. It looks like a small tube, with an orange light that rotates as it cycles from 0 to 9, so you can see exactly what value it’s holding, and literally “read” the computer’s inner workings. Dekatrons also make a distinctive clackety clackety noise, and together with the visuals, the running machine is quite an experience.
The WITCH wasn’t a fast machine, even by 1950s standards. Rather than doing quick calculations, it was designed to work slowly, but run very reliably for long periods of time. Jack Howlett, Director of the Computer Laboratory at Harwell, once wrote in a report:
It took little power and could be left unattended for long periods; I think the record was over one Christmas-New Year holiday when it was all by itself, with miles of input data on punched tape to keep it happy, for at least ten days and was still ticking away when we came back.
I was once told a fun story about this Christmas run. The operators wanted to check the machine kept running, but without someone having to be in the room. So they left the phone off the hook, hanging next to the WITCH, and they’d dial in to check how it was doing. If they heard the characteristic clackety-clack, they’d know the machine was still running, and they’d rest easy. Silence, they’d know it had stopped.
I can’t remember where I first heard this story, and I have nothing to back it up. But I find the idea delightful — a machine left to run over Christmas, tracked by an analogue phone and a mechanical clack. Such an ingenious way to do remote monitoring.
When I was in college, I did a bit of work in the college theatre as a backstage technician. Among other things, this meant dealing with sound systems, where I was taught an important rule: don’t tap on the microphone. It’s a common cliche, but rarely a good idea.
Tapping creates a sudden, loud noise in the microphone, which can cause damage to the microphone and/or the speaker that plays it back.[1] If you want to do a sound check, speak or sing as you’ll be using the mic live. It’s a more realistic test, gives you an opportunity to hear what you’ll really sound like, and is more pleasant for anybody listening.
I was reminded of this tonight when reading the speaker guidelines for Nine Worlds, which gives an entirely different reason not to tap the mic:
Please don’t tap the microphone, as the amplified sudden noise can cause pain to D/deaf[2] people present since it will be transmitted directly into their ears.
(In the same vein, you should always use a microphone if one is provided, even if you think you don’t need it. It makes a big difference for anybody with a hearing aid, and for the quality of sound on the recording.)
If you speak at or run events, their guidelines have lots of good advice. As well as how not to abuse your sound equipment, there are suggestions for things like handling your tech and A/V (multiple layers of backup, arrive well in advance); referring to audience members in a gender-neutral way; and providing appropriate content warnings on your talks. I recommend giving them a read.
[1] It’s only some types of mic/speaker that are susceptible to this damage, but I can never remember the difference, and equipment is expensive enough that I don’t want to risk it.
[2] Something else I learnt tonight: there are “small d” and “big D” identities in deaf culture. Based on a quick search, it’s a distinction between the hearing loss, and being in the Deaf community, but deaf people have written about it in more detail, and can explain it better than I can.
Git is a very common tool in modern development workflows. It’s incredibly powerful, and I use it all the time — I can’t remember the last time I used a version control tool that wasn’t Git — but it’s a bit of a black box. How does it actually work?
For a long time, I’ve only had a vague understanding of Git’s inner workings. I think it’s important to understand my tools, because it makes me more confident and effective, so I wanted to learn how Git works under the hood. To that end, I gave a workshop at PyCon UK 2017 about Git internals. Writing the workshop forced me to really understand what was going on.
The session wasn’t videoed, but I do have my notes and exercises. There were four sections, each focusing on a different Git concept. It was a fairly standard format: I did a bit of live demo to show the new ideas, then people would work through the exercises on their own laptops. I wandered around the room, helping people who were stuck, or answering questions, then we’d come together to discuss the exercise. Repeat. On the day, we took about 2½ hours to cover all the material.
If you’re trying to follow along at home, the Git book has a great section on the low-level commands of Git. I made heavy reference to this when I wrote the notes and exercises.
Another week, another disappointing survey that asks “What is your gender? Female/Male.”
This may be old news to people who read my blog, but if not: gender isn’t a binary. There are plenty of people who identify as non-binary or agender or have some other gender identity that doesn’t fit neatly into one of those two buckets. If you need to ask about gender (and really, do you need to know?), you should be looking beyond offering binary choices.
At a minimum, I think a survey should offer choices for folks who don’t fit the typical F/M binary, and folks who don’t want to tell you. In most cases, you don’t absolutely need to know gender, and you should allow people not to tell you.
This is my current favourite set of choices:
Prefer to self-describe (with a free text field)
Prefer not to say
I find the phrase “prefer to self-describe” is less impersonal than “other”, which is often used for the third field. It’s also easier than trying to come up with a cover-all label for “not in female/male”. There’s a bit more work in normalising the free text responses, but I think it’s worth the effort.
I also like having an explicit “prefer not to say” choice, even if it’s not a required question on the survey. It’s good to be absolutely clear that this is an optional question.
This is far from the only way to ask this question — a Google search will turn up lots of advice for asking about gender, and lots of alternative wordings. Use mine, use somebody else’s, or make up your own — just please don’t fall back to “Female/Male”.
When I go to tech conferences, I’m often drawn to the non-technical talks. Talks about diversity, or management, or culture. So when it came to make a proposal for this year’s PyCon UK, I wanted to see if I could write my own non-technical talk.
Talking about diversity and inclusion can be tricky. It’s easy to be well-intentioned, but end up saying something that’s harmful or offensive. But it’s an important topic — the tech industry has systemic problems with inclusion, and recent news shows us how far we still have to go. I chose it for both those reasons — in part because it’s an important topic, and in part to challenge myself by speaking about a topic I hadn’t tackled before.
This is a talk about privilege. It’s about how we, as people of privilege in the tech industry, can do more to build cultures that are genuinely inclusive.
I first gave this talk at PyCon UK 2017. You can read the slides and notes on this page, or download the slides as a PDF. The notes are a rough approximation of what I planned to say, written after the conference finished. My spoken and written voice are quite different, but it gets the general gist across.
If you’d prefer, you can watch the conference video on YouTube:
A constant highlight of PyCon UK is the lightning talks session. A lightning talk is a talk of up to five minutes, on any topic that might be of interest to the PyCon UK audience. There are usually ten talks in an hour-long session, with a bit of time for handover between speakers, and there are four sessions (one per day) during the conference. Videos of past sessions are on YouTube, including from just this Thursday!
Lightning talks are always fun because you get a wide variety of topics in a short space of time — already this year we’ve heard about mutation testing, dynamic tracing, and chocolate brownies! And it’s a great way for somebody who’s never spoken before to get up on stage. The audience is always friendly, five minutes is enough to say something interesting, and you’re talking about a topic you’re enthusiastic about.
In years gone by, you’d sign up for a lightning talk by writing your name on a flipchart: first-come, first-served. The simplicity was great, but it tipped in favour of people who knew the system — it gave you a head-start compared to a new attendee. And if you hemmed and hawed over whether you wanted to speak, all the slots would be filled up before you’d made a decision.
I’m a big fan of the way the talk selection has been balanced out this year. Thanks to the efforts of Owen, Tim and Vince, the conference now has a lottery system instead.