At work, we use Amazon CloudWatch for logging in our applications. All our logs are sent to CloudWatch, and you can browse them in the AWS Console. The web console is fine for one-off use, but if I want to do in-depth analysis of the logs, nothing beats a massive log file. I’m very used to tools like grep, awk and tr, and I’m more productive with those than trying to wrangle a web interface.
So I set out to write a Python script to download all of my CloudWatch logs into a single file. The AWS SDKs give you access to CloudWatch logs, so this seems like it should be possible. There are other tools for doing this (for example, I found awslogs after I was done) — but sometimes it can be instructive to reinvent something from scratch.
In this post, I’ll explain how I wrote this script, starting from nothing and showing how I build it up. It’s also a nice chance to illustrate several libraries I use a lot (boto3, docopt and maya). If you just want the code, skip to the end of the post.
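To sketch the core idea before the full script: the boto3 CloudWatch Logs client has a paginator for `filter_log_events`, which pages through every event in a log group. The function name and the fake log group name below are my own placeholders, not from the original script — this is a minimal sketch, assuming boto3 and AWS credentials are set up:

```python
def get_log_events(client, log_group):
    """Yield every message in a CloudWatch log group, one page at a time."""
    paginator = client.get_paginator("filter_log_events")
    for page in paginator.paginate(logGroupName=log_group):
        for event in page["events"]:
            yield event["message"]

# Usage (requires boto3 and configured AWS credentials):
#
#     import boto3
#     client = boto3.client("logs")
#     for message in get_log_events(client, "my-log-group"):
#         print(message)
```

Passing the client in as a parameter keeps the function easy to test without touching AWS.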
Today, the National Museum of Computing (TNMoC) is celebrating the five-year anniversary of their reboot of the Harwell-Dekatron computer, also known as WITCH.
The Harwell-Dekatron was originally built in Harwell in the 1950s, as part of the British nuclear program. It passed through a number of hands, before finally being decommissioned in 1973. Then it went into storage, until it was recovered by TNMoC in 2009. It moved to the museum, was restored by volunteers, rebooted in 2012, and it continues to run there today. The original news story about the reboot has more detail about the machine’s history, and how it ended up at the museum.
This computer isn’t just a static exhibit, but a working display. If you visit the museum, you’ll often see (and hear!) it running. The WITCH is powered by 828 dekatron tubes — gas-filled counting tubes that can each hold a digit from 0 to 9. A dekatron looks like a small tube, with an orange glow that rotates as it counts, so you can see exactly what value it’s holding, and literally “read” the computer’s inner workings. Dekatrons also make a distinctive clackety clackety noise, and together with the visuals, the running machine is quite an experience.
The WITCH wasn’t a fast machine, even by 1950s standards. Rather than doing quick calculations, it was designed to work slowly, but run very reliably for long periods of time. Jack Howlett, Director of the Computer Laboratory at Harwell, once wrote in a report:
It took little power and could be left unattended for long periods; I think the record was over one Christmas-New Year holiday when it was all by itself, with miles of input data on punched tape to keep it happy, for at least ten days and was still ticking away when we came back.
I was once told a fun story about this Christmas run. The operators wanted to check the machine kept running, but without someone having to be in the room. So they left the phone off the hook, hanging next to the WITCH, and they’d dial in to check how it was doing. If they heard the characteristic clackety-clack, they’d know the machine was still running, and they’d rest easy. If they heard silence, they’d know it had stopped.
I can’t remember where I first heard this story, and I have nothing to back it up. But I find the idea delightful — a machine left to run over Christmas, tracked by an analogue phone and a mechanical clack. Such an ingenious way to do remote monitoring.
When I was in college, I did a bit of work in the college theatre as a backstage technician. Among other things, this meant dealing with sound systems, where I was taught an important rule: don’t tap on the microphone. It’s a common cliché, but rarely a good idea.
Tapping creates a sudden, loud noise in the microphone, which can cause damage to the microphone and/or the speaker that plays it back.1 If you want to do a sound check, speak or sing as you’ll be using the mic live. It’s a more realistic test, gives you an opportunity to hear what you’ll really sound like, and is more pleasant for anybody listening.
I was reminded of this tonight when reading the speaker guidelines for Nine Worlds, which give an entirely different reason not to tap the mic:
Please don’t tap the microphone, as the amplified sudden noise can cause pain to D/deaf2 people present since it will be transmitted directly into their ears.
(In the same vein, you should always use a microphone if one is provided, even if you think you don’t need it. It makes a big difference for anybody with a hearing aid, and for the quality of sound on the recording.)
If you speak at or run events, their guidelines have lots of good advice. As well as how not to abuse your sound equipment, there are suggestions for things like handling your tech and A/V (multiple layers of backup, arrive well in advance); referring to audience members in a gender-neutral way; and providing appropriate content warnings on your talks. I recommend giving them a read.
It’s only some types of mic/speaker that are susceptible to this damage, but I can never remember the difference, and equipment is expensive enough that I don’t want to risk it. ↩︎
Something else I learnt tonight: there are “small d” and “big D” identities in deaf culture. Based on a quick search, it’s a distinction between the hearing loss, and being in the Deaf community — but deaf people have written about it more detail, and can explain it better than I can. ↩︎
Git is a very common tool in modern development workflows. It’s incredibly powerful, and I use it all the time — I can’t remember the last time I used a version control tool that wasn’t Git — but it’s a bit of a black box. How does it actually work?
For a long time, I’ve only had a vague understanding of Git’s inner workings. I think it’s important to understand my tools, because it makes me more confident and effective, so I wanted to learn how Git works under the hood. To that end, I gave a workshop at PyCon UK 2017 about Git internals. Writing the workshop forced me to really understand what was going on.
The session wasn’t videoed, but I do have my notes and exercises. There were four sections, each focusing on a different Git concept. It was a fairly standard format: I did a bit of live demo to show the new ideas, then people would work through the exercises on their own laptop. I wandered around the room, helping people who were stuck, or answering questions, then we’d come together to discuss the exercise. Repeat. On the day, we took about 2 ½ hours to cover all the material.
If you’re trying to follow along at home, the Git book has a great section on the low-level commands of Git. I made heavy reference to this when I wrote the notes and exercises.
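To give a flavour of those low-level commands, here’s a minimal sketch in a throwaway repository (not part of the workshop exercises themselves): Git’s object database stores content as blobs, addressed by hash, and the plumbing commands let you read and write it directly.

```shell
# Create a throwaway repository to experiment in
cd "$(mktemp -d)"
git init --quiet .

# hash-object writes content into Git's object database and prints its ID
blob_id=$(echo 'hello world' | git hash-object -w --stdin)
echo "$blob_id"

# cat-file reads objects back: -t shows the type, -p the contents
git cat-file -t "$blob_id"
git cat-file -p "$blob_id"
```

Everything else in Git — trees, commits, branches — is built on top of this simple content-addressed store.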
Another week, another disappointing survey that asks “What is your gender? Female/Male.”
This may be old news to people who read my blog, but if not: gender isn’t a binary. There are plenty of people who identify as non-binary or agender or have some other gender identity that doesn’t fit neatly into one of those two buckets. If you need to ask about gender (and really, do you need to know?), you should be looking beyond offering binary choices.
At a minimum, I think a survey should offer choices for folks who don’t fit the typical F/M binary, and folks who don’t want to tell you. In most cases, you don’t absolutely need to know gender, and you should allow people not to tell you.
This is my current favourite set of choices:

- Female
- Male
- Prefer to self-describe (with a free text field)
- Prefer not to say
I find the phrase “prefer to self-describe” is less impersonal than “other”, which is often used for the third field. It’s also easier than trying to come up with a cover-all label for “not in female/male”. There’s a bit more work in normalising the free text responses, but I think it’s worth the effort.
I also like having an explicit “prefer not to say” choice, even if it’s not a required question on the survey. It’s good to be absolutely clear that this is an optional question.
This is far from the only way to ask this question — a Google search will turn up lots of advice for asking about gender, and lots of alternative wordings. Use mine, use somebody else’s, or make up your own — just please don’t fall back to “Female/Male”.
When I go to tech conferences, I’m often drawn to the non-technical talks. Talks about diversity, or management, or culture. So when it came to make a proposal for this year’s PyCon UK, I wanted to see if I could write my own non-technical talk.
Talking about diversity and inclusion can be tricky. It’s easy to be well-intentioned, but end up saying something that’s harmful or offensive. But it’s an important topic — the tech industry has systemic problems with inclusion, and recent news shows us how far we still have to go. I chose it for both those reasons — in part because it’s an important topic, and in part to challenge myself by speaking about a topic I hadn’t tackled before.
This is a talk about privilege. It’s about how we, as people of privilege in the tech industry, can do more to build cultures that are genuinely inclusive.
I first gave this talk at PyCon UK 2017. You can read the slides and notes on this page, or download the slides as a PDF. The notes are a rough approximation of what I planned to say, written after the conference finished. My spoken and written voice are quite different, but it gets the general gist across.
If you’d prefer, you can watch the conference video on YouTube:
A constant highlight of PyCon UK is the lightning talks session. A lightning talk is a talk of up to five minutes, on any topic that might be of interest to the PyCon UK audience. There are usually ten talks in an hour-long session, with a bit of time for handover between speakers, and there are four sessions (one per day) during the conference. Videos of past sessions are on YouTube, including from just this Thursday!
Lightning talks are always fun because you get a wide variety of topics in a short space of time — already this year we’ve heard about mutation testing, dynamic tracing, and chocolate brownies! And it’s a great way for somebody who’s never spoken before to get up on stage. The audience is always friendly, five minutes is enough to say something interesting, and you’re talking about a topic you’re enthusiastic about.
In years gone by, you’d sign up for a lightning talk by writing your name on a flipchart: first-come, first-served. The simplicity was great, but it tipped in favour of people who knew the system — it gave you a head-start compared to a new attendee. And if you hemmed and hawed over whether you wanted to speak, all the slots would be filled up before you’d made a decision.
I’m a big fan of the way the talk selection has been balanced out this year. Thanks to the efforts of Owen, Tim and Vince, the conference now has a lottery system instead.
Every so often, I want to use a tweet in some slides I’m making (I have three in my PyCon UK slides for Friday). If I’m doing this, I want to make it clear that the text I’m using is a tweet, not just a generic quote. Tweets have quite a distinct visual style, and give a very clear way to find the original author.
Twitter gives you an “Embed Tweet” button for using on web pages, but I’m not sure if you can use this in Keynote or PowerPoint — and given it has to make a network call to display the tweet properly, do you want to rely on it in a presentation?
Screenshots are better, but still not ideal — you lose the text, so your presentation becomes less accessible. You can also end up with fuzzy text if you resize the image or take a small screenshot.
Far better to recreate the tweet as a static image with your app’s drawing tools, which is exactly what I do in Keynote. Then the text is directly embedded (more accessible), and it always looks nice and crisp. This is what the effect looks like, with a single tweet per slide (more than one gets distracting):
It’s on the small side for text on a slide, but I’ve found it to work well if deployed sparingly.
If you’d like to use these templates, I’ve uploaded the Keynote file that has both these slides, and templates for creating more. It will probably work in PowerPoint, although I don’t have a copy of PowerPoint to test with.
And depending on the API, I may want even more checks or logging. For example, some APIs always return an HTTP 200 OK, but embed the real response code in a JSON body. Or maybe I want to log the URL I requested.
If I’m making lots of calls to the same API, repeating this code gets quite tedious. Previously I would have wrapped requests.get in a helper function, but that relies on me remembering to use the wrapper.
It turns out there’s a better way — today I learnt that requests has a hook mechanism that allows you to provide functions that are called after every response. In this post, I’ll show you some simple examples of hooks that I’m already using to clean up my code.
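As a minimal sketch of the mechanism: you register a function under the `"response"` hook, and requests calls it with each response. The `check_for_errors` function here is my own illustration, not one of the hooks from the post:

```python
import requests

def check_for_errors(response, *args, **kwargs):
    """A response hook: raise if the server returned a 4xx/5xx status."""
    response.raise_for_status()

# Attach the hook to a session so it runs after every response
session = requests.Session()
session.hooks["response"].append(check_for_errors)

# session.get("https://example.com/api")  # any error status now raises
```

Because the hook lives on the session, every request made through it gets the check for free — no wrapper function to remember.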