Tiling the plane with Pillow

On a recent yak-shaving exercise, I’ve been playing with Pillow, an imaging library for Python. I’ve been creating some simple graphics: a task for which I usually use PGF or TikZ, but those both require LaTeX. In this case, I didn’t have a pre-existing LaTeX installation, so I took the opportunity to try Pillow, which is just a single pip install.

Along the way, I had to create a regular tiling with Pillow. In mathematics, a tiling is any arrangement of shapes that completely covers the 2D plane (a flat canvas), without leaving any gaps or overlaps. A regular tiling is one in which every shape is the same regular polygon – that is, a polygon in which every angle is equal, and every side has the same length.

There are just three regular tilings of the plane: with squares, equilateral triangles, and regular hexagons. Here’s what they look like, courtesy of Wikipedia:

In this post, I’ll explain how I reproduced this effect with Pillow. This is a stepping stone for something bigger, which I’ll write about in a separate post.
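
As a small aside on why there are only three regular tilings: at every corner, some whole number of identical polygons has to meet, and their interior angles must add up to exactly 360°. A regular n-gon has an interior angle of 180(n − 2)/n degrees, so the number of polygons meeting at a corner is 2n/(n − 2), which must be a whole number. Here’s a rough sketch of that check; the function name is made up for this illustration and isn’t taken from the downloadable script.

```python
def regular_tiling_polygons(max_sides=50):
    """Return the side counts n for which regular n-gons can tile the plane."""
    result = []
    for n in range(3, max_sides + 1):
        # Interior angle is 180 * (n - 2) / n degrees, so the number of
        # polygons meeting at a vertex is 360 / angle = 2n / (n - 2).
        # A tiling is only possible when that count is a whole number.
        if (2 * n) % (n - 2) == 0:
            result.append(n)
    return result


print(regular_tiling_polygons())  # [3, 4, 6]
```

Only triangles (six at each corner), squares (four) and hexagons (three) pass the test, however high you set max_sides.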

If you just want the code, it’s all in a script you can download.

Continue reading →

Why I use py.test

A question came up in Slack at work recently: “What’s your favorite/recommended unit test framework [for Python]?” I gave a brief recommendation at the time, but I thought it’d be worth writing up my opinions properly.

In Python, the standard library has a module for writing unit tests – the aptly-named unittest – but I tend to eschew this in favour of py.test. There are a few reasons I like py.test: my tests tend to be cleaner, have less boilerplate, and I get better test results. If you aren’t using py.test already, maybe I can persuade you to start.
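
To make the “less boilerplate” point concrete, here’s a minimal sketch of the same test written both ways. The add function is a hypothetical stand-in for whatever code you’re testing; it isn’t from a real project.

```python
import unittest


def add(x, y):
    """Hypothetical function under test."""
    return x + y


# unittest style: a TestCase subclass and a dedicated assertion method
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)


# py.test style: a plain function and a plain assert statement
def test_add():
    assert add(2, 3) == 5
```

Saved in a file whose name starts with test_, both versions should be collected by py.test, since it can also run unittest-style test cases.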

I’m assuming you’re already somewhat familiar with the idea of unit testing. If not, I’d recommend Ned Batchelder’s talk Getting Started Testing and Eevee’s post Testing, for people who hate testing.

So, why do I prefer py.test?

Continue reading →

A shell alias for tallying data

Here’s a tiny shell alias that I find useful when going through data on the command line.

Suppose I have a big collection of data, and I’d like to know which items occur most frequently: I want to build a tally. I have this shell alias defined that lets me build such a tally:

alias tally='sort | uniq -c | sort'

Here’s an example of the sort of output returned by piping to tally, a nice tabular format:

$ cat colors.txt | tally
   8 yellow
  45 red
  68 green
 100 blue

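(Note: on some Linuxes, sort compares lines as text rather than as numbers, so counts with different numbers of digits can come out in the wrong order. If that happens, replace the second sort with sort -h, or the more widely supported sort -n, to get a tally that sorts numerically.)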

If you want to get the most common items from a tally, that’s just another pipe: send the output from tally to tail -n 5, replacing 5 with the number of most common items you’d like to see.

Another example: let’s see the five most common HTTP status codes in my Apache log. I read the entire log, use awk to extract the status code, and then pass the output to tally:

$ cat access.log | awk '{print $9}' | tally | tail -n5
  15804 302
  31955 204
  39115 301
  88825 404
 952709 200

This is one of the simplest aliases in my shell config, but I still like having it around. Anything that saves me a bit of typing and thinking is usually worthwhile.
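
If you’re already inside a Python script, roughly the same tally can be built with collections.Counter from the standard library. This is only a sketch of the idea, not a replacement for the alias; the tally function and its top parameter are names I’ve chosen for illustration.

```python
from collections import Counter


def tally(lines, top=None):
    """Count how often each line occurs, least common first.

    If `top` is given, keep only that many of the most common items,
    much like piping the shell alias into `tail -n`.
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    # most_common() returns (item, count) pairs in descending order of
    # count; reverse it so the biggest counts come last, as in the alias.
    return list(reversed(counts.most_common(top)))


colors = ["red", "blue", "red", "green", "blue", "red"]
for item, count in tally(colors):
    print(f"{count:4d} {item}")
```

Passing an open file works too, because iterating over a file yields its lines.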

My travelling tech bag

I have a small bag I carry whenever I’m travelling and taking my laptop or phone with me. It includes all the adapters and power cables I usually expect to need. The idea is that I could pick it up at any time, and have it be ready to go. I don’t have to faff around finding parts if I’m in a hurry.

I got a few questions about this at PyCon last week, so I thought I’d make a quick list of what it currently contains. Not everybody needs everything in this bag, but it’s worth thinking about how much (or little!) you could carry and always have what you need.

This is what my bag looks like, straight after PyCon:

A photograph of my tech bag. A rectangular pouch with two compartments, stuffed with electronics equipment.

Continue reading →

aspell, a command-line spell checker

At this month’s WriteTheDocs London, there was a discussion of “docs-as-code”. This is the idea of using plain-text formats for your documentation, and storing it alongside your code — as opposed to using a wiki or another proprietary format. This allows you to use the same tools for code and for docs: version control, code review, text editors, and so on. By making it easier to move between the two, it’s more likely that docs will be written and updated with code changes.

But one problem is that text editors for programmers tend to disable spellcheck. This is sensible for code: program code bears little resemblance to prose, and the spellcheck would be too noisy to be helpful. But what about writing prose? Where are the red and green squiggles to warn you of spelling mistakes?

To plug the gap, I’m a fan of the command-line spellchecker aspell.

Continue reading →

Silence is golden

As I write this, it’s the last day of PyCon UK. The air is buzzing with the sound of sprints and productivity. I’ll write a blog post about everything that happened at PyCon later (spoiler: I’ve had a great time), but right now I’d like to write about one specific feature – an idea I’d love to see at every conference. I’ve already talked about live captioning – now let’s talk about quiet rooms.

I’m an introvert. Don’t get me wrong: I enjoy socialising at conferences and meetups. I get to meet new people, or put faces to names I’ve seen online. Everybody I’ve met this week has been lovely and nice, but there’s still a limit to how much socialising I can do. Being in social situations is quite draining, and a full day of conference is more than I can manage in one go. At some point, I need to step back and recharge.

I don’t think this is atypical in the tech/geek communities.

So I’ve been incredibly grateful that the conference provides a quiet room. It’s exactly what the name suggests – a space set aside for quiet working and sitting. Whenever I’ve been feeling a bit overwhelmed by the bustle of the main conference, I can step into the quiet room. Some clear head space helps me through the day.

PyCon was held in Cardiff City Hall, and the designated quiet room was the Council Chamber. It’s a really nice and large space:

The council chamber at Cardiff City Hall

If there hadn’t been a quiet room, I’d have worn out much faster and probably been miserable towards the end of the conference. It made a big difference to my experience. I think it’s a great feature, and I’ll be looking for it at the next conference I attend.

Live captioning at conferences

This weekend, I’ve been attending PyCon UK in Cardiff. This is my first time at a PyCon (or indeed, at any tech conference), and one nice surprise has been the live captioning of the talks.

At the front of the main room, there are two speech-to-text reporters transcribing the talk in real-time. Their transcription is shown as live, scrolling text on several large screens throughout the room, and shows up within seconds of the speaker finishing a word.

Here’s what one of those screens looks like:

Photo by @drvinceknight on Twitter. Used with permission.

I’m an able-bodied person. I appreciate the potential value of live captioning for people with hearing difficulties – but my hearing is fine. I wasn’t expecting to use the transcription.

Turns out – live captioning is really useful, even if you can already hear what the speaker is saying!

Maintaining complete focus for a long time is remarkably hard. Inevitably, my focus slips, and I miss something the speaker says – a momentary distraction, my attention wanders, or somebody coughs at the wrong moment. Without the transcript, I have to fill in the blank myself, and there are a few seconds of confusion before I get back into the talk. With the transcript, I can see what I missed. I can jump straight back in, without losing my place. I’ve come to rely on the transcript, and I miss it when I’m in talks without it. (Unfortunately, live captioning is only in one of the three rooms running talks.)

And I’m sure I wasn’t the only person who found them helpful. I saw and heard comments from lots of other people about the value of the live captioning, and it was great for them to get a call-out in Saturday’s opening remarks. This might be pitched as an accessibility feature, but it can help everybody.

If you’re running a conference (tech or otherwise), I would strongly recommend providing this service.

Python snippet: dealing with query strings in URLs

I spend a lot of time dealing with URLs: in particular, with URL query strings. The query string is the set of key-value pairs that comes after the question mark in a URL. For example:


Typically I want to do one of two things: get the value(s) associated with a particular key, or create a new URL with a different key-value pair.

This is possible with the Python standard library’s urllib.parse module, but it’s a bit fiddly and requires chaining several functions together. Since I do this fairly often, I have a pair of helper functions that I copy-and-paste into new projects when I need to do this. And since it’s fairly generic, I thought it might be worth sharing more widely.
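
To give a flavour of the chaining involved, here’s a minimal sketch of what such a pair of helpers might look like. The function names and the example.org URL are assumptions made up for this illustration, and the real helpers may well differ; only the urllib.parse calls are standard.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def get_query_field(url, field):
    """Return the list of values for `field` in the URL's query string.

    Returns an empty list if the key isn't present.
    """
    query = urlsplit(url).query
    return parse_qs(query).get(field, [])


def set_query_field(url, field, value):
    """Return a copy of `url` with `field` set to the single value `value`."""
    parts = urlsplit(url)
    # keep_blank_values stops keys with empty values from being dropped
    params = parse_qs(parts.query, keep_blank_values=True)
    params[field] = [value]
    # doseq=True expands each list of values back into key=value pairs
    new_query = urlencode(params, doseq=True)
    return urlunsplit(parts._replace(query=new_query))


url = "https://example.org/search?q=pillow&page=2"
print(get_query_field(url, "page"))       # ['2']
print(set_query_field(url, "page", "3"))  # https://example.org/search?q=pillow&page=3
```

A key can appear more than once in a query string, which is why the getter returns a list rather than a single value.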

Continue reading →

Python snippet: Is a URL from a Tumblr post?

I’ve been writing some code recently that takes a URL, and performs some special actions if that URL is a Tumblr post. The problem is working out whether a given URL points to Tumblr.

Most Tumblrs use a consistent naming scheme: username.tumblr.com, so I can detect them with a regular expression. But some Tumblrs use custom URLs, and mask their underlying platform: for example, http://travelingcolors.net or http://wordstuck.co.vu. Unfortunately, I encounter enough of these that I can’t just hard-code them, and I really should handle them properly.

So how can I know if an arbitrary URL belongs to Tumblr?

I’ve had to do this a couple of times now, so I thought it was worth writing up what to do – partly for my future reference, partly in case anybody else finds it useful.

In the HTTP headers on a Tumblr page, there are a couple of “X-Tumblr” headers. These are custom headers, defined by Tumblr – they aren’t part of the official HTTP spec. They aren’t documented anywhere, but it’s clear who’s sending them, and I’d be quite surprised to see another site send them. For my purposes, this is a sufficiently reliable indicator.

So this is the function I use to detect Tumblr URLs:

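try:
    from urllib.parse import urlparse
except ImportError:  # Python 2
    from urlparse import urlparse

import requests

def is_tumblr_url(url):
    # Fast path: the standard username.tumblr.com naming scheme
    if urlparse(url).netloc.endswith('.tumblr.com'):
        return True
    # Otherwise, look for Tumblr's custom X-Tumblr-* response headers.
    # Header names are case-insensitive, so compare in lowercase.
    req = requests.head(url)
    return any(h.lower().startswith('x-tumblr') for h in req.headers)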

It’s by no means perfect, but it’s a step up from basic string matching, and accurate and fast enough that I can usually get by.

Python snippets: Cleaning up empty/nearly empty directories

Last month, I wrote about some tools I’d been using to clear disk space on my Mac. I’ve been continuing to clean up my mess of files and folders as I try to simplify my hard drive, and there are two new scripts I’ve been using to help me. Neither is particularly complicated, but I thought they were worth writing up properly.

Depending on how messy your disk is, these may or may not be useful to you – but they’ve saved a lot of time for me.

Of course, you should always be very careful of code that deletes or rearranges files on your behalf, and make sure you have good backups before you start.
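
As a rough idea of the general approach (not the scripts themselves), here’s a sketch that only finds empty directories and leaves the deleting to you. The function name is my own for this illustration, and it takes “empty” literally: a directory containing no files and no subdirectories.

```python
import os


def find_empty_dirs(root):
    """Yield directories under `root` that contain no files or subdirectories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists the entries of each directory it visits; a
        # directory with neither subdirectories nor files is empty.
        if not dirnames and not filenames:
            yield dirpath


for path in find_empty_dirs("."):
    print(path)
```

Printing the paths first, and checking them by eye before removing anything, is a cheap safeguard.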

Continue reading →
