A coworker was reviewing one of my pull requests yesterday, when she pointed at the screen and asked, “Is that meant to be there?” Somehow, an SSH private key had appeared in the diff!
The key in question gave push access to our main repository (it was configured as a deploy key with write access). It was only for that repository, and we’d be able to restore everything from local clones; it would just be a massive faff.
I rushed back to my desk to revoke the key and work out what had happened, and I was pleasantly surprised by an email from GitHub (emphasis mine):
We noticed that a valid SSH private key of yours was committed to a public GitHub repository. This key is configured as a deploy key for the wellcometrust/platform repository. Publicly disclosing a valid SSH private key would allow other people to interact with this repository, potentially altering data.
As a precautionary measure, we have unverified the SSH key. You should generate a new SSH key and add it to the repository.
Comparing timestamps, this email was sent almost as soon as the commit landed. Attempting to push to the repo using the leaked SSH key would fail. Even if we’d missed the diff, we were still protected against malicious commits.
I’ve heard stories of Amazon scanning public GitHub repos for leaked AWS credentials, and proactively revoking them, but I never thought something like that would happen to me.
Thanks for protecting us, GitHub!
How did this happen?
So how did an SSH private key end up in the commit history?
Nobody had checked it in, accidentally or on purpose. None of us even had a local copy!
No, our SSH key was leaked by one of our build jobs in Travis CI.
To keep our code tidy, we run a number of autoformatters against our repository (scalafmt, terraform fmt and autoflake). On pull requests, we have a Travis job that runs the autoformatting, puts the changes in a new commit, and pushes the changes to your PR.
To allow Travis to push changes, we give it an SSH private key in an encrypted zip file. The corresponding public key is configured as a deploy key on our repository. Travis decrypts and unzips the file at runtime, and loads the private key into its Git config.
I’d been tweaking our GitHub deploy keys to manage them with Terraform. Previously the encrypted zip was unpacked into a dedicated (and gitignored) directory; now the files are unpacked into the repo root. Which is fine… until the autoformat script comes along, and tries to add every change to a new commit. It saw the private key as a new, untracked file, and included it in the commit.
How do I stop this happening again?
Aside from rotating out the compromised key, I’ll be making some changes to avoid a repeat of this exact scenario:
Add the private key file to .gitignore. I should have done this already; I was just lazy.
Use git add --update instead of git add --all in our autoformat script. That should stop the script adding random files it finds lying around the repo (which can also include AWS credentials).
The autoformat script gets a list of changed files before it decides whether to commit – if the list is empty, nothing has changed and it can exit early. It knows what sort of files should have changed (e.g. only files ending in .py for the Python formatter), so I’ll change it to error if it spots an unrecognised file.
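As a rough sketch of that last check, assuming the script is written in Python and already has its list of changed paths: the allow-list of suffixes and the function names below are hypothetical, not the real script.

```python
import os

# Hypothetical allow-list: suffixes the formatters are expected to touch.
ALLOWED_SUFFIXES = {".py", ".scala", ".tf"}


def unexpected_files(changed_paths, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return the changed paths whose suffix is not on the allow-list."""
    return [
        path
        for path in changed_paths
        if os.path.splitext(path)[1] not in allowed_suffixes
    ]


def check_changes(changed_paths):
    """Return True if there is something to commit, False if nothing changed.

    Raises RuntimeError if any changed file looks out of place, so the
    build fails instead of committing it.
    """
    if not changed_paths:
        return False
    bad = unexpected_files(changed_paths)
    if bad:
        raise RuntimeError(
            "Refusing to commit unexpected files: %s" % ", ".join(bad)
        )
    return True
```

A file with no extension, such as a private key called id_rsa, has an empty suffix and so is rejected rather than committed.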
This isn’t the first time I’ve made a daft security mistake, and it won’t be the last – and the next one will probably be something completely different. Best I can hope is that I’ll be similarly lucky, and whatever it is won’t be an expensive mistake.
I moved house recently, and this evening I was setting up my iMac. My particular machine is nearly five years old, and like every iMac of the last decade, all the USB ports are on the back of the machine.
It’s awkward to plug stuff into those rear ports, because I can’t see them. I fumble to get something in the right place, and only then do I find out if I even guessed the correct rotation. (Annoying as it is to buy dongles for USB-C, the symmetric connector is a really nice change.)
Sometime since my last move, I bought a USB extension cable and plugged it into one of those ports – and ran the other end round the front of the computer. Suddenly I can see the port, get the rotation right on the first try, and I’m not leaning over my desk to plug stuff in. If you’re an iMac owner who ever uses the USB ports, I really recommend it. They’re fairly cheap on Amazon (example, affiliate link).
In an ideal world, the next iMac design will feature easily accessible ports – until then, this is a good workaround.
When I’m writing scripts, I often have some tabular data that I need to present. This data might show the number of website hits per day, or which pages had the most errors. Here’s an example of the sort of tabular data I mean:
I want to print it in a way that’s easy for me to read, and makes the trends stand out. It’s hard to get a sense of the overall picture without reading the individual numbers – I’d much rather have a bar chart.
If I was being fancy, I could use matplotlib and draw a graphical chart – but if I’m running a shell script in the terminal, it’s not always convenient to display images. I’d rather print something I can view in my terminal – an ASCII bar chart.
There are probably libraries that can do this for you, but I found it simpler to write my own snippet to draw bar charts.
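The snippet itself isn’t reproduced here, so what follows is only a minimal sketch of the idea in Python, not the original code. The function name, the default width and the sample data are all made up for illustration.

```python
def bar_chart(data, width=30):
    """Return the lines of an ASCII bar chart for (label, count) pairs."""
    if not data:
        return []
    max_value = max(count for _, count in data)
    label_width = max(len(label) for label, _ in data)
    lines = []
    for label, count in data:
        # Scale each bar relative to the largest value; if every count
        # is zero, draw empty bars rather than dividing by zero.
        bar_len = round(count / max_value * width) if max_value > 0 else 0
        lines.append(
            "%s | %s %d" % (label.ljust(label_width), "#" * bar_len, count)
        )
    return lines


# Hypothetical data, for illustration only.
hits = [("Mon", 120), ("Tue", 90), ("Wed", 30)]
print("\n".join(bar_chart(hits, width=20)))
```

The longest bar always fills the full width, so the chart shows relative sizes rather than absolute ones. The raw count is printed after each bar for when the exact number matters.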
Last week was Google I/O, and there was a lot of discussion around one of their keynote demos: Google Duplex. This is a service that acts like a human, and makes phone calls on your behalf. You can watch the demos on YouTube – one has Duplex booking a haircut, another trying to book a table at a restaurant.
From a technical perspective, I think it’s very impressive. I still have memories of primitive text-to-speech voices, so the vocal quality of that demo, the understanding of the other person and the near-instant responses all feel very futuristic.
But I’ve heard people dismissing it as a toy for rich lazy people, and that feels a bit ableist to me.
Lots of people have trouble with phone calls, for a variety of reasons. Maybe hearing is difficult. Perhaps they can’t speak, or they have speech difficulties or an accent that make it hard to be understood. Or they have anxiety talking to strangers on the phone, or waiting on hold uses energy they don’t have.
Giving those people a way to use phone trees/voice-only interfaces? That could be a great step forward for accessibility.
Calling it “lazy” is like shaming somebody for not using the stairs, or for buying pre-cut fruit and veg. You might not need it, but maybe they do.
I’m not somebody who needs this, but I feel icky seeing people so quick to pass judgement.
When you set a password for your Twitter account, we use technology that masks it so no one at the company can see it. We recently identified a bug that stored passwords unmasked in an internal log. We have fixed the bug, and our investigation shows no indication of breach or misuse by anyone.
Quite by chance, I spent yesterday fixing a similar bug. I was a bit careless when using the subprocess module, and leaked some AWS credentials into a public CI log.
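This excerpt doesn’t say how the credentials reached the log. One common route with subprocess, shown here purely as an assumption, is that the message of a CalledProcessError includes the full command, so any secret passed as an argument gets printed when the command fails. Here is a hedged sketch of masking known secrets before the error propagates; run_redacted and SECRET are hypothetical names.

```python
import subprocess
import sys

# Hypothetical secret, standing in for a real credential.
SECRET = "hypothetical-secret-value"


def run_redacted(cmd, secrets):
    """Run cmd; on failure, raise an error whose message has secrets masked."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as err:
        # str(err) includes the whole command line, secrets and all.
        message = str(err)
        for secret in secrets:
            message = message.replace(secret, "****")
        # 'from None' hides the original exception from the traceback,
        # because its message still contains the unredacted command.
        raise RuntimeError(message) from None
```

Passing secrets through environment variables instead of arguments avoids the problem at the source. Redaction like this is only a safety net.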
I often find myself needing to edit or inspect the contents of a text file stored in S3.
For example, at work we have a Terraform variables file kept in a private S3 bucket. This contains configuration that we don’t want to put in a public repository – passwords, API credentials, usernames, and so on. If I want to add a new secret to this file, I need to download the existing file, make an edit, then re-upload the file under the same key. It isn’t hard, but it’s moderately tedious to do these steps manually.
Any time you have a repetitive and tedious task, it’s worth trying to find a way to automate it. To that end, I have a function in my shell config that simplifies the process of editing a text file in S3. The function is written for fish, but the concept could be adapted for any shell. It opens the file in my preferred text editor, which is TextMate (invoked with mate).
function s3mate
  set s3key $argv
  set localname (basename $argv)

  pushd (mktemp -d)
    # Download the object from S3. Although we're in a temporary
    # directory, give it a nice name for the sake of the editor.
    aws s3 cp "$s3key" "$localname"
    cp "$localname" "$localname.copy"

    # Open the file in an editor. The '-w' flag to 'mate' means
    # "wait until the file has closed before continuing".
    mate -w "$localname"

    # Is the file different to the original version? If so, save
    # a new copy to S3.
    if not cmp "$localname" "$localname.copy" >/dev/null
      aws s3 cp "$localname" "$s3key"
    end
  popd
end
I call it from a shell by passing it an s3:// URI. For example:
$ s3mate s3://private-bucket/terraform.tfvars
This downloads the object terraform.tfvars from private-bucket into a temporary directory, and opens it in TextMate. I can edit the file as much as I like, then I save and close it. Once the file is closed, it checks to see if I’ve changed anything with cmp(1). If I’ve made changes, it uploads a new copy of the file to the original key.
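For anyone who doesn’t use fish, the same download, edit, compare, upload flow can be sketched in Python. This is an illustration and not a drop-in replacement: edit_roundtrip is a hypothetical name, and the download, edit and upload steps are passed in as callables so the sketch doesn’t hard-code any particular tool.

```python
import filecmp
import os
import shutil
import tempfile


def edit_roundtrip(download, edit, upload, localname):
    """Download a file, let the user edit it, and upload only if it changed.

    download, edit and upload are callables that each take a local path.
    Returns True if a new copy was uploaded, False otherwise.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        # Give the temporary file a nice name for the sake of the editor.
        path = os.path.join(tmpdir, localname)
        download(path)
        # Keep a pristine copy to compare against after editing.
        shutil.copyfile(path, path + ".copy")
        edit(path)
        # shallow=False compares file contents, not just os.stat() metadata.
        if filecmp.cmp(path, path + ".copy", shallow=False):
            return False
        upload(path)
        return True
```

In real use, download and upload might wrap subprocess.check_call(["aws", "s3", "cp", ...]) and edit might run ["mate", "-w", path]. Those wrappers are left out here so the sketch stays self-contained.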
If lots of people were editing this file at once, this approach wouldn’t be safe – I could download and start editing, and somebody else could change the file at the same time. When I uploaded my new version, I’d delete their changes. There’s no safe way to protect against this in S3 – it has no support for transactional updates. Even if you checked the object in S3 hadn’t changed before uploading, it could still change between the check and the upload.
In practice, it’s rare for me to work on a file that has multiple editors, so this isn’t an issue for me – but it is worth noting.
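To make the lost-update problem concrete, here is a tiny simulation in Python. FakeBucket is a hypothetical in-memory stand-in and doesn’t talk to S3; it only mimics the way a put silently replaces whatever is stored under the key.

```python
class FakeBucket:
    """In-memory stand-in for an S3 bucket (hypothetical, illustration only)."""

    def __init__(self):
        self.objects = {}

    def get(self, key):
        return self.objects[key]

    def put(self, key, body):
        # Like an S3 upload, this replaces the object unconditionally.
        self.objects[key] = body


bucket = FakeBucket()
bucket.put("terraform.tfvars", "a = 1\n")

# Two people download the same version of the file...
mine = bucket.get("terraform.tfvars")
theirs = bucket.get("terraform.tfvars")

# ...they upload their edit first, then I upload mine on top of it.
bucket.put("terraform.tfvars", theirs + "b = 2\n")
bucket.put("terraform.tfvars", mine + "c = 3\n")

final = bucket.get("terraform.tfvars")
print(final)
```

The final object contains my line but not theirs. Their edit is gone, and nothing warned either of us.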
Once I had this function, it was only a small tweak to get a version that inspects files, but doesn’t edit them. Viz:
set s3key $argv
set localname (basename $argv)
pushd (mktemp -d)
aws s3 cp "$s3key" "$localname"
This is a talk I gave today for students on Bournemouth University’s Cyber Security Management course. It’s loosely inspired by a talk about privilege and inclusion I gave at PyCon UK last year, focusing on a specific area – online harassment.
The idea is to discuss harassment, and how the design of online services can increase (or decrease) the risk to users. A common mantra is “imagine how an abusive ex will use your service” – this talk is the expanded version of that.
Here’s a brief outline:
What does online harassment look like? With specific examples: harassment, bullying, doxing, threats, and so on. Not everyone faces harassment to the same degree (or at all!), so I wanted to illustrate the sort of risks a user might face.
Threat models: why some groups are more at risk, and the sort of people we should worry about. The abusive ex is an important risk to consider, but who else?
What are some possible good practices? How can service operators reduce the risk to their users? Reviewing some common suggestions – things like blocking, shadow bans, restricting anonymity – what works and what doesn’t.
The aim isn’t to be a comprehensive resource, but to get students thinking about these risks. Harassment is a constantly moving target, and it’s better to anticipate these risks before they happen.
You can read the slides and notes on this page, or download the slides as a PDF. The notes are my lightly edited thoughts about what I was going to say with each slide – but they may not be exactly what I said on the day!
(Caveat: I didn’t quite finish writing up all the notes before the lecture. The PDF slides are the most up-to-date, and I’ll try to go back and update the inline notes soon.)
Content warning: this talk includes discussion of online harassment, misogyny, racism, suicide, domestic abuse, police violence, sexual violence and assault, rape threats and death threats.
A few months back, I tried making a change to the way I handle Instapaper: I added a 24-hour limit to my queue. If I haven’t read something within a day of saving it, I delete it without reading it.
I made the same rule for RSS, Twitter, GitHub issues, and so on. Dealt with in a day, or not at all.
Dealing with something could mean several things. It might be replying to an email, reading an Instapaper item, or it could mean moving it to my todo list. Once something goes on my todo list, it gets prioritised against everything else I have to do. Reading something important becomes the same as doing something important.
As part of this change, I started redirecting as many notifications as possible to email. I can work through all my messages in one go, not poke around in a dozen different apps and inboxes.
I’ve eliminated an (admittedly self-imposed) source of stress – I don’t have an ever-growing backlog to deal with or worry about. Most of this stuff just isn’t that important, and keeping it in check helps me focus on the messages I actually care about.
Overall, I’m very pleased with this change, and I’ll be keeping it.
On Tuesday, I ran my workshop A Plumber’s Guide to Git for the Cambridge Python User Group. I’ve also run it at PyCon UK and in my workplace. On all three occasions, it’s been very popular and I’ve heard people find it useful – on Tuesday, we actually ran out of space in the room! In an attempt to make it more accessible, I’ve written up the entire workshop and posted it on my site.
The aim of the workshop is to understand the underlying concepts of Git. We learn exactly what happens in a typical Git workflow (add, branch, commit, and so on). We look inside the .git directory, and explain what Git stores internally to make those workflows happen.
If that sounds interesting to you, start reading the introduction, and continue from there. I’ve uploaded everything I use in the workshop – my notes, whiteboard sketches, and exercises. Typically it takes about two hours to complete. Enjoy!