# Falsehoods programmers believe about Unix time

With apologies to Patrick McKenzie.

Danny was asking us about our favourite facts about Unix time in the Wellcome Slack yesterday, and I was reminded that it behaves in some completely counter-intuitive ways.

These three facts all seem eminently sensible and reasonable, right?

1. Unix time is the number of seconds since 1 January 1970 00:00:00 UTC
2. If I wait exactly one second, Unix time advances by exactly one second
3. Unix time can never go backwards

False, false, false.

But it’s unsatisfying to say “this is false” without explaining why, so I’ll explain that below. If you’d like to think about it first and make your own guess, pause here before reading on!

All three of these falsehoods have the same underlying cause: leap seconds. If you’re unfamiliar with leap seconds, here’s a brief primer:

There are two factors that make up UTC:

• International Atomic Time, which is an average of hundreds of atomic clocks spread around the globe. We can measure a second from the electromagnetic properties of an atom, and it’s the most accurate measurement of time known to science.

• Universal Time, which is based on the Earth’s rotation about its own axis. One complete rotation is one day.

Problem is, these two numbers don’t always match. The Earth’s rotation isn’t consistent – it’s gradually slowing down, so days in Universal Time are getting longer. Atomic clocks, on the other hand, are fiendishly accurate, and consistent for millions of years.

When the two times drift apart, a leap second is added to or removed from UTC to bring them back together. Since 1972, the IERS (who manage this stuff) have inserted an extra 27 leap seconds. The result is a UTC day with 86,401 seconds (one extra) or 86,399 seconds (one missing) – both of which mess with a fundamental assumption of Unix time.

Unix time assumes that each day is exactly 86,400 seconds long (60 × 60 × 24 = 86,400), leap seconds be damned. If there’s a leap second in a day, Unix time either repeats or omits a second as appropriate to make them match. As of 2019, all 27 of those inserted leap seconds have been omitted from Unix time.
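You can see this assumption in action with Python’s standard library: `calendar.timegm` converts a UTC time tuple to a Unix timestamp by pure arithmetic, with no table of leap seconds in sight.

```python
import calendar

# 31 December 2016 ended with a leap second (23:59:60),
# so that UTC day really contained 86,401 seconds.
dec_31 = calendar.timegm((2016, 12, 31, 0, 0, 0))
jan_1 = calendar.timegm((2017, 1, 1, 0, 0, 0))

# But Unix time allocates exactly 86,400 seconds to every day,
# leap second or not:
print(jan_1 - dec_31)  # → 86400
```
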

And so our falsehoods go as follows:

• Unix time is the number of seconds since 1 January 1970 00:00:00 UTC, minus leap seconds.

• If I wait exactly one second, Unix time advances by exactly one second, unless a leap second has been removed.

So far, there’s never been a leap second removed in practice (and the Earth’s slowing rotation means it’s unlikely), but if it ever happened, it would mean the UTC day was one second shorter. The last UTC second (23:59:59) would be dropped.

Each Unix day has the same number of seconds, so when the next day starts, Unix time skips ahead by one second. The final Unix second of the shorter day never gets allocated to a UTC timestamp. Here’s what that would look like, in quarter-second increments (using small Unix timestamps to keep the numbers readable):

| UTC time    | Unix time |
|-------------|-----------|
| 23:59:57.75 | 99.75     |
| 23:59:58.00 | 100.00    |
| 23:59:58.25 | 100.25    |
| 23:59:58.50 | 100.50    |
| 23:59:58.75 | 100.75    |
| 00:00:00.00 | 102.00    |
| 00:00:00.25 | 102.25    |

If you start at 23:59:58.00 UTC and wait one second, the Unix time advances by two seconds, and the Unix timestamp 101 never gets assigned.

• Unix time can never go backwards, unless a leap second has been added.

This one has happened in practice – 27 times at the time of writing. The UTC day gets an extra second added to the end, 23:59:60. Each Unix day has the same number of seconds, so it can’t just add an extra second – instead, it repeats the Unix timestamps for the last second of the day. Here’s what that would look like, in quarter-second increments (again with small Unix timestamps for readability):

| UTC time    | Unix time |
|-------------|-----------|
| 23:59:59.50 | 100.50    |
| 23:59:59.75 | 100.75    |
| 23:59:60.00 | 101.00    |
| 23:59:60.25 | 101.25    |
| 23:59:60.50 | 101.50    |
| 23:59:60.75 | 101.75    |
| 00:00:00.00 | 101.00    |
| 00:00:00.25 | 101.25    |

If you start at 23:59:60.50 and wait half a second, the Unix time goes back by half a second, and the Unix timestamp 101 matches two UTC seconds.

And these probably aren’t even the only weirdnesses of Unix time – they’re just the ones I half-remembered yesterday, enough to check a few details and write a blog post about.

Time is straaaaaange.

# Creating a locking service in a Scala type class

A few weeks ago, Robert (one of my colleagues at Wellcome) and I wrote some code to implement locking. I’m quite pleased with the code we wrote, and the way we do all the tricky logic in a type class, using functional programming and the Cats library.

I’m going to walk through the code in the post, but please don’t be intimidated if it seems complicated. It took us both a week to write, and even longer to get right!

I’m not expecting many people to use this directly. You can copy/paste it into your project, but unless you have a similar use case to ours, it won’t be useful to you. Instead, I hope you get a better understanding of how type classes work, how they can be useful, and the value of sans-IO implementations.

## The problem

Robert and I are part of a team building a storage service, which will eventually be Wellcome’s permanent storage for digital records. That includes archives, images, photographs, and much more.

We’re saving files to an Amazon S3 bucket¹, but Amazon doesn’t have a way to lock around writes to S3. If more than one process writes to the same location at the same time, there’s no guarantee which will win!

Our pipeline features lots of parallel workers – Docker containers running in ECS, and each container running multiple threads. We want to lock around writes to S3, so that only a single process can write to a given S3 location at a time. We already verify files after they’ve been written, and locking gives an extra guarantee that a rogue process can’t corrupt the archive. Because S3 doesn’t provide those locks for us, we have to manage them ourselves.

This is one use case – there are several other places where we need our own locking. We wanted to build one locking implementation that we could use in lots of places.

## The idea

We already had an existing locking service that used DynamoDB as a backend. It creates locks by writing a row for each lock, and doing a conditional update: “only store this row if there isn’t already a row with this lock ID”. If the conditional update failed, we’d know somebody else was holding the lock.

This code worked fine, but it was closely tied to DynamoDB, and that caused issues.

It was slow and fiddly to test – you needed to set up a dummy DynamoDB instance – and if you were calling the locking service, you needed that test setup as well. It was also closely tied to the DynamoDB APIs, so we couldn’t easily extend or modify it to work with a different backend (for example, MySQL).

We wanted to try writing a new locking service that wasn’t tied to DynamoDB. We’d separate out the locking logic and the database backend, and write something that was easy to extend or modify.

This is the API in the original service which we were trying to replicate:

``````
lockingService.withLocks(Set("1", "2", "3")) {
  // do some stuff
}
``````

The idea of doing it in a type class (so it wasn’t tied to a particular database implementation) isn’t new.

I first came across this idea when working on hyper-h2, an HTTP/2 protocol stack for Python that’s purely in-memory. It only operates on bytes, and doesn’t have opinions about I/O or networking, so it can be reused in a variety of contexts. hyper-h2 is part of a wider pattern of sans-IO network protocol libraries, and many of the same benefits apply here.

## Managing individual locks

First we need to be able to manage a lock around a single resource. We assume the resource has some sort of identifier, which we can use to distinguish locks.

We might write something like this (here, a dao is a data access object):

``````
trait LockDao[Ident] {
  def lock(id: Ident)
  def unlock(id: Ident)
}
``````

This is a generic trait, which manages acquiring and releasing a single lock. It has to decide if/when we can perform each of those operations.

We can create implementations with different backends that all inherit this trait, and which have different rules for managing locks. A few ideas:

• A DynamoDB-backed LockDao for use in production applications
• An in-memory LockDao for use in tests
• A LockDao that expires locks after a certain period if not explicitly unlocked

The type of the lock identifier is a type parameter, `Ident`. An identifier might be a string, or a number, or a UUID, or something else – we don’t have to decide here.

Sometimes we need to acquire more than one lock at once, which needs multiple calls to `lock()` – and then the caller has to remember which locks they’ve acquired to release them. To make it simpler for the caller, we’ve added a second parameter – a context ID – to track which process owns a given lock. A single call to `unlock()` releases all the locks owned by a process.

Here’s what that trait looks like:

``````
trait LockDao[Ident, ContextId] {
  def lock(id: Ident, contextId: ContextId)
  def unlock(contextId: ContextId)
}
``````

As before, the context ID could be any type, so we’ve made it a type parameter, `ContextId`.

Now let’s think about what these methods should return. We need to tell the caller whether the lock/unlock succeeded.

We probably want some context, especially if something goes wrong – so more than a simple boolean. We could use a `Try` or a `Future`, but that doesn’t feel quite right – we expect lock failures sometimes, and it’d be nice to type the errors beyond just `Throwable`.

Eventually we settled upon using an `Either`, with case classes for lock/unlock failures that include some context for the operation in question, and a Throwable that explains why the operation failed:

``````
trait LockDao[Ident, ContextId] {
  type LockResult = Either[LockFailure[Ident], Lock[Ident, ContextId]]
  type UnlockResult = Either[UnlockFailure[ContextId], Unit]

  def lock(id: Ident, contextId: ContextId): LockResult
  def unlock(contextId: ContextId): UnlockResult
}

trait Lock[Ident, ContextId] {
  val id: Ident
  val contextId: ContextId
}

case class LockFailure[Ident](id: Ident, e: Throwable)

case class UnlockFailure[ContextId](contextId: ContextId, e: Throwable)
``````

There’s also a generic `Lock` trait, which holds an Ident and a ContextId. Implementations can return just those two values, or extra data if it’s appropriate. (For example, we have an expiring lock that tells you when the lock is due to expire.)

Now we need to create implementations of this trait!

## Creating an in-memory LockDao for testing

Somebody who uses the LockDao can ask for an instance of that trait, and it doesn’t matter whether it’s backed by a real database or it’s just in-memory. So when we’re testing code that uses the LockDao – but not testing a LockDao implementation specifically – we can use a simple, in-memory implementation. This makes our tests faster and easier to manage!

``````
class InMemoryLockDao[Ident, ContextId] extends LockDao[Ident, ContextId] {
  def lock(id: Ident, contextId: ContextId): LockResult = ???
  def unlock(contextId: ContextId): UnlockResult = ???
}
``````

Because this is just for testing, we can store the locks as a map. When somebody acquires a new lock, we store the context ID in the map. Here’s what that looks like:

``````
case class PermanentLock[Ident, ContextId](
  id: Ident,
  contextId: ContextId
) extends Lock[Ident, ContextId]

class InMemoryLockDao[Ident, ContextId] extends LockDao[Ident, ContextId] {
  private var currentLocks: Map[Ident, ContextId] = Map.empty

  def lock(id: Ident, contextId: ContextId): LockResult =
    currentLocks.get(id) match {
      case Some(existingContextId) if contextId == existingContextId =>
        Right(
          PermanentLock(id = id, contextId = contextId)
        )
      case Some(existingContextId) =>
        Left(
          LockFailure(
            id,
            new Throwable(s"Failed to lock <$id> in context <$contextId>; already locked as <$existingContextId>")
          )
        )
      case None =>
        val newLock = PermanentLock(id = id, contextId = contextId)
        currentLocks = currentLocks ++ Map(id -> contextId)
        Right(newLock)
    }

  def unlock(contextId: ContextId): UnlockResult = ???
}
``````

We have to remember to look for an existing lock, and compare it to the lock that’s requested. It’s fine to call `lock()` if you already have the lock, but you can’t lock an ID that somebody else owns.

Unlocking is much simpler: we just remove the entry from the map.

``````
class InMemoryLockDao[Ident, ContextId] extends LockDao[Ident, ContextId] {
  def lock(id: Ident, contextId: ContextId): LockResult = ...

  def unlock(contextId: ContextId): UnlockResult = {
    currentLocks = currentLocks.filter { case (_, lockContextId) =>
      contextId != lockContextId
    }

    Right(())
  }
}
``````

This gives us a LockDao implementation that’s pretty simple, and which we can use whenever we need a LockDao in tests.

Because it’s only for testing, it doesn’t need to be thread-safe or especially robust. This code is quite simple, so we’re more likely to get it right. When a caller uses this in tests, they can trust the LockDao is behaving correctly and focus on how they use it, and not worry about bugs in the locking code.

Here’s what it looks like in practice:

``````
import java.util.UUID

val dao = new InMemoryLockDao[String, UUID]()

val u1 = UUID.randomUUID
println(dao.lock(id = "1", contextId = u1))               // succeeds
println(dao.lock(id = "1", contextId = UUID.randomUUID))  // fails – "1" is held by a different context
println(dao.lock(id = "2", contextId = UUID.randomUUID))  // succeeds
println(dao.unlock(contextId = u1))
println(dao.lock(id = "1", contextId = UUID.randomUUID))  // succeeds
``````

We also have a small number of tests to check it behaves correctly:

• It locks an ID/context pair
• You can’t lock the same ID under different contexts
• You can lock different IDs under the same context
• You can unlock all the IDs under the same context
• When an ID is unlocked, it can be relocked in a new context

Because there’s no I/O involved, those tests take a fraction of a second to run.

## Creating a concrete implementation of LockDao

Because we work primarily in AWS, we’ve created a LockDao implementation that uses DynamoDB as a backend. This is what we use when running in production.

It fulfills the same basic contract, but it has to be more complicated. It calls the DynamoDB APIs, makes conditional updates, and it expires a lock after a fixed period if it hasn’t been released. If a worker crashes before it can release its locks, we want the system to recover automatically – we don’t want to have to clean up those locks by hand.

I’m not going to walk through it, but you can see this code in our GitHub repo (link at the end of the post).

## Creating the locking service

Now let’s build a locking service. You pass it a set of identifiers and a callback. It has to acquire a lock on each of those identifiers, get the result of the callback, then release the locks and return the result.

Here’s a stub to start us off:

``````
trait LockingService[Ident] {
  def withLocks(ids: Set[Ident])(callback: => ???) = ???
}
``````

For now, let’s put aside the return type of the `callback`, and acquire a lock. We’ll need a lock dao (which can be entirely generic), and a way to create context IDs:

``````
trait LockingService[LockDaoImpl <: LockDao[_, _]] {
  implicit val lockDao: LockDaoImpl

  def withLocks(ids: Set[lockDao.Ident])(callback: => ???) = ???

  def createContextId: lockDao.ContextId
}
``````
``````

We’re asking implementations to tell us how to create a context ID, because the type of context ID will vary, as will the rules for creation. Maybe it’s a worker ID, or a thread ID, or a random ID used once and discarded immediately after.

Then we need to acquire the locks on all the identifiers we’ve received. If we get them all, we can call the callback – but if any of the locks fail, we should release anything we’ve already locked and return without invoking the callback.

Let’s write a method for acquiring the locks:

``````
import grizzled.slf4j.Logging

trait FailedLockingServiceOp

case class FailedLock[ContextId, Ident](
  contextId: ContextId,
  lockFailures: Set[LockFailure[Ident]]) extends FailedLockingServiceOp

trait LockingService[LockDaoImpl <: LockDao[_, _]] extends Logging {
  ...

  type LockingServiceResult = Either[FailedLockingServiceOp, lockDao.ContextId]

  def getLocks(
    ids: Set[lockDao.Ident],
    contextId: lockDao.ContextId): LockingServiceResult = {
    val lockResults = ids.map { lockDao.lock(_, contextId) }
    val failedLocks = getFailedLocks(lockResults)

    if (failedLocks.isEmpty) {
      Right(contextId)
    } else {
      unlock(contextId)
      Left(FailedLock(contextId, failedLocks))
    }
  }

  private def getFailedLocks(
    lockResults: Set[lockDao.LockResult]): Set[LockFailure[lockDao.Ident]] =
    lockResults.foldLeft(Set.empty[LockFailure[lockDao.Ident]]) { (acc, o) =>
      o match {
        case Right(_)         => acc
        case Left(failedLock) => acc + failedLock
      }
    }

  private def unlock(contextId: lockDao.ContextId): Unit =
    lockDao
      .unlock(contextId)
      .leftMap { error =>
        warn(s"Unable to unlock context $contextId fully: $error")
      }
}
``````

The main entry point is `getLocks()`, which gets both the IDs and the context ID we’ve created. As in the InMemoryLockDao, this returns an `Either[…]`, so we get nice context about any locking failures.

First we call `lockDao.lock(…)` on every ID, which gives us a list of `LockResult`s. We look for any failures with `getFailedLocks()` – if there are any, we try to release the locks we’ve already taken, and return a Left. If all the locks succeed, we get a Right.

The unlocking happens in `unlock()`. It attempts to unlock everything, but an unlock failure just gets a warning in the logs, not a full-blown error. We’re already bubbling up an error for the locking failure, and we didn’t think it worth exposing those extra errors. And if the callback succeeds but the unlocking fails, the operation as a whole is still a success and worth returning to the caller.

Then we have to actually invoke the callback, and this bit gets interesting. We want this service to be very generic, and handle different types of function. The callback might return a Future, or a Try, or an Either, or something else. We want to preserve that return type, and combine it with possible locking errors.

So we added another pair of type parameters:

``````
trait LockingService[Out, OutMonad[_], LockDaoImpl <: LockDao[_, _]] {
  ...

  type Process = Either[FailedLockingServiceOp, Out]

  def withLocks(ids: Set[lockDao.Ident])(
    callback: => OutMonad[Out]): OutMonad[Process] = ???
}
``````

We’re starting to get into code that uses more advanced functional programming, and in particular the Cats library. Robert and I were reading the book Scala with Cats as we wrote this code. It’s a free ebook, and I’d recommend it if you want more detail.

Let’s go through this code carefully.

We’ve added two new type parameters: `Out` and `OutMonad[_]`, so the return type of our callback is `OutMonad[Out]`. What’s a monad?

This is the definition that works for me: a type `F` is a monad if:

• It has a type constructor `F[_]` that takes exactly one type parameter.

• There’s a function that takes a value of any type `A` and produces a value of type `F[A]`. This is called a monadic unit.

• If you have a function `A => F[B]`, and a function `B => F[C]`, you can combine these functions to get a single function `A => F[C]`. This is called monadic composition.

Some examples of monads in Scala include `List[_]`, `Option[_]` and `Future[_]`. They all take a single type parameter, have a monadic unit, and you can compose them with `flatMap`.
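The same three ingredients can be sketched outside Scala, too. Here’s a rough Python analogue of `Option`, using `None` for the empty case – the `unit` and `bind` helper names are hypothetical, just for illustration:

```python
from typing import Callable, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def unit(value: A) -> Optional[A]:
    # The monadic unit: wrap a plain value (for Optional, a no-op).
    return value

def bind(value: Optional[A], f: Callable[[A], Optional[B]]) -> Optional[B]:
    # Monadic composition: apply f, short-circuiting on None --
    # the same job Option.flatMap does in Scala.
    return None if value is None else f(value)

def parse_int(s: str) -> Optional[int]:
    return int(s) if s.isdigit() else None

def reciprocal(n: int) -> Optional[float]:
    return 1 / n if n != 0 else None

print(bind(bind(unit("4"), parse_int), reciprocal))     # → 0.25
print(bind(bind(unit("zero"), parse_int), reciprocal))  # → None
```

Chaining `bind` calls is exactly the composition described above: a `str => Optional[int]` and an `int => Optional[float]` combine into a single `str => Optional[float]`.
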

So we expect our callback to return a monad wrapping another type. Inside the service, we’ll get an Either which contains the result of the callback or the locking service error, and then we’ll wrap that Either in the monad type. We’re preserving the monad return type of the callback.

For example, if our callback returns `Future[Int]`, then `OutMonad` would be `Future` and `Out` would be `Int`. The `withLocks(…)` method then returns `Future[Either[FailedLockingServiceOp, Int]]`.

But what if our callback doesn’t return a monad? What if it returns a type like `Int` or `String`? Here we’ll use a bit of Cats: we can imagine these types as being wrapped in the identity monad, `Id[_]`. This is the monad that maps any value to itself, i.e. `id(a: A) = a`.

So even if the callback code isn’t wrapped in an explicit monad, the compiler can still assign the type parameter `OutMonad`, by imagining it as `Id[_]`.
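The identity monad is the degenerate case: its unit is the identity function, and composition is plain function application. A Python sketch (the `id_unit`/`id_bind` names are just for illustration):

```python
def id_unit(value):
    # Unit for the identity monad: the value is its own wrapper.
    return value

def id_bind(value, f):
    # Composition is just ordinary function application.
    return f(value)

print(id_bind(id_unit(21), lambda n: n * 2))  # → 42
```
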

So now we know what type our callback returns, let’s actually call it inside the locking service. For now, assume we’ve already successfully acquired the locks, and we want to run the callback.

``````
import cats.MonadError

case class FailedProcess[ContextId](contextId: ContextId, e: Throwable)
  extends FailedLockingServiceOp

...

type Process = Either[FailedLockingServiceOp, Out]

private def unlock(contextId: lockDao.ContextId): Unit = ...

import cats.implicits._

def safeCallback(contextId: lockDao.ContextId)(
  callback: => OutMonad[Out])(
  implicit monadError: MonadError[OutMonad, Throwable]): OutMonad[Process] = {
  val partialResult: OutMonad[Process] = callback.map { out =>
    unlock(contextId)
    Either.right[FailedLockingServiceOp, Out](out)
  }

  monadError.handleError(partialResult) { err =>
    unlock(contextId)
    Either.left[FailedLockingServiceOp, Out](FailedProcess(contextId, err))
  }
}
``````

We’re bringing in more stuff from Cats here. The type we’ve just imported, `MonadError`, gives us a way to handle errors that happen inside monads – for example, an exception thrown inside a Future.

We call the callback, and wait for it to return (for example, a Future doesn’t complete immediately). If it returns successfully, we map over the result, unlock the context ID, and wrap the result in a Right. We’ve imported `cats.implicits._` so we can map over `OutMonad` and preserve its type. This is the happy path.

If something goes wrong, we use the `MonadError` to handle the error, unlock the context ID, and then wrap the result in a Left. Using this Cats helper ensures we handle the error correctly, and it gets wrapped in the appropriate monad type at the end. This is the sad path.

Either way, we’re waiting for the callback to return and then releasing the locks.

If we had a concrete type like `Future` or `Try`, we’d know how to wait for the result. Instead, we’re handing that off to Cats.

Now we have all the pieces we need to write our `withLocks` method, and here it is:

``````
import cats.data.EitherT

...

type LockingServiceResult = Either[FailedLockingServiceOp, lockDao.ContextId]

def getLocks(
  ids: Set[lockDao.Ident],
  contextId: lockDao.ContextId): LockingServiceResult = ...

def withLocks(ids: Set[lockDao.Ident])(
  callback: => OutMonad[Out])(
  implicit monadError: MonadError[OutMonad, Throwable]): OutMonad[Process] = {
  val contextId: lockDao.ContextId = createContextId

  val eitherT = for {
    contextId <- EitherT.fromEither[OutMonad](
      getLocks(ids = ids, contextId = contextId)
    )
    out <- EitherT(safeCallback(contextId)(callback))
  } yield out

  eitherT.value
}
``````

Hopefully you recognise all the arguments to the function – the IDs to lock over, the callback, and the implicit MonadError (which will be provided by Cats).

That `EitherT` in the for comprehension is another Cats helper. It’s an Either transformer – if you have a monad type `F[_]` and types `A` and `B`, then `EitherT[F[_], A, B]` is a thin wrapper for `F[Either[A, B]]`. It lets us easily swap the `Either` and the `F[_]`.

In the first case, it takes the result of `getLocks()` and wraps it in `OutMonad`.

If getting the locks succeeds, then it calls `safeCallback()` and wraps that in an `EitherT` as well. Once that returns, it extracts the value of the underlying `OutMonad[Either[_, _]]` and returns that result.

And that’s the end of the locking service! In barely a hundred lines of Scala, we’ve implemented all the logic for a locking service – and it’s completely independent of the underlying database implementation.

## Putting the locking service to use

We can combine the generic locking service with the in-memory lock dao, and get an in-memory locking service. Because all the logic is in the type class, this is really short:

``````
import java.util.UUID

import scala.util.Try

val lockingService = new LockingService[String, Try, LockDao[String, UUID]] {
  override implicit val lockDao: LockDao[String, UUID] =
    new InMemoryLockDao[String, UUID]()

  override def createContextId: UUID =
    UUID.randomUUID()
}
``````
``````

This is perfect for testing the locking service logic – because it’s in-memory, it runs really quickly, and we can write lots of tests to check it behaves correctly. Our test cases include checking that it:

• Acquires the locks in the underlying dao
• Returns the result of a successful callback
• Returns a failure if you try to re-lock an already locked identifier, and preserves the original locks
• Allows you to nest locks
• Releases the lock when the callback completes (both success and failure), and allows you to re-lock
• Releases all of its locks if it fails to lock a complete set of IDs
• Succeeds even if unlocking fails

And those tests run in a fraction of a second! Because everything happens in memory, it’s incredibly fast.

And when we have code that uses the locking service, we can drop in the in-memory version for testing that, as well. It makes tests simpler and cleaner elsewhere in the codebase.

When we want an implementation to use in production, we can combine it with a LockDao implementation and get a new locking service implementation. This is the entirety of our DynamoDB locking service:

``````
class DynamoLockingService[Out, OutMonad[_]](
  implicit val lockDao: DynamoLockDao)
    extends LockingService[Out, OutMonad, LockDao[String, UUID]] {

  override protected def createContextId(): lockDao.ContextId =
    UUID.randomUUID()
}
``````
``````

This is the beauty of doing it in a type class – we can swap out the implementation and not have to rewrite any of the tricky lock/unlock logic. It’s a really generic and reusable implementation.

## Putting it all together

All the code this post was based on is in a public GitHub repository, wellcometrust/scala-storage, which is a collection of our shared storage utilities (mainly for working with DynamoDB and S3). These are the versions I worked from:

• Managing individual locks

• Creating an in-memory LockDao for testing

• Creating a concrete implementation of LockDao

• Creating the locking service

• Putting the locking service to use

I’ve also put together a mini-project on GitHub with the code from this blog post alone. It has the type classes, the in-memory LockDao implementation, and a small example that exercises both. All the code linked above (and in this post) is available under the MIT licence.

Writing this blog post was a useful exercise for me. If I want to explain this code, I have to really understand it. There’s no room to handwave something and say “this works, but I’m not sure why”.

And it makes the code better too! As I was writing this post, I spotted several places where the original code was unclear or inefficient. I’ll push those fixes back to the codebase – so not only is this blog post an explanation for future maintainers, but the code itself is clearer as well.

I can’t do this sort of breakdown for all the code I write, but I recommend it if you’re ever writing especially complex or tricky code.

1. Eventually every file will be stored in multiple S3 buckets, all with versioning and Object Locks enabled. We’ll also be saving a copy in another geographic region and with another cloud provider, probably Azure. ↩︎

# Finding unused variables in a Terraform module

At work, we use Terraform to manage our infrastructure in AWS. We use modules to reduce repetition in our Terraform definitions, and we publish them in a public GitHub repo. A while back, I wrote a script that scans our modules and looks for unused variables, so that I could clean them all up.

In this post, I’m going to walk through the script and explain how it works. If you just want the script, you can skip to the end.

## What variables are defined by a single Terraform file?

There’s a Python module for parsing HCL (the configuration language Terraform uses), so let’s use that – much easier and more accurate than trying to detect variables manually. Here’s what that looks like:

``````
import hcl

def get_variables_in_file(path):
    try:
        with open(path) as tf:
            tf_definitions = hcl.load(tf)
    except ValueError as err:
        raise ValueError(f"Error loading Terraform from {path}: {err}") from None

    try:
        return set(tf_definitions["variable"].keys())
    except KeyError:
        return set()
``````

The `hcl.load` method does the heavy lifting. It returns a dictionary, where the keys are the different elements of the Terraform language – `resource`, `variable`, `provider`, and so on. Within the dictionary for each element, you get every instance of that element in the file.

For example, the following Terraform definition:

``````
variable "queue_name" {
  description = "Name of the SQS queue to create"
}

resource "aws_sqs_queue" "q" {
  name = "${var.queue_name}"
}

resource "aws_sqs_queue" "dlq" {
  name = "${var.queue_name}_dlq"
}
``````
``````

gets a dictionary a bit like this:

``````
{
    "resource": {
        "aws_sqs_queue": {
            "dlq": ...,
            "q": ...
        }
    },
    "variable": {
        "queue_name": ...
    }
}
``````

Getting the list of keys in the `variable` block (if it’s present) tells us the variables defined in this file.

Sometimes you’ll discover the Terraform inside a file is just malformed (or the file is empty!) – so we wrap the exception we receive to include the file path. The `from None` disables exception chaining in Python 3, and makes the traceback a little cleaner.

## What variables are defined by a Terraform module?

Once we can get the variables defined by a single file, we can get all the variables defined in a module.

A module is a collection of Terraform files in the same directory, so we can find them by using `os.listdir`, like so:

``````
import os

def tf_files_in_module(dirname):
    for f in os.listdir(dirname):
        if f.endswith(".tf"):
            yield f

def get_variables_in_module(dirname):
    all_variables = {}

    for f in tf_files_in_module(dirname):
        for varname in get_variables_in_file(os.path.join(dirname, f)):
            all_variables[varname] = f

    return all_variables
``````

This returns a map from (variable name) to (file where the variable was defined). If a variable turns out to be redundant, knowing which file it was defined in will be helpful when we go back to delete it.

## Does a module have any unused variables?

Once we have a list of variables defined in a module, we need to go back to see which of them are in use. I haven’t found such a good way to do this – right now the best I’ve come up with is to look for the string `var.VARIABLE_NAME` in all the files. It’s a bit crude, but seems to work.

Here’s the code:

``````
def find_unused_variables_in_module(dirname):
    unused_variables = get_variables_in_module(dirname)

    for f in tf_files_in_module(dirname):
        if not unused_variables:
            return {}

        with open(os.path.join(dirname, f)) as tf:
            tf_src = tf.read()

        for varname in list(unused_variables):
            if f"var.{varname}" in tf_src:
                del unused_variables[varname]

    return unused_variables
``````

We start by getting a list of all the variables defined in the module.

Then we go through the files in the module, one-by-one. If we don’t have any unused variables left, we can exit early – checking the rest of the files won’t tell us anything new. Otherwise, we open the file, read the Terraform source, and look for instances of the variables we haven’t seen used yet. If we see a variable in use, we delete it from the dict.

We have to iterate over `list(unused_variables)` rather than `unused_variables` itself, because we’re deleting elements from that dict as we go along. If you don’t make it a list first, you’ll get an error when you delete the first element: “dictionary changed size during iteration”.
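Here’s a minimal, standalone sketch of that behaviour:

```python
# Iterating over a snapshot of the keys lets us delete entries safely:
d = {"a": 1, "b": 2}
for key in list(d):
    del d[key]
print(d)  # → {}

# Deleting while iterating over the dict itself raises an error:
d = {"a": 1, "b": 2}
try:
    for key in d:
        del d[key]
except RuntimeError as err:
    print(err)  # → dictionary changed size during iteration
```
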

If the module uses all of its variables, we get back an empty dict. If there are unused variables, we get a dict that tells us which variables aren’t being used, and which file they’re defined in.

## Looking at all the modules in a repo

Our terraform-modules repo defines dozens of modules, and I wouldn’t want to check them all by hand. Instead, it’s easier (and faster!) to use `os.walk` to look through every directory in the repo. For a quick speedup, we can look for filenames ending with `.tf` to decide whether a particular directory is a module.

Here’s some code:

``````
def find_unused_variables_in_tree(root):
    for mod_root, _, filenames in os.walk(root):
        if not any(f.endswith(".tf") for f in filenames):
            continue

        unused_variables = find_unused_variables_in_module(mod_root)

        if unused_variables:
            print(f"Unused variables in {mod_root}:")
            for varname, filename in unused_variables.items():
                print(f"* {varname} ~> {os.path.join(mod_root, filename)}")
            print("")
``````

And I wrap that in a little main block:

``````
import sys

if __name__ == "__main__":
    try:
        root = sys.argv[1]
    except IndexError:
        root = "."

    find_unused_variables_in_tree(root)
``````

This means I can pass a directory to the script, and it looks for unused variables under that directory – or if I don’t pass an argument, it looks in the current directory.

## Putting it all together

Here’s the final version of the code:

``````
import os
import sys

import hcl

def get_variables_in_file(path):
    try:
        with open(path) as tf:
            tf_definitions = hcl.load(tf)
    except ValueError as err:
        sys.exit(f"Error loading {path}: {err}")

    try:
        return set(tf_definitions["variable"].keys())
    except KeyError:
        return set()

def tf_files_in_module(dirname):
    for f in os.listdir(dirname):
        if f.endswith(".tf"):
            yield f

def get_variables_in_module(dirname):
    all_variables = {}

    for f in tf_files_in_module(dirname):
        for varname in get_variables_in_file(os.path.join(dirname, f)):
            all_variables[varname] = f

    return all_variables

def find_unused_variables_in_module(dirname):
    unused_variables = get_variables_in_module(dirname)

    for f in tf_files_in_module(dirname):
        if not unused_variables:
            return {}

        with open(os.path.join(dirname, f)) as tf:
            tf_src = tf.read()

        for varname in list(unused_variables):
            if f"var.{varname}" in tf_src:
                del unused_variables[varname]

    return unused_variables

def find_unused_variables_in_tree(root):
    for mod_root, _, filenames in os.walk(root):
        if not any(f.endswith(".tf") for f in filenames):
            continue

        unused_variables = find_unused_variables_in_module(mod_root)

        if unused_variables:
            print(f"Unused variables in {mod_root}:")
            for varname, filename in unused_variables.items():
                print(f"* {varname} ~> {os.path.join(mod_root, filename)}")
            print("")

if __name__ == "__main__":
    try:
        root = sys.argv[1]
    except IndexError:
        root = "."

    find_unused_variables_in_tree(root)
``````

When I originally ran this script, it turned up a lot of unused variables, and I cleaned up the entire repo in one go. I don’t use it very often, because the modules don’t change as much as they used to, but it’s useful to have it. I run it once in a blue moon, and clean up anything it tells me about.

It even exposed a few bugs! It flagged a variable as being unused, even though it was one we expected the module to be using. When I went to look, I found a configuration error or a typo – once that was fixed, the variable was in use, and the script was happy.

I’ve also used this code to look for unused `local`s – but I’ll leave that as an exercise for the reader.

# Reversing a t.co URL to the original tweet

If you post a link on Twitter, it goes through Twitter’s t.co link-shortening service. The link in the tweet text is replaced with a t.co URL, and that URL redirects to the original destination.

If you’re just reading Twitter, the presence of t.co is mostly invisible – it’s not shown in the interface, and if you click on a URL you get to the original destination.

A t.co URL is an HTTP 301 Redirect to the destination, which any browser or HTTP client can follow (as long as Twitter keeps running the service). For example:

``````
>>> import requests
>>> resp = requests.head("https://t.co/...")  # any t.co link
>>> resp.status_code
301
>>> resp.headers["location"]
'https://www.bbc.co.uk/news/blogs-trending-47975564'
``````

But what if you only have the t.co URL, and you want to find the original tweet? For example, I see t.co URLs in my referrer logs – people linking to my blog – and I want to know what they’re saying about me!

Twitter don’t provide a public API for doing this, so there’s no perfect way to reverse a t.co URL back to its source. I have found a couple of ways to do it, and in this post I’ll explain how.

## The manual approach

If you search for a t.co URL in Twitter, you can see tweets which include it. If the tweet is recent and visible to you, it shows up in the results:

Sometimes you might find multiple tweets that include the same URL. I’ve seen this happen when somebody posts the same link several times:

If you only need to search for a couple of URLs, this is probably fine.

## The Python approach

Because I need to do this a lot, I wanted to automate the process. Twitter have a search API which provides similar data to the Twitter website, so by calling this API we can mimic the search interface. I wrote a Python script to do it for me, which I’ll walk through below.

First we need to authenticate with the Twitter API. You’ll need some Twitter API credentials, which you can get through Twitter’s developer site.

In the past I used tweepy to connect to the Twitter APIs, but these days I prefer to use the requests-oauthlib library and make direct requests. We create an OAuth session:

``````
from requests_oauthlib import OAuth1Session

sess = OAuth1Session(
    client_key=CLIENT_KEY,
    client_secret=CLIENT_SECRET,
    resource_owner_key=RESOURCE_OWNER_KEY,
    resource_owner_secret=RESOURCE_OWNER_SECRET,
)
``````

Then we can call the search API like so:

``````
resp = sess.get(
    "https://api.twitter.com/1.1/search/tweets.json",
    params={
        "q": TCO_URL,
        "count": 100,
    }
)
``````

The `q` parameter is the search query, which in this case is the t.co URL. We get as many tweets as possible (you’re allowed up to 100 tweets in a single request).

We extract the tweets like so:

``````statuses = resp.json()["statuses"]
``````

The API represents every retweet as an individual status, so a tweet with three retweets would have four entries in this response – one for the original tweet, and three more for each of the retweets. The Twitter web UI handles that for us, and consolidates them into a single result. We have to do that manually.

If a tweet from the API is a retweet, it has a `retweeted_status` key that contains the original tweet. Let’s look for that, and build tweet URLs accordingly:

``````
tweet_urls = set()

for status in statuses:
    try:
        tweet = status["retweeted_status"]
    except KeyError:
        tweet = status

    url = "https://twitter.com/%s/status/%s" % (
        tweet["user"]["screen_name"], tweet["id_str"]
    )

    tweet_urls.add(url)
``````

This gives us the URLs for tweets that use or mention the t.co URL we were looking for.

If we want to be stricter, we could check that these tweets include the t.co short URL in their URL entities. (In the Twitter API, an “entity” is metadata or extra context for the tweet – images, videos, URLs, that sort of thing.) We add `"include_entities": True` to the parameters in our API call, then modify our `for` loop slightly:

``````
for status in statuses:
    ...

    if not any(u["url"] == TCO_URL for u in tweet["entities"]["urls"]):
        continue

    url = "..."
``````

Putting this all together gives us the following function:

``````
from requests_oauthlib import OAuth1Session

sess = OAuth1Session(
    client_key=CLIENT_KEY,
    client_secret=CLIENT_SECRET,
    resource_owner_key=RESOURCE_OWNER_KEY,
    resource_owner_secret=RESOURCE_OWNER_SECRET,
)

def find_tweets_using_tco(tco_url):
    """
    Given a shortened t.co URL, return a set of URLs for tweets that use this URL.
    """
    resp = sess.get(
        "https://api.twitter.com/1.1/search/tweets.json",
        params={
            "q": tco_url,
            "count": 100,
            "include_entities": True
        }
    )

    statuses = resp.json()["statuses"]

    tweet_urls = set()

    for status in statuses:
        # A retweet shows up as a new status in the Twitter API, but we're only
        # interested in the original tweet.  If this is a retweet, look through
        # to the original.
        try:
            tweet = status["retweeted_status"]
        except KeyError:
            tweet = status

        # If this tweet shows up in the search results for a reason other than
        # "it has this t.co URL as a short link", it's not interesting.
        if not any(u["url"] == tco_url for u in tweet["entities"]["urls"]):
            continue

        url = "https://twitter.com/%s/status/%s" % (
            tweet["user"]["screen_name"], tweet["id_str"]
        )

        tweet_urls.add(url)

    return tweet_urls
``````

I’ve been using this code to reverse t.co URLs that appear in my web analytics for a while now. It works about as well as the website, but I find it quicker to use.

## Limitations

Not all t.co URLs come from a tweet.

If you post a link in your profile, that gets shortened as well. But as far as I can tell, there’s no way to go from a shortened profile link back to the original profile page. If you search for the shortened URL, you don’t find anything.

Also, if the original tweet is from an account you can’t see (maybe they’re private or they’ve blocked you), it won’t show up in your searches.

# Some tips for conferences

My first tech conference was PyCon UK, back in September 2016. Since then, I’ve been to a dozen or so tech conferences – most recently ACCU 2019 – and I’m enjoying them more now than when I started. This post is a list of some of the things I’ve learnt that make conferences more enjoyable.

The short version: when to go to sessions and when to have conversations, pace yourself for socialising, and pack carefully.

### Distinguish between “must see” and “nice to see” sessions

When I was first going to conferences, I tried to go to a talk or workshop in every slot. That’s fine, but sessions aren’t the only important thing at a conference – the conversations between sessions are important too! I had several conversations that I cut short to go to a session where, in hindsight, I might have been better off skipping the session and continuing the conversation. Most conferences video their sessions, so I could have caught up later.

These days, I split sessions into “must see” and “nice to see”. It helps me decide if I really want to end a conversation and go to a session, or if I’d rather stay and chat.

### Know how to end a conversation respectfully

Conversations are important, but sometimes they aren’t going anywhere. That’s okay too!

When I think I’ve hit a dead end, I say something like “It was lovely to chat to you, and now I’m going to talk to some other people”, and offer a handshake. It’s polite, respectful, and nobody has ever been upset when I say that. It leaves a good final impression.

Don’t flub it with a feeble excuse about going to the toilet or fetching a drink, then not coming back. You’re leaving the conversation, so own it.

### Follow the Pac-Man rule

The Pac-Man rule is an idea from Eric Holscher, which at its core is this: When standing as a group of people, always leave room for 1 person to join your group.

That physical gap helps people feel like they can join the group. It’s a nice way to help newcomers feel included, and for you to meet new people. For more explanation, I recommend Eric’s original blog post.

### When somebody joins the conversation, give them some context

This is a tip I got from Samathy Barratt at ACCU.

When somebody joins your conversation, give them some quick context so they know what you were just talking about. It doesn’t have to be much; just a sentence or two will do. For example, “We’re talking about exception handling in C++.” It implicitly welcomes them to the conversation, and means they can take part more quickly – they don’t have to try to guess the context.

### Expect to crash after (or during) the conference

Conferences can be very intense – you’re meeting lots of people, learning new information, having conversations – and that can be tiring.

During a conference, I always put aside time to rest, away from the bustle of the conference. Whether that’s in the quiet room, a nearby green space, or just in the corridor while everyone else is in a session, it helps me recharge and enjoy the next part of the conference.

After the conference ends, I usually have an emotional crash. I’ve spent a few days meeting people and spending time with friends I don’t usually see, and coming down from that is hard. I always plan a quiet day at home (and some annual leave at work) after a long trip.

### Plan to visit the location beforehand, not after

For the last two years I’ve stayed in Cardiff for a few days after PyCon UK ends. I wanted to rest and see a bit of the city, but it was tinged with melancholy. It was weird to wake up, walk through Cardiff, and there was nobody from the conference. All my friends had gone home; it was just me left.

Next time I’ll try to visit before the conference starts, and go home at the same time as everyone else.

### Stuff to pack

I have a fairly long checklist of things to pack for away-from-home travel. These are a few items that I find especially useful for conferences:

• An external phone battery. Your phone will probably get lots of use during the day, and you don’t want it running out during the evening. (Particularly if you’re making dinner plans.) I carry a lipstick-sized Anker battery to keep my phone topped up all day long.

• Paper and pen. Typing is great, but I still prefer the flexibility of pen and paper for scribbling notes during a talk.

• Water bottle. Don’t get dehydrated!

• A/V kit (if you’re presenting). I have a travelling tech bag that I take to events which has all my A/V adapters and a spare remote. You don’t need all of that, but if you’re presenting it’s useful to at least have your own set of adapters. This has saved me hassle more than once.

• Medication. If you have regular meds you don’t need a reminder to bring them along. But there are probably other meds you take on occasion, and they might be useful too – I always throw in a pack of painkillers and hayfever tablets when I’m going on a trip. I carry them all day, so I never lose conference time finding a pharmacy in a strange city.

# Getting a transcript of a talk from YouTube

When I give conference talks, my talks are often videoed and shared on YouTube. Along with the video, I like to post the slides afterwards, and include an inline transcript. A written transcript is easier to skim, to search, and for Google to index. Plus, it makes the talk more accessible for people with hearing difficulties. Here’s an example from PyCon UK last year: Assume Worst Intent.

I share a transcript rather than pre-prepared notes because I often ad lib the content of my talks. I might add or remove something at the last minute, make subtle changes based on the mood of the audience, or make a reference to a previous session that wasn’t in my original notes. A transcript is a more accurate reflection of what I said on the day.

Some conferences have live captioning (a human speech-to-text reporter transcribing everything I say, as I say it), which does the hard work for me! That’s great, and those transcripts are very high quality – but not every event does this.

If I have to do it myself, writing a new transcript is a lot of work, and slows down posting the slides. So what I do instead is lean on YouTube to get a first draft of a transcript, and then I tidy it up by hand.

YouTube uses speech-to-text technology to automatically generate captions for any video that doesn’t already have them (in a handful of languages, at least). It’s not fantastically accurate, but it’s close enough to be a useful starting point. I can edit and polish the automatically generated transcript much faster than I could create my own from scratch.

## How I do it

First, I use youtube-dl to download the automatically generated subtitles, without downloading the video itself:

``````
$ youtube-dl --write-auto-sub --skip-download "https://www.youtube.com/watch?v=XyGVRlRyT-E"
``````

This saves a `.vtt` subtitle file in the current directory.

The `.vtt` file is a format meant for video players – it describes what words should appear on the screen, when. Here’s a little snippet:

``````
00:00:00.030 --> 00:00:03.500 align:start position:0%

again<c.colorE5E5E5><00:00:01.669><c> since</c><00:00:02.669><c> you've</c><00:00:02.790><c> already</c><00:00:02.970><c> heard</c></c><c.colorCCCCCC><00:00:03.300><c> from</c><00:00:03.449><c> me</c></c>

00:00:03.500 --> 00:00:03.510 align:start position:0%
again<c.colorE5E5E5> since you've already heard</c><c.colorCCCCCC> from me
</c>
``````

It’s a mixture of timestamps, colour information, and the text to display. To turn this into something more usable, I have a Python script that goes through and extracts just the text. It’s a mess of regular expressions, not a proper VTT parser, but it does the trick. You can download the script from GitHub.
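The script itself isn’t reproduced here, but the core idea can be sketched in a few lines. This is a simplification of what my script does, not a copy of it – it assumes `vtt_src` holds the file contents, and relies on the fact that auto-generated captions repeat each cue, so adjacent duplicate lines get dropped:

```python
import re

def vtt_to_text(vtt_src):
    """Extract just the caption text from an auto-generated .vtt file."""
    lines = []
    for line in vtt_src.splitlines():
        # Skip the header, timestamp lines, and blank lines
        if line.startswith("WEBVTT") or "-->" in line or not line.strip():
            continue
        # Strip inline tags: colour markup like <c.colorE5E5E5> and
        # per-word timestamps like <00:00:01.669>
        cleaned = re.sub(r"<[^>]+>", "", line).strip()
        # Auto-generated captions repeat each line; drop adjacent duplicates
        if cleaned and (not lines or cleaned != lines[-1]):
            lines.append(cleaned)
    return "\n".join(lines)
```

Running this over the snippet above collapses it to a single line of text.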

This gives me just the content of the captions:

``````
again since you've already heard from me
before I'll skip the introduction and
gets right into the talk we're talking
``````

I save that to a file, then I go through that text to add punctuation and tidy up mistakes. If it’s not clear from the transcript what I was saying, I’ll go back and rewatch the video, but I only need a few seconds at a time.

## Observations

The YouTube auto-captioning software is good, but far from perfect. Here are a few changes I’m especially used to making:

• It really struggles on proper names. If I have a human captioner on the day, I’ll tell them the names in advance so they know what to expect, but there’s no way to do the same for YouTube.

• It prefers US spellings for words, like “color” or “favorite” or “realize”. Since I use British spellings on this blog, I change all of those.

• It can struggle with homophones. A recent example: “you all write documentation” became “you all right documentation”, which sounds the same but makes less sense.

• It’s very unforgiving with my verbal tics. I see just how often I say phrases like “I think”, “so” and “because”. This is useful feedback for me, but I edit them out of the finished transcript, because they’re just verbal noise.

Overall it’s a lot faster than writing a transcript from scratch, and a lot kinder to my hands. I spend most of my time reading, not typing, and it takes much less time from start to finish.

If you need some captions and you don’t have the time or money for a complete human transcript, the YouTube auto-generated captions are a good place to start.

# How I back up my computing devices, 2019 edition

About a fortnight ago, there was lots of news coverage about Myspace losing 12 years of uploaded music. I never had a Myspace account, so I didn’t lose anything on this occasion, but it was a prompt to think about how I back up my computing devices.

A lot of my work and documents only exist on a computer. That includes most of my personal photographs, all my code and prose, and many of the letters I receive (physical copies of which get scanned and shredded). It’s scary to imagine losing any of that data, so I have a number of systems to keep it backed up and secure.

These are the notes I made on my backup system.

## Requirements

These are the things I think make a good backup system:

• Redundancy. If there’s a file I care about, I should have at least two copies, ideally three. Two is one, one is none.

• No single points of failure. I want to spread my copies around, so it’s very difficult for them all to be deleted at once. That includes:

• Buying hard drives from different batches (to avoid a defect in a single batch)
• Offsite backups (so my house burning down wouldn’t destroy all my backups)
• Using different software for different backups (so a bug in one product isn’t catastrophic)
• Fast recovery. My day job involves using a computer. If I have to wait hours or days to recover key data from a disk failure, that affects my ability to do my job.

• Automated backups. If I have to remember to do something, it won’t happen very often. If I can get the computer to perform my backups automatically, they’ll be more reliable and I’ll have more up-to-date backups.

## My devices

I have three devices that have important data:

• An always-on iMac, which is my main computer at home
• A MacBook which I use when I’m away from home
• An iPhone

I also have a work laptop, but I let IT manage its backups. It has less data that I personally care about, and corporate IT policies tend to frown upon people making unauthorised copies of company data.

I also have a lot of data tied up in online accounts (Twitter, Dreamwidth, Tumblr, and so on), and I try to keep separate copies of that. How I back up that data is a subject for a separate post.

## My setup

Because my iPhone and my laptop are both portable devices, and I take them out of the house regularly, I assume I could lose or break them at any time. (Many years ago, I lost my first two phones in quick succession.) I try not to keep important files on them for long, and instead copy the files to my iMac – where they get backed up in multiple ways.

Here’s what I do to secure my files:

### Full-disk encryption

My scanned documents have a lot of personal information, including my bank details, home address, and healthcare records. I don’t want that to be readily available if my phone or laptop get stolen, so I do full-disk encryption on both of them.

On my iMac and MacBook, I’ve turned on FileVault encryption. On my iPhone, I’m using the encryption provided by iOS.

### iCloud Photo Stream and iCloud Backups

Any photos I take on my iPhone are automatically uploaded to iCloud Photo Stream, and I have an iCloud backup of the entire phone. My iMac downloads the original, full-resolution file for every photo I store in Photo Stream, so I’m not relying on Apple’s servers. Because the iMac is always running, it usually downloads an extra copy very quickly.

When I’m using a camera with an SD card, I transfer photos off the SD card to my phone at the end of the day, and I upload those to iCloud Photo Stream as well.

I’m paying for a 200GB iCloud storage plan (£2.49/month), which is easily enough for my needs.

### File sync with Dropbox and GitHub

When I’m actively working on something, I keep the relevant files on GitHub (if it’s code) or Dropbox (if it’s not). That’s a useful short-term copy of all those files, and keeps them in sync between devices.

### Two full disk clones of my iMac, kept at home

I have a pair of Western Digital hard drives plugged into my iMac, and I use SuperDuper to create bootable clones of its internal drive every 24 hours. One backup runs in the early morning before I start work, one in the late evening when I’m in bed.

I space out the clones to reduce the average time since the last backup, and to give me more time to spot if SuperDuper is having issues before it affects both drives.

The drives are permanently mounted; ideally I’d only mount them when SuperDuper is creating a clone.

Both these drives are encrypted with FileVault. They never leave my desk, but it means I don’t have to worry about a burglar getting my personal data.

### A full disk clone of my iMac, kept at the office

I have a portable bus-powered Seagate hard drive, and SuperDuper creates a bootable clone of my iMac whenever it’s plugged in. This disk usually lives in a drawer at work, thirty miles from home, so if my home and the local drives are destroyed (say, by fire or flood), I still have an easy-to-hand backup.

Once a fortnight, I bring the drive home, plug it into the iMac, and update the clone.

I encrypt this drive so it’s not a disaster if I lose it somewhere between home and the office.

Both this and the permanently plugged-in drives are labelled with their initial date of purchase. Conventional wisdom is that hard drives are reliable for about 3–4 years; the label gives me an idea of whether it’s time to replace a particular drive.

### Remote backups with Backblaze

I run Backblaze to continuously make backups of both my iMac and my MacBook.

This is a last resort. Restoring my entire disk from Backblaze would be slow and expensive, but it means that even if all my physical drives are destroyed, I have an extra copy of my data.

But it’s handy at other times, even if I’m not doing a complete restore – if I’m on my laptop and I realise I need a file that’s only on my iMac, I can restore a single file from Backblaze. It’s a good way to shuffle files around in a pinch.

## Keeping it up-to-date

The most recent addition to this setup is the portable iMac clone.

When I was moving house last year, I had my iMac and all my backup drives in the same car. If I’d had an accident, all my backups would disappear at once, and I’d be stuck downloading 600GB of files from Backblaze. The extra drive was a small cost, but should make it much easier to restore if that worst-case scenario ever happens.

Soon I need to replace the other drives plugged into my iMac – they’re both three years old, and approaching the end of their reliable lives. The current pair are both desktop hard drives, with dedicated power supplies. I’ll probably replace them with bus-powered, portable drives, to tidy up my desk.

I don’t have any local backups of my laptop, and I’m not planning to change that. The only files I keep on the laptop are things I’m actively working on, which also go in Dropbox and GitHub.

So that’s my backup system.

It’s not perfect, but I’m happy with it. My last drive failure was three years ago, and I didn’t lose a single file. I don’t lose sleep wondering if a disk is about to fail and lose all my data.

If you already have a backup system in place, use the Myspace disaster as a prompt to review it. Are there gaps? Single points of failure? Could it be improved or made more resilient?

And if you don’t have a backup system, please get one! Data loss is miserable, and your disk is going to fail – it’s only a matter of when, not if.

# Creating a GitHub Action to auto-merge pull requests

GitHub Actions is a new service for “workflow automation” – a sort-of scriptable GitHub. When something happens in GitHub (you open an issue, close a pull request, leave a comment, and so on), you can kick off a script to take further action. The scripts run in Docker containers inside GitHub’s infrastructure, so there’s a lot of flexibility in what you can do.

If you’ve not looked at Actions yet, the awesome-actions repo can give you an idea of the sort of things it can do.

I love playing with build systems, so I wanted to try it out – but I had a lot of problems getting started. At the start of March, I tweeted in frustration:

I really like the idea of GitHub Actions, but every time I try to use it I feel like an idiot.

I don’t know if it’s me or the UI, but I cannot understand what it’s doing, and things seem to be happening at random.

If something makes me feel stupid, I’m not going to use it.

A few days later, I got a DM from Angie Rivera, the Product Manager for GitHub Actions. We arranged a three-way call with Phani Rajuyn, one of GitHub’s software engineers, and together we spent an hour talking about Actions. I was able to show them the rough edges I’d been hitting, and they were able to fill in the gaps in my understanding.

After our call, I got an Action working, and I’ve had it running successfully for the last couple of weeks.

In this post, I’ll explain how I wrote an Action to auto-merge my pull requests. When a pull request passes tests, GitHub Actions automatically merges the PR and then deletes the branch:

If you just want the code, skip to the end or check out the GitHub repo.

## The problem

I have lots of “single-author” repos on GitHub, where I’m the only person who ever writes code. The source code for this blog is one example; my junk drawer repo is another.

I have CI set up on some of those repos to run tests and linting (usually with Travis CI or Azure Pipelines). I open pull requests when I’m making big changes, so I get the benefit of the tests – but I’m not waiting for code review or approval from anybody else. What used to happen is that I’d go back later and merge those PRs manually – but I’d rather they were automatically merged if/when they pass tests.

Here’s what I want to happen:

• I open a pull request
• A CI service starts running tests, and they pass
• The pull request is merged and the branch deleted

This means my code is merged immediately, and I don’t have lingering pull requests I’ve forgotten to merge.

I’ve experimented with a couple of tools for this (most recently Mergify), but I wasn’t happy with any of them. It felt like GitHub Actions could be a good fit, and give me lots of flexibility in deciding whether a particular pull request should be merged.

## Creating a “Hello World” Action

Let’s start by creating a tiny action that just prints “hello world”. Working from the example in the GitHub Actions docs, create three files:

``````
# .github/main.workflow
workflow "on pull request pass, merge the branch" {
  resolves = ["Auto-merge pull requests"]
  on       = "check_run"
}

action "Auto-merge pull requests" {
  uses = "./auto_merge_pull_requests"
}
``````
``````
# auto_merge_pull_requests/Dockerfile
FROM python:3-alpine

MAINTAINER Alex Chan <alex@alexwlchan.net>

LABEL "com.github.actions.name"="Auto-merge pull requests"
LABEL "com.github.actions.description"="Merge the pull request after the checks pass"
LABEL "com.github.actions.icon"="activity"
LABEL "com.github.actions.color"="green"

COPY merge_pr.py /
ENTRYPOINT ["python3", "/merge_pr.py"]
``````
``````
# auto_merge_pull_requests/merge_pr.py
#!/usr/bin/env python
# -*- encoding: utf-8

if __name__ == "__main__":
    print("Hello world!")
``````

The Dockerfile and Python script define a fairly standard Docker image, which prints `"Hello world!"` when you run it. This is where we’ll be adding the interesting logic. I’m using Python instead of a shell script because I find it easier to write safe, complex programs in Python than in shell.

Then the `main.workflow` file defines the following series of steps:

• When the `check_run` event fires, run the `Auto-merge pull requests` action
• When the action runs, build and run the Docker image defined in `./auto_merge_pull_requests`
• When the Docker image runs, print `"Hello world!"`

I had a lot of difficulty understanding how the `check_run` event works, and Phani and I spent a lot of time discussing it on our call.

A check run is a third-party CI integration, like Travis or Circle CI. A check run event is fired whenever the state of a check changes. That includes:

• When the check is scheduled (GitHub tells Travis “please run these tests”)
• When a check starts (Travis tells GitHub “I have started running these tests”)
• When a check completes (Travis tells GitHub “These tests are finished, here is the result”)

That last event is what’s interesting to me – if the tests completed and they’ve passed, I want to take further action.
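For reference, here’s a rough sketch of the parts of the check_run payload that matter here. The field names follow the GitHub developer docs; the values are invented for illustration:

```python
# An abbreviated check_run event payload (values invented for illustration).
event_data = {
    "action": "completed",
    "check_run": {
        "name": "continuous-integration",
        "status": "completed",    # "queued", "in_progress" or "completed"
        "conclusion": "success",  # populated once the check has completed
        "pull_requests": [
            {
                "number": 123,
                "head": {"ref": "my-feature-branch"},
                "base": {"ref": "master"},
            }
        ],
    },
}

check_run = event_data["check_run"]
print(check_run["status"], check_run["conclusion"])  # completed success
```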

What confused me is that not all CI integrations use the Checks API – in particular, a lot of my Travis setups were using a legacy integration that doesn’t involve checks. Travis started using the Checks API nine months ago, but I missed the memo, and hadn’t migrated my repos. Until I moved to the Checks integration, it looked as if GitHub was just ignoring my builds.

We start by loading the event data. When GitHub Actions runs a container, it includes a JSON file with data from the event that triggered it. It passes the path to this file as the `GITHUB_EVENT_PATH` environment variable. So let’s open and load that file:

``````
import json
import os

if __name__ == "__main__":
    event_path = os.environ["GITHUB_EVENT_PATH"]
    event_data = json.load(open(event_path))
``````

We only want to do something if the check run is completed, otherwise we don’t have enough information to determine if we’re ready to merge. The GitHub developer docs explain what the fields on a check_run event look like, and the “status” field tells us the current state of the check:

``````
import sys

if __name__ == "__main__":
    ...
    check_run = event_data["check_run"]
    name = check_run["name"]

    if check_run["status"] != "completed":
        print(f"*** Check run {name} has not completed")
        sys.exit(78)
``````

Calling `sys.exit` means we bail out of the script, and don’t do anything else. In a GitHub Action, exit code 78 is a neutral status. It’s a way to say “we didn’t do any work”. This is what it looks like in the UI, compared to a successful run:

If we know the check has completed, we can look at how it completed. Anything except a success means something has gone wrong, and we shouldn’t merge the PR – it needs manual inspection.

``````
    if check_run["conclusion"] != "success":
        print(f"*** Check run {name} has not succeeded")
        sys.exit(1)
``````

Here I’m dropping an explicit failure. The difference between a failure and a neutral status is that a failure blocks any further steps in the workflow, whereas a neutral result lets them carry on. Here, something has definitely gone wrong – the tests haven’t passed – so we shouldn’t continue to subsequent steps.
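These statuses are just process exit codes, so the distinction is easy to poke at outside of Actions (the 78-means-neutral convention is specific to this early version of Actions):

```python
import subprocess
import sys

# Run two child processes: one exits with the "neutral" code, one with a failure.
neutral = subprocess.run([sys.executable, "-c", "import sys; sys.exit(78)"])
failure = subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"])

print(neutral.returncode, failure.returncode)  # 78 1
```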

If the script is still running, then we know the tests have passed, so let’s put in the conditions for merging the pull request. For me, that means:

• It’s not a work-in-progress, marked by “[WIP]” in the title
• It was opened by me. I don’t want to automatically merge pull requests from random strangers. (This happened once with Mergify! My rule said “merge anything that passes tests”, somebody opened a typo fix, it passed tests… and got merged while I was asleep.)

The check_run event includes a bit of data about the pull request, including the PR number and the branches. I use this for a bit of logging:

``````
    assert len(check_run["pull_requests"]) == 1
    pull_request = check_run["pull_requests"][0]
    pr_number = pull_request["number"]
    pr_src = pull_request["head"]["ref"]
    pr_dst = pull_request["base"]["ref"]

    print(f"*** Checking pull request #{pr_number}: {pr_src} ~> {pr_dst}")
``````

But for the detailed information like title and pull request author, I need to query the pull requests API. Let’s start by creating an HTTP session for working with the GitHub API:

``````
import requests

def create_session(github_token):
    sess = requests.Session()
    sess.headers.update({
        "Accept": "; ".join([
            "application/vnd.github.v3+json",
            "application/vnd.github.antiope-preview+json",
        ]),
        "Authorization": f"token {github_token}",
        "User-Agent": f"GitHub Actions script in {__file__}"
    })

    def raise_for_status(resp, *args, **kwargs):
        try:
            resp.raise_for_status()
        except Exception:
            print(resp.text)
            sys.exit("Error: Invalid repo, token or network issue!")

    sess.hooks["response"].append(raise_for_status)
    return sess
``````

This helper function creates an HTTP session that, on every request:

• Adds the Accept headers that the GitHub API wants (these seem to change frequently, so may not match the current docs)
• Checks whether the API returned a successful response, and bails out of the script if not. This uses a requests hook, which I’ve written about in a previous post.
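To see what a hook does without touching the network, here’s a small sketch: it builds `Response` objects by hand and runs the same `raise_for_status` check that the session hook applies after every real request. (The hand-built responses are purely illustrative – in the action, requests fills them in for you.)

```python
import requests

def check(resp, *args, **kwargs):
    # The same check our session hook runs on every response
    resp.raise_for_status()

# Build Response objects by hand, so no network access is needed
ok = requests.models.Response()
ok.status_code = 200
check(ok)  # a 2xx response passes silently

bad = requests.models.Response()
bad.status_code = 404
try:
    check(bad)
except requests.HTTPError as err:
    print("hook would bail out:", err.response.status_code)
```

Because the hook runs inside the session, every `sess.get`/`sess.put` in the script gets this check for free, without repeating it at each call site.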

I have to add `pip3 install requests` to the `Dockerfile` so I can use the requests library.

Then I modify the action in my `main.workflow` to expose an API token to my running code:

``````
action "Auto-merge pull requests" {
  uses    = "./auto_merge_pull_requests"
  secrets = ["GITHUB_TOKEN"]
}
``````

This is one of the convenient parts of GitHub Actions – it creates this API token for us at runtime, and passes it into the container. We don’t need to muck around creating and rotating API tokens by hand.

We can read this environment variable to create a session:

``````
    github_token = os.environ["GITHUB_TOKEN"]

    sess = create_session(github_token)
``````

Now let’s read some data from the pull requests API, and run the checks:

``````
    pr_data = sess.get(pull_request["url"]).json()

    pr_title = pr_data["title"]
    print(f"*** Title of PR is {pr_title!r}")
    if pr_title.startswith("[WIP] "):
        print("*** This is a WIP pull request, will not merge")
        sys.exit(78)

    pr_user = pr_data["user"]["login"]
    print(f"*** This PR was opened by {pr_user}")
    if pr_user != "alexwlchan":
        print("*** This pull request was opened by somebody who isn't me")
        sys.exit(78)
``````

If the PR isn’t ready to be merged, I use another neutral status – a failing build and a red X would look more severe than it really is.

If it’s ready and we haven’t bailed out yet, we can merge the pull request!

``````
    print("*** This pull request is ready to be merged.")
    merge_url = pull_request["url"] + "/merge"
    sess.put(merge_url)
``````

Then to keep things tidy, I delete the PR branch when I’m done:

``````
    print("*** Cleaning up pull request branch")
    pr_src = pull_request["head"]["ref"]
    api_base_url = pr_data["base"]["repo"]["url"]
    ref_url = api_base_url + "/git/refs/heads/" + pr_src
    sess.delete(ref_url)
``````

This last step was partially inspired by Jessie Frazelle’s branch cleanup action, which is one of the first actions I used, and was a useful example when writing this code.

## Putting it all together

Here’s the final version of the code:

``````
# .github/main.workflow
workflow "on pull request pass, merge the branch" {
  resolves = ["Auto-merge pull requests"]
  on       = "check_run"
}

action "Auto-merge pull requests" {
  uses    = "./auto_merge_pull_requests"
  secrets = ["GITHUB_TOKEN"]
}
``````
# auto_merge_pull_requests/Dockerfile
FROM python:3-alpine

MAINTAINER Alex Chan <alex@alexwlchan.net>

LABEL "com.github.actions.name"="Auto-merge pull requests"
LABEL "com.github.actions.description"="Merge the pull request after the checks pass"
LABEL "com.github.actions.icon"="activity"
LABEL "com.github.actions.color"="green"

RUN pip3 install requests

COPY merge_pr.py /
ENTRYPOINT ["python3", "/merge_pr.py"]
``````
# auto_merge_pull_requests/merge_pr.py
#!/usr/bin/env python
# -*- encoding: utf-8

import json
import os
import sys

import requests

def create_session(github_token):
    sess = requests.Session()
    sess.headers.update({
        "Accept": "; ".join([
            "application/vnd.github.v3+json",
            "application/vnd.github.antiope-preview+json",
        ]),
        "Authorization": f"token {github_token}",
        "User-Agent": f"GitHub Actions script in {__file__}"
    })

    def raise_for_status(resp, *args, **kwargs):
        try:
            resp.raise_for_status()
        except Exception:
            print(resp.text)
            sys.exit("Error: Invalid repo, token or network issue!")

    sess.hooks["response"].append(raise_for_status)
    return sess

if __name__ == "__main__":
    event_path = os.environ["GITHUB_EVENT_PATH"]

    with open(event_path) as f:
        event_data = json.load(f)

    check_run = event_data["check_run"]
    name = check_run["name"]

    if check_run["status"] != "completed":
        print(f"*** Check run {name} has not completed")
        sys.exit(78)

    if check_run["conclusion"] != "success":
        print(f"*** Check run {name} has not succeeded")
        sys.exit(1)

    assert len(check_run["pull_requests"]) == 1
    pull_request = check_run["pull_requests"][0]
    pr_number = pull_request["number"]
    pr_src = pull_request["head"]["ref"]
    pr_dst = pull_request["base"]["ref"]

    print(f"*** Checking pull request #{pr_number}: {pr_src} ~> {pr_dst}")

    github_token = os.environ["GITHUB_TOKEN"]

    sess = create_session(github_token)

    pr_data = sess.get(pull_request["url"]).json()

    pr_title = pr_data["title"]
    print(f"*** Title of PR is {pr_title!r}")
    if pr_title.startswith("[WIP] "):
        print("*** This is a WIP PR, will not merge")
        sys.exit(78)

    pr_user = pr_data["user"]["login"]
    print(f"*** This pull request was opened by {pr_user}")
    if pr_user != "alexwlchan":
        print("*** This pull request was opened by somebody who isn't me")
        sys.exit(78)

    print("*** This pull request is ready to be merged.")
    merge_url = pull_request["url"] + "/merge"
    sess.put(merge_url)

    print("*** Cleaning up pull request branch")
    api_base_url = pr_data["base"]["repo"]["url"]
    ref_url = api_base_url + "/git/refs/heads/" + pr_src
    sess.delete(ref_url)
``````

I keep this in a separate repo (which doesn’t have auto-merging enabled), so nobody can maliciously modify the workflow rules and get their own code merged. I’m not entirely sure what safety checks are in place to prevent workflows modifying themselves, and having an extra layer of separation makes me feel more comfortable.

## Putting it to use

If you want to use this code, you’ll need to modify the code for your own rules. Please don’t give me magic merge rights to your GitHub repos!

With this basic skeleton, there are lots of ways you could extend it. You could post comments on failing pull requests explaining how to diagnose a failure. You could request reviews if you get a pull request from an external contributor, and post a comment thanking them for their work. You could measure how long it took to run the check, to see if it’s slowed down your build times. And so on.
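As an illustration of the first idea, posting a comment on a failing pull request only needs one extra API call. This is a hypothetical sketch, not part of the workflow above – it assumes a session from `create_session`, and uses the fact that pull request comments live on the GitHub issues API:

```python
def post_failure_comment(sess, pr_data, check_name):
    # Hypothetical helper: pull request comments are posted via the
    # issues API (POST /repos/{owner}/{repo}/issues/{number}/comments)
    repo_url = pr_data["base"]["repo"]["url"]
    comments_url = f"{repo_url}/issues/{pr_data['number']}/comments"
    body = (
        f"The check run {check_name!r} failed. "
        "Have a look at the logs before merging by hand."
    )
    sess.post(comments_url, json={"body": body})
```

You’d call it just after the `sys.exit(1)` branch, so the explanation appears on the PR rather than being buried in the workflow logs.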

GitHub Actions feels like it could be really flexible and powerful, and I’m glad to have created something useful with it. I’ve had this code running in the repo for this blog for nearly a month, and it’s working fine – saving me a bit of work every time. It’ll even merge the pull request where I’ve written this blog post.

# A day out to the Forth Bridge

While clearing out some boxes recently, I found a leaflet from an old holiday. It was a fun day out, and I’d always meant to share the pictures – so here’s the story of a day trip from 2016.

I was spending a week in Edinburgh, relaxing and using up some holiday allowance at the end of my last job. My grandparents had suggested I might enjoy seeing the Forth Bridge, because I tend to like railways and railway-related things. I’d heard of the bridge, but I didn’t know that much about it – so while I was nearby, I decided to go take a look.

So on a cold December morning, I caught a train from Edinburgh station, up to a village on the north end of the Forth Bridge. I’d never heard of North Queensferry, but what little Googling I’d done suggested it was the best place to go if I wanted to see the bridge up close.

Here’s a map that shows the train line from Edinburgh to the village:

The train takes about 20 minutes, and it crosses the Forth Bridge just at the end of the journey. I wasn’t really aware of the bridge as we went across – not until I got out at the station, wandered into the village, and looked back towards the track.

The name of North Queensferry hints at its former life. It’s on the north side of the narrowest point of the Firth of Forth, which makes it a natural choice if you want to cross the water by boat.

It’s said that in 1068, Saint Margaret of Scotland (wife of King Malcolm III) created the village to ensure a safe crossing point for pilgrims heading to St Andrews. Whether or not she actually created the village, she was a regular user of the ferry service to travel between Dunfermline (the then-capital of Scotland) and Edinburgh Castle.

For centuries, there were regular crossings of boats and ferries. You can still see a handful of small boats in the harbour, but I’m sure it used to be a lot busier.

Update, 21 March 2019: It turns out the lighthouse isn’t the only reminder of the ferry service! Chris, an ex-Wellcome colleague and archivist extraordinaire, found some pictures from the Britten-Pears foundation (whose archive and library he runs), including a ticket from the ferry service:


Another lost transport world in @BrittenOfficial’s papers: the ferry over the Forth at Queensferry, which operated until the Forth Road Bridge opened in 1964. Here are 3 passenger tickets plus what I think is a counterfoil from a ticket for a car, from 1950. pic.twitter.com/mpSWZYeEiu

Although most of the boats are gone, one part of the ferry service survives – the lighthouse! This tiny hexagonal tower is the smallest working light tower in the world. It was built in 1817 by Robert Stevenson, a Scottish civil engineer who was famous for building lighthouses. (It was a name I recognised; I loved the story of the Bell Rock Lighthouse when I was younger.)

The light tower sits on the pier, where the ferries used to dock.

Unlike many lighthouses of the time, the keeper didn’t live in the lighthouse itself – but they were still responsible for keeping the flame lit, the oil topped up, and the lighthouse maintained. At night, it would have been an invaluable guide for boats crossing the Firth.

Today, the lighthouse is open to the public. (I think this is where I picked up my leaflet.) You can climb the 24 steps, see the lamp mechanism, and look out over the water. When lit, it gave a fixed white light from a paraffin-burning lamp – and the large half-dome was the parabolic reflector that turned the lamp light into a focused beam.

I wish I’d got a few more photos of the inside of the lighthouse, but it was a pretty small space, and I was struggling to find decent angles. Either way, the lighthouse was an unexpected treat – not something I was expecting at all!

But these days, North Queensferry isn’t known for its ferry service – it’s known for the famous bridge.

In the 1850s, the Edinburgh, Leith and Granton Railway ran a “train ferry” – a boat that carried railway carriages between Granton and Burntisland. There was a desire to build a continuous railway service, and the natural next step was a bridge. After a failed attempt to build a suspension bridge in the 1870s (axed after the Tay Bridge disaster), there was a second attempt in the 1880s. It was opened in March 1890, and it’s still standing today.

Here’s a photo of its original construction, taken from the North Queensferry hills:

The Forth Bridge is a cantilever bridge. Each structure in the photo above is one of the cantilevers – a support structure fixed at one end – and in the finished bridge the load is spread between them. Spreading the load between multiple cantilevers allows you to build longer bridges, and the Forth Bridge uses this to great effect.

One of the advantages of cantilever bridges is that they don’t require any temporary support while they’re being built – once the initial structures are built, you can expand outwards and they’ll take the weight. Here’s another photo from the construction which shows off this idea:

You can see the shape of the bridge starting to expand out from the initial structure.

The Forth Bridge is famous for a couple of reasons. When it was built, it was the longest cantilever span in the world (not bested for another twenty-nine years, and still the second longest). It was also one of the first major structures in Britain to use steel – 55,000 tonnes in the girders, along with a mixture of granite and other materials in the masonry.

You get a great view of the finished bridge from inside the lighthouse:

As I wandered around the village, I got lots of other pictures of the rail bridge. These are a few of my favourites:

The last one was my favourite photo of the entire holiday, and I have a print of it on the wall of my flat. Blue skies galore, which made for a lovely day and some wonderful pictures – even if it’s not what you expect from a Scottish winter!

What’s great about wandering around the village is that you can see the bridge from all sorts of angles, not just from afar – you can get up close and personal. You can see the approach viaduct towering over the houses as it approaches the village:

And you can get even closer, and walk right underneath the bridge itself. Here’s what part of the viaduct holding up the bridge looks like:

They’re enormous – judging by the stairs, it’s quite a climb up!

And you can look up through the girders, and see the thousands of beams that hold the bridge together:

It’s been standing for over a century, so I’m sure it’s quite safe – but it was still a bit disconcerting to hear a rattle as trains passed overhead!

I spent quite a while just wandering around under the bridge, looking up in awe at the structure. As you wander around the village, you never really get away from it – it always stands tall in the background. (Well, except when I popped into a café for some soup and a scone.)

After the rail bridge was built, the ferry crossings continued for many years – in part buoyed by the rise of personal cars, which couldn’t use the railway tracks. But it didn’t last – in 1964, a second bridge was built, the Forth Road Bridge – and it replaced the ferries. The day the bridge opened, the ferry service closed after eight centuries of continuous crossings.

When I visited in 2016, the Road Bridge was still open to cars. At the time, there was a second bridge under construction, but not yet open to the public – the Queensferry Crossing opened about half a year after my visit. The original Road Bridge is now a public transport link (buses, cyclists, taxis and pedestrians), and the new bridge carries everything else.

Three bridges in a single day! The other bridges were on the other side of the bay, so I had to walk along the water’s edge to see them. Here’s a photo from midway along, with old and new both visible:

These are both suspension bridges, whereas the rail bridge is a cantilever bridge.

Like the rail bridge, you can get up and close with the base of the road bridge. Here’s an attempt at an “artsy” shot of the bridge receding into the distance, with sun poking through the base:

And another “artsy” shot with more lens flare, and both the road bridges in the shot. I love the detail of the underside on the nearer bridge.

Here’s the start of the bridge on the north side, starting to rise up over the houses:

And one more close-up shot of one of the supports:

Eventually it started getting dark, so I decided to head home. I considered walking back through North Queensferry to the station, but I decided to have a crack at crossing the road bridge instead, and catching the train from the other side. You can walk across it, although it’s nearly 2.5km long!

As I climbed up to the bridge, I got some wonderful views back over the village, and in particular towards the rail bridge I’d originally come to see:

I didn’t take many photos from the bridge itself, although it’s a stunning view! It was extremely cold and windy, and I didn’t want to risk losing my camera while trying to take a photo. Here’s one of the few photos I did take, which I rather like. I took it near the midpoint, with the rail bridge set against a cloudy sky. (I’d forgotten about it until I came to write this post!)

Safely across the bridge, I weaved my way through South Queensferry, found the station, and caught a train back to Edinburgh.

I didn’t plan this trip when I decided to visit Scotland, but over two years later, I still have fond memories of the day out. If you’re ever nearby and you like looking at impressive structures, it’s worth a trip. I’m glad I have pictures, but it’s hard to capture the sheer size and scale of a bridge this large – so if you have a chance, do visit in person.

# Finding the latest screenshot in macOS Mojave

One of the things that changed in macOS Mojave was the format of screenshot filenames. On older versions of macOS, the filename would be something like:

``````Screen Shot 2016-10-10 at 18.34.18.png
``````

On Mojave, the first two words got collapsed into one:

``````Screenshot 2019-03-08 at 18.38.41.png
``````

I have a handful of scripts for doing something with screenshots – and in particular, a shortcut that grabs the newest screenshot. When I started updating to Mojave, I had to update the shell snippet that powers that shortcut. Because I couldn’t update to Mojave on every machine immediately, it had to work with both naming schemes.

This is what I’ve been using for the last few months (bound to `last_screenshot` in my shell config):

``````
find ~/Desktop -name 'Screen Shot*' -print0 -o -name 'Screenshot*' -print0 \
  | xargs -0 stat -f '%m %N' \
  | sort --numeric-sort --reverse \
  | head -1 \
  | cut -f "2-" -d " "
``````

Let’s break it down:

• The `find` command looks for files in ~/Desktop (where my screenshots get saved) that match the filename `Screen Shot*` or `Screenshot*`.

It prints the name of every matching file, separated by the null character (as set by the `-print0` flag). Because most shell languages don’t have a proper list type, just strings, the null character is a way to shoehorn a list into a string. It’s less likely to appear in one of your list elements than, say, a newline or a space.

• `xargs -0` unpacks the null character-separated string, and passes each filename to the `stat` command. This prints information about the file: `%m` is the last modified time, and `%N` is the filename.

You get a list a bit like this:

``````1551602947 /Users/alexwlchan/Desktop/Screenshot 2019-03-03 at 08.49.01.png
1552070046 /Users/alexwlchan/Desktop/Screenshot 2019-03-08 at 18.34.00.png
1552260786 /Users/alexwlchan/Desktop/Screenshot 2019-03-10 at 23.33.01.png
1552070259 /Users/alexwlchan/Desktop/Screenshot 2019-03-08 at 18.37.33.png
1552070326 /Users/alexwlchan/Desktop/Screenshot 2019-03-08 at 18.38.41.png
1552070066 /Users/alexwlchan/Desktop/Screenshot 2019-03-08 at 18.34.19.png
``````
• That entire string in turn gets passed to `sort`, which treats the strings as numeric (so `3` sorts below `10`, for example), then reverses the order. This puts the biggest number – the newest file modification date – at the top.

• The sorted list is passed to `head`, which extracts the top line (the newest file).

• Finally, `cut` separates the string on spaces (`-d " "`), then prints the second element and everything after it – throwing away the timestamp, and leaving the filename.

It’s certainly possible to do this with a higher-level language like Python or Ruby, but I like the elegance of chaining together tiny utilities like this. For non-critical code, I enjoy the brevity.
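For comparison, here’s roughly the same logic as a Python sketch – it assumes the screenshots live on the Desktop and match the same two filename prefixes:

```python
from pathlib import Path

def latest_screenshot(desktop=Path.home() / "Desktop"):
    # Match both the old and new filename schemes, then take the file
    # with the newest modification time; the same job as the pipeline
    candidates = [
        p for p in desktop.iterdir()
        if p.name.startswith(("Screen Shot", "Screenshot"))
    ]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

It’s more lines than the shell version, but it avoids the null-character shoehorning entirely – Python has real lists, so there’s nothing to unpack.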