Swapping gems for tiles
On Sunday evening, I quietly swapped out a key tool that I use to write this site. It’s a big deal for me, but hopefully nobody else noticed.
The tool I changed was my static site generator. I write blog posts in text files using Markdown, and then my static site generator converts those text files into HTML pages. I upload those HTML pages to my web server, and they become available as my website.
I’ve been using a Ruby-based static site generator called Jekyll since late 2017, and I’ve replaced it with a Python-based static site generator called Mosaic. It’s a new tool I wrote specifically to build this website, so I know exactly how it works. I’m getting rid of a Ruby tool I only half-understand, in favour of a Python tool I understand well.
Nothing is changing for readers (yet). I tried hard to avoid breaking anything – URLs haven’t changed, pictures look identical, the RSS feed should be the same as before. Please let me know if you spot something broken!
You’ll see more changes soon, because I have lots of ideas to try this year. I want to make this website into more of a “digital garden”, getting even further away from a single list of chronologically ordered posts. I don’t want to build that with Jekyll – or to be precise, I don’t want to build it with Ruby.
It’s not Ru(by), it’s me
I don’t want to sound dismissive of Jekyll. It’s an impressive project that powers thousands of sites, and I used it happily for over eight years. I pushed it to build a lot of custom and bespoke pages, and it handled them with ease.
Jekyll’s superpower is its theming and plugin system, which allows you to customise its behaviour. Want something that Jekyll can’t do out of the box? Create your own template or plugin. But those plugins have to be written in Ruby, the same language as Jekyll itself – and I only write Ruby to make blog plugins. I can do it, but I’m slow, I’m unsure, and writing Ruby has never felt familiar.
You can build a digital garden with Jekyll and Ruby – plenty of people already have – but I know I’d find it a difficult and frustrating experience. My lack of Ruby experience would slow me down.
While my Ruby knowledge has sat still, I’ve become a much better Python programmer. Since I set up Jekyll in 2017, I’ve worked on big Python projects with extensive tests, thorough data validation, and an explicit goal of longevity. I tried writing a Python static site generator in 2016 and I got stuck; a decade later and I’m ready for another attempt.
This isn’t just general Python expertise – I’ve written about how I’m using static websites for tiny archives, and all the surrounding tools are written in Python. Porting this website to Python means I can reuse a lot of that code.
I hacked together an experimental Python static site generator over Christmas, and I wrote it properly over the last few weeks. I named it “Mosaic” after the square-filled headers on every page, and I really like it. I already feel faster when I’m working on the site, writing a language I know properly.
How does Mosaic work?
Mosaic works like other static site generators: it reads a folder full of Markdown files, converts them to HTML, and writes the HTML into a new folder. And just like Jekyll and similar tools, I’m building on powerful open-source libraries.
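To make that read-convert-write loop concrete, here’s a minimal sketch using only the Python standard library. It isn’t Mosaic’s real code: the function names are made up for illustration, and `render_markdown` is a placeholder that only wraps paragraphs in <p> tags where a real generator would call a Markdown library.

```python
import html
from pathlib import Path


def render_markdown(text: str) -> str:
    """Placeholder converter (hypothetical): wrap each blank-line-separated
    block in <p> tags. A real generator would call a Markdown library here."""
    blocks = [block.strip() for block in text.split("\n\n") if block.strip()]
    return "\n".join(f"<p>{html.escape(block)}</p>" for block in blocks)


def build_site(src_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every .md file under src_dir into an .html file under out_dir."""
    written = []
    for md_path in sorted(src_dir.rglob("*.md")):
        # Mirror the source folder layout in the output folder.
        out_path = out_dir / md_path.relative_to(src_dir).with_suffix(".html")
        out_path.parent.mkdir(parents=True, exist_ok=True)
        source = md_path.read_text(encoding="utf-8")
        out_path.write_text(render_markdown(source), encoding="utf-8")
        written.append(out_path)
    return written
```

Calling `build_site(Path("src"), Path("_out"))` would turn a hypothetical `src/2026/hello.md` into `_out/2026/hello.html`.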
Here’s a comparison of the key dependencies:
| Purpose | Jekyll | Mosaic |
|---|---|---|
| Templates | Liquid | Jinja |
| Markdown rendering | kramdown | Mistune |
| Image generation | ruby-vips | Pillow |
| Syntax highlighting | Rouge | Pygments |
| Data validation | json-schema | Pydantic |
| HTML linting | HTMLProofer | ??? |
Here are some thoughts on each.
Templates with Jinja
Jinja is the templating engine used by Flask, a framework I’ve used to build dozens of small web apps, so I was very familiar with the basic syntax. It’s similar to Liquid – both use {% … %} for operators and {{ … }} to insert values – so I could reuse my templates with only small changes.
The tricky part was replicating my custom tags, which I’d previously implemented using Jekyll plugins. I had to write my own Jinja extensions, which are harder to write than Jekyll tags. In Jinja, I have to interact directly with the lexer and parser, whereas a Jekyll plugin is a simple render function.
Markdown with Mistune
Mistune is a Markdown library I discovered while working on this project. I used Python-Markdown previously, but Mistune is faster and easier to extend. In particular, it provides a friendly way to customise the HTML output by overriding named methods. For example, I can add an id attribute to my headings by overriding the heading(text, level) method.
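As an illustration of what a heading override like that might produce, here’s a standalone sketch of the HTML it could return. It deliberately doesn’t touch Mistune: `slugify` and `heading_html` are hypothetical helpers, the slug rules are my assumption rather than anything Mistune or kramdown defines, and the function takes plain text rather than already-rendered inline HTML.

```python
import html
import re
import unicodedata


def slugify(text: str) -> str:
    """Turn heading text into a lowercase, hyphen-separated id."""
    # Strip accents, then drop anything that isn't a letter, digit,
    # whitespace or hyphen.
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode()
    cleaned = re.sub(r"[^a-z0-9\s-]", "", ascii_text.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[\s-]+", "-", cleaned).strip("-")


def heading_html(text: str, level: int) -> str:
    """Render a heading element with an id derived from its plain text."""
    return f'<h{level} id="{slugify(text)}">{html.escape(text)}</h{level}>'
```

With these rules, `heading_html("Templates with Jinja", 2)` returns `<h2 id="templates-with-jinja">Templates with Jinja</h2>`. A renderer override would return a string like this instead of the default heading markup.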
The tricky part about changing Markdown renderers is all the subtle differences in the places where Markdown isn’t defined clearly. Mistune and kramdown return the same output in 95% of cases, but there’s a lot of variation and broken HTML in the remaining 5%.
One particular difficulty was all my inline HTML. This is one of my favourite Markdown features – you can include arbitrary HTML and it gets passed through as-is – and I make heavy use of it in this blog. But kramdown and Mistune disagree about where inline HTML starts and ends, and Mistune was wrapping <p> tags around HTML that kramdown left unchanged. I had to adjust my templates and whitespace to help Mistune distinguish Markdown and HTML.
Image generation with Pillow
I generate multiple sizes and formats for every image, so they get served in a fast and efficient way. I use Pillow to generate each of those derivatives.
Pillow is easier to install and supports a wider range of image formats than any of the Ruby gems I tried; it’s a highlight of the Python ecosystem.
The picture handling code has always been the thorniest bit of the website, and I hope that building it atop a nicer library will give me the space to simplify that code.
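One small piece of that picture code is deciding which sizes to generate. Here’s a sketch of that step under my own assumptions: the target widths are made-up numbers, the function name is hypothetical, and it only plans the dimensions. The actual resizing and format conversion would be done by the image library.

```python
def derivative_sizes(width: int, height: int, target_widths: list[int]) -> list[tuple[int, int]]:
    """Return (width, height) pairs for each target width, keeping aspect ratio.

    Targets at or above the original width are skipped, so images are
    never upscaled.
    """
    sizes = []
    for target in sorted(set(target_widths)):
        if target >= width:
            continue
        sizes.append((target, round(height * target / width)))
    # Always include the original size as the largest derivative.
    sizes.append((width, height))
    return sizes
```

For a 1600×1200 original and target widths of 400, 800 and 2400, this returns `[(400, 300), (800, 600), (1600, 1200)]`: the 2400-wide target is dropped rather than upscaled.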
Syntax highlighting with Pygments
Rouge and Pygments are both capable libraries, and they return compatible HTML, which made it easy to switch – I could reuse my CSS and my syntax highlighting tweaks.
I think Pygments theoretically supports highlighting a wider variety of languages, but I never found Rouge lacking so it’s not a meaningful improvement.
Data validation with Pydantic
Every Markdown file in my site has YAML “front matter” for storing metadata, for example:
---
layout: post
title: Swapping gems for tiles
---
Jekyll treats this as arbitrary data and doesn’t do any validation on it, which made it harder to change and keep consistent as the site evolved. I built a rudimentary validation layer using json-schema, but it was always an add-on.
In Mosaic, this front matter is parsed straight into a Pydantic model, so it’s type-checked throughout my code. This means I can write stricter validation checks, and catch more issues and inconsistencies before they break the website.
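To show the general idea of parsing front matter into a typed, validated object, here’s a sketch that swaps in a standard-library dataclass for Pydantic so it runs without any dependencies. It isn’t Mosaic’s model: the `FrontMatter` class, the set of allowed layouts, and the helper names are all assumptions, and the parser only understands flat `key: value` lines, not full YAML.

```python
from dataclasses import dataclass

ALLOWED_LAYOUTS = {"post", "page"}  # hypothetical set of layouts


@dataclass(frozen=True)
class FrontMatter:
    layout: str
    title: str

    def __post_init__(self):
        if self.layout not in ALLOWED_LAYOUTS:
            raise ValueError(f"unknown layout: {self.layout!r}")
        if not self.title.strip():
            raise ValueError("title must not be empty")


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a document into its front matter fields and its Markdown body.

    Only flat `key: value` lines are handled; real front matter is YAML
    and would need a proper YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    try:
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("missing closing '---'") from None
    fields = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    return fields, "\n".join(lines[end + 1:])


def parse_post(text: str) -> tuple[FrontMatter, str]:
    fields, body = split_front_matter(text)
    # Unknown or missing keys raise TypeError from the dataclass constructor.
    return FrontMatter(**fields), body
```

The point is that a typo like `layout: psot` or a stray, unexpected key fails at build time instead of silently flowing through to a template.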
Linting HTML with HTMLProofer
I’ve been using the HTMLProofer gem to check my HTML since 2019. It checks my HTML for errors like broken links or missing images, so I’m less likely to publish a broken page. It’s caught so many mistakes.
There’s no obvious Python equivalent, so for now I’m still running it as a separate step after I generate my HTML. It has a much lower overhead than running Jekyll so I’m not in a hurry to remove it – although eventually I’d like to reimplement the checks I care about with BeautifulSoup, so I can fully expunge Ruby.
I’m also considering using Playwright for some static site testing, but that’s a larger piece of work.
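For a sense of what reimplementing the broken-link and missing-image checks could involve, here’s a rough sketch. It uses the standard library’s `html.parser` in place of BeautifulSoup so it has no dependencies, it only covers local links and images, and the class and function names are hypothetical. It isn’t how HTMLProofer works internally.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the targets of <a href> and <img src> attributes."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.targets.append(value)


def find_broken_targets(html_path: Path, site_root: Path) -> list[str]:
    """Return local links and images in html_path that don't exist on disk."""
    collector = LinkCollector()
    collector.feed(html_path.read_text(encoding="utf-8"))
    broken = []
    for target in collector.targets:
        parts = urlsplit(target)
        # Skip external URLs (http:, mailto:, ...) and same-page fragments.
        if parts.scheme or parts.netloc or not parts.path:
            continue
        if parts.path.startswith("/"):
            candidate = site_root / parts.path.lstrip("/")
        else:
            candidate = html_path.parent / parts.path
        # A link to a folder is satisfied by its index.html.
        if candidate.is_dir():
            candidate = candidate / "index.html"
        if not candidate.exists():
            broken.append(target)
    return broken
```

Running `find_broken_targets` over every generated page, and failing the build if any list is non-empty, would cover the two checks named above. External links would need a separate, slower check.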
It’s not named after a museum in Georgia
The name isn’t so important, because I’m the only person who will ever use this tool – but I discovered a fun nugget that’s too juicy not to share.
I named my tool “Mosaic” after the tiled headers that appear at the top of every page. Those headers are a design element I added in 2016, and I’m so fond of them now I can’t imagine getting rid of them. I later remembered that Mosaic is also the name of a discontinued web browser, and I like the “old web” vibes of that name. One of the best compliments I’ve ever received about this site was “it looks like something from the 1990s” – fast, clean, and not junked up with ads.
One of the bizarre things I discovered while writing this post is that it’s not the first time the names “Mosaic” and “Jekyll” have appeared alongside each other.
There’s a small historical island off the coast of Georgia (the USA one) called Jekyll Island. It includes bike trails, golf courses, a beach that’s been in several films… and a history museum called Mosaic. What are the chances?
I know nothing about Jekyll Island or the history of Georgia, but if I ever feel safe enough to return to the US, I’d love to visit.
Growing the garden
I’ve been using Mosaic for several weeks and I’m really enjoying it. I wouldn’t recommend using it for anything else – it’s only designed to build this exact site – but all the source code is public, if you’d like to read it and understand how it works.
Switching to Mosaic has allowed me to start working on three improvements to the site:
Replace my “today I learned” (TIL) posts with “notes”. I really like how the TIL section has allowed me to write more frequent, smaller posts, but they’re still point-in-time snapshots. I want to replace them with notes that aren’t tied to a particular date, and instead can be living documents I update as I learn more.
Make the list of topics more useful. My current tags page is a wall of text, a list of 241 keywords with minimal context or explanation. Nobody is wading through that to find something interesting – I want to add some hierarchy to make it easier to read, and give a better overview of the site.
Fold my book reviews into my main site. My book reviews currently live on a separate site, which is only half-maintained. I’d like to merge them into the main site, let them benefit from the design improvements here, and start writing reviews of other entertainment.
I’ve had these ideas for months, and I’m excited to finally ship them, and bring this site closer to my idea of a “digital garden”.