Tagged with “aws”


Downloading logs from Amazon CloudWatch

At work, we use Amazon CloudWatch for logging in our applications. All our logs are sent to CloudWatch, and you can browse them in the AWS Console. The web console is fine for one-off use, but if I want to do in-depth analysis of the logs, nothing beats a massive log file. I’m very used to tools like grep, awk and tr, and I’m more productive using those than trying to wrangle a web interface.

So I set out to write a Python script to download all of my CloudWatch logs into a single file. The AWS SDKs give you access to CloudWatch logs, so this seems like it should be possible. There are other tools for doing this (for example, I found awslogs after I was done), but sometimes it can be instructive to reinvent something from scratch.

In this post, I’ll explain how I wrote this script, starting from nothing and showing how I built it up. It’s also a nice chance to illustrate several libraries I use a lot (boto3, docopt and maya). If you just want the code, skip to the end of the post.
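
To give a flavour of the core idea, here is a minimal sketch. It is not the script from the full post. It assumes you pass in a boto3 CloudWatch Logs client, and the function names `iter_log_events` and `write_log_messages` are made up for this illustration. The only AWS call it relies on is `filter_log_events`, which returns events a page at a time and includes a `nextToken` while more pages remain.

```python
def iter_log_events(client, log_group):
    """Yield every event in a CloudWatch log group, following pagination.

    `client` is assumed to be a boto3 CloudWatch Logs client, or anything
    else that offers a compatible filter_log_events() method.
    """
    kwargs = {"logGroupName": log_group}
    while True:
        response = client.filter_log_events(**kwargs)
        yield from response.get("events", [])
        # A nextToken is only present while there are more pages to fetch.
        token = response.get("nextToken")
        if token is None:
            break
        kwargs["nextToken"] = token


def write_log_messages(client, log_group, out_file):
    """Write one log message per line to an open text file.

    Returns the number of messages written.
    """
    count = 0
    for event in iter_log_events(client, log_group):
        out_file.write(event["message"].rstrip("\n") + "\n")
        count += 1
    return count
```

With real credentials, you would create the client with `boto3.client("logs")` and pass it in along with a log group name and an open file. Taking the client as a parameter is a choice made for this sketch, so the pagination logic can be tried out without touching AWS.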

Read more →


Listing keys in an S3 bucket with Python

A lot of my recent work has involved batch processing on files stored in Amazon S3. It’s been very useful to have a list of files (or rather, keys) in the S3 bucket – for example, to get an idea of how many files there are to process, or whether they follow a particular naming scheme.

The AWS APIs (via boto3) do provide a way to get this information, but API calls are paginated, and the key names are buried inside each response rather than returned as a simple list. It’s a bit fiddly, and I don’t generally care about the details of the AWS APIs when using this list – so I wrote a wrapper function to do it for me. All the messiness of dealing with the S3 API is hidden in general use.

Since this function has been useful in lots of places, I thought it would be worth writing it up properly.
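
As a rough illustration of what such a wrapper can look like, here is a minimal sketch. It is not necessarily the function from the full post. It assumes a boto3 S3 client is passed in, and the name `iter_s3_keys` and the `prefix` parameter are invented for this example. It uses `list_objects_v2`, which returns results in pages and sets `IsTruncated` and `NextContinuationToken` when there is more to fetch.

```python
def iter_s3_keys(client, bucket, prefix=""):
    """Yield the key of every object in an S3 bucket, one page at a time.

    `client` is assumed to be a boto3 S3 client, or anything else that
    offers a compatible list_objects_v2() method.
    """
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        # "Contents" is missing entirely when a page has no objects.
        for obj in response.get("Contents", []):
            yield obj["Key"]
        # Stop once S3 reports that there are no more pages.
        if not response.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

With real credentials, something like `for key in iter_s3_keys(boto3.client("s3"), "my-bucket"): print(key)` would print every key, where `my-bucket` is a placeholder name. Writing it as a generator means callers can start processing keys before the whole listing has been fetched.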

Read more →