Detecting AV1-encoded videos with Python

In my previous post, I wrote about how I’ve saved some AV1-encoded videos that I can’t play on my iPhone. Eventually, I’ll upgrade to a new iPhone which supports AV1, but in the meantime, I want to convert all of those videos to an older codec. The problem is finding all the affected videos – I don’t want to wait until I want to watch a video before discovering it won’t play.

I already use pytest to run some checks on my media library: are all the files in the right place, is the metadata in the correct format, do I have any misspelt tags, and so on. I wanted to write a new test that would check for AV1-encoded videos, so I could find and convert them in bulk.

In this post, I’ll show you two ways to check if a video is encoded using AV1, and a test I wrote to find any such videos inside a given folder.

Getting the video codec with ffprobe

In my last post, I wrote an ffprobe command that prints some information about a video, including the codec. (ffprobe is a companion tool to the popular video converter FFmpeg.)

$ ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,profile,level,bits_per_raw_sample \
    -of default=noprint_wrappers=1 "input.mp4"
codec_name=av1
profile=Main
level=8
bits_per_raw_sample=N/A

I can tweak this command to print just the codec name:

$ ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name \
    -of csv=print_section=0 "input.mp4"
av1

To run this command from Python, I call the check_output function from the subprocess module. This raises an exception if the command exits with a non-zero status; otherwise it returns the command’s output, as a string because I pass text=True. I can then check if the stripped output is the string av1:

import subprocess


def is_av1_video(path: str) -> bool:
    """
    Returns True if a video is encoded with AV1, False otherwise.
    """
    output = subprocess.check_output([
        "ffprobe",
        #
        # Set the logging level
        "-loglevel", "error",
        #
        # Select the first video stream
        "-select_streams", "v:0",
        #
        # Print the codec_name (e.g. av1)
        "-show_entries", "stream=codec_name",
        #
        # Print just the value
        "-output_format", "csv=print_section=0",
        #
        # Name of the video to check
        path
    ], text=True)

    return output.strip() == "av1"

Most of this function is defining the ffprobe command, which takes quite a few flags. Whenever I embed a shell command in another program, I always replace any flags/arguments with the long versions, and explain their purpose in a comment – for example, I’ve replaced -of with -output_format. Short flags are convenient when I’m typing something by hand, but long flags are more readable when I return to this code later.
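One thing to be aware of with check_output is that a failed ffprobe run raises subprocess.CalledProcessError rather than returning False. Here is a minimal sketch of a more forgiving variant, assuming ffprobe exits with a non-zero status for files it can't read. The names build_ffprobe_command, parse_codec_name and get_video_codec are hypothetical, chosen for illustration; the flags are the same ones used above.

```python
import subprocess
from typing import List, Optional


def build_ffprobe_command(path: str) -> List[str]:
    """Build the ffprobe command that prints the first video stream's codec."""
    return [
        "ffprobe",
        "-loglevel", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=codec_name",
        "-output_format", "csv=print_section=0",
        path,
    ]


def parse_codec_name(output: str) -> Optional[str]:
    """Return the first non-empty line of ffprobe's output, or None if it's empty."""
    lines = output.strip().splitlines()
    return lines[0].strip() if lines else None


def get_video_codec(path: str) -> Optional[str]:
    """
    Returns the codec name of the first video stream, or None if ffprobe
    reports an error or prints nothing (e.g. there is no video stream).
    """
    try:
        output = subprocess.check_output(build_ffprobe_command(path), text=True)
    except subprocess.CalledProcessError:
        # Assumption: a non-zero exit means ffprobe couldn't read the file
        return None

    return parse_codec_name(output)
```

With this, the AV1 check becomes get_video_codec(path) == "av1". Splitting the command and the parsing into separate functions also means each piece can be tested without a real video file.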

This function works, but the ffprobe command is quite long, and it requires spawning a new process for each video I want to check. Is there a faster way?

Getting the video codec with MediaInfo

While working at the Flickr Foundation, I discovered MediaInfo, another tool for analysing video files. It’s used in Data Lifeboat to get the dimensions and duration of videos.

You can run MediaInfo as a command-line program to get the video codec:

$ mediainfo --Inform="Video;%Format%" "input.mp4"
AV1

This is a simpler command than ffprobe, but I’d still be spawning a new process if I called this from subprocess.

Fortunately, MediaInfo is also available as a library, and it has a Python wrapper. You can install the wrapper with pip install pymediainfo, then we can use the functionality of MediaInfo inside our Python process:

>>> from pymediainfo import MediaInfo
>>> media_info = MediaInfo.parse("input.mp4")
>>> media_info.video_tracks[0].codec_id
'av01'

This code could throw an IndexError if there’s no video track – if it’s a .mp4 file which only has audio data – but that’s pretty unusual, and not something I’ve found in any of my videos.

I can write a new wrapper function:

from pymediainfo import MediaInfo


def is_av1_video(path: str) -> bool:
    """
    Returns True if a video is encoded with AV1, False otherwise.
    """
    media_info = MediaInfo.parse(path)

    return media_info.video_tracks[0].codec_id == "av01"

This is shorter than the ffprobe code, and faster too – testing locally, this is about 3.5× faster than spawning an ffprobe process per file.
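If audio-only files are a possibility in your library, the IndexError mentioned above can be avoided by checking whether there are any video tracks first. This is a rough sketch rather than something from my test suite: first_video_codec_id is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the video_tracks and codec_id attributes of a parsed MediaInfo result.

```python
from types import SimpleNamespace
from typing import Optional


def first_video_codec_id(media_info) -> Optional[str]:
    """
    Returns the codec ID of the first video track, or None if the
    file has no video tracks at all.
    """
    tracks = media_info.video_tracks

    if not tracks:
        return None

    return tracks[0].codec_id


# Stand-ins for the result of MediaInfo.parse(), so the helper can be
# tried out without any real video files
audio_only = SimpleNamespace(video_tracks=[])
av1_file = SimpleNamespace(video_tracks=[SimpleNamespace(codec_id="av01")])

print(first_video_codec_id(audio_only))  # None
print(first_video_codec_id(av1_file))    # av01
```

In real use you'd pass it the result of MediaInfo.parse(path) and compare the return value to "av01", so a file with no video track is simply treated as "not AV1".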

Writing a test to find videos with the AV1 codec

Now we have a function that tells us if a given video uses AV1, we want a test that checks if there are any matching files. This is what I wrote:

import glob


def test_no_videos_are_av1():
    """
    No videos are encoded in AV1 (which doesn't play on my iPhone).

    This test can be removed when I upgrade all my devices to ones with
    hardware AV1 decoding support.

    See https://alexwlchan.net/2025/av1-on-my-iphone/
    """
    av1_videos = {
        p
        for p in glob.glob("**/*.mp4", recursive=True)
        if is_av1_video(p)
    }

    assert av1_videos == set()

It uses the glob module to find .mp4 video files anywhere in the current folder, and then filters for files which use the AV1 codec. The recursive=True argument is important, because it makes the ** pattern match any number of nested directories; without it, ** behaves like a single * and only matches one directory level.

I’m only looking for .mp4 files because that’s the only format I use for videos, but you might want to search for .mkv or .webm too. If I were doing that, I might drop glob and use my snippet for walking a file tree instead.
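As an illustration of searching for several formats at once, here's a small sketch built on os.walk from the standard library. It isn't the snippet I mentioned; find_video_paths and VIDEO_EXTENSIONS are names invented for this example, and the list of extensions is an assumption you'd adjust for your own library.

```python
import os
from typing import Iterator

# Assumption: these are the video formats you care about
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".webm"}


def find_video_paths(root: str = ".") -> Iterator[str]:
    """
    Generates the path to every file under ``root`` whose extension
    is in VIDEO_EXTENSIONS, ignoring upper/lowercase differences.
    """
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()

            if extension in VIDEO_EXTENSIONS:
                yield os.path.join(dirpath, name)
```

In the test, find_video_paths(".") could take the place of the glob.glob call. The codec check would need a second look too, because I've only confirmed the av01 codec ID on .mp4 files.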

The test builds a set of all the AV1 videos, then checks that it’s empty. This means that if the test fails, I can see all the affected videos at once. If the test failed on the first AV1 video, I’d only know about one video at a time, which would slow me down.

Putting it all together

You can use ffprobe or MediaInfo – I prefer MediaInfo because it’s faster and I already have it installed, but both approaches are fine.

Here’s my final test, which uses MediaInfo to check if a video uses AV1, and scans a folder using glob. I’ve saved it as test_no_av1_videos.py:

import glob

from pymediainfo import MediaInfo


def is_av1_video(path: str) -> bool:
    """
    Returns True if a video is encoded with AV1, False otherwise.
    """
    media_info = MediaInfo.parse(path)

    return media_info.video_tracks[0].codec_id == "av01"


def test_no_videos_are_av1():
    """
    No videos are encoded in AV1 (which doesn't play on my iPhone).

    This test can be removed when I upgrade all my devices to ones with
    hardware AV1 decoding support.

    See https://alexwlchan.net/2025/av1-on-my-iphone/
    """
    av1_videos = {
        p
        for p in glob.glob("**/*.mp4", recursive=True)
        if is_av1_video(p)
    }

    assert av1_videos == set()

In one folder with 350 videos, this takes about 8 seconds to run. I could make that faster by reading the video files in parallel, or caching the results, but it’s fast enough for now.
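For reference, the parallel version might look something like this sketch, which uses a thread pool from concurrent.futures. I haven't measured whether it helps, and it assumes the check is safe to call from several threads at once. find_matching_paths is a hypothetical helper, and the predicate argument would be is_av1_video.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Set


def find_matching_paths(
    paths: Iterable[str],
    predicate: Callable[[str], bool],
    max_workers: int = 4,
) -> Set[str]:
    """
    Runs ``predicate`` on every path using a pool of threads, and
    returns the set of paths for which it returned True.
    """
    paths = list(paths)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the same order as the inputs,
        # so they can be zipped back together with the paths
        results = executor.map(predicate, paths)

        return {p for p, matched in zip(paths, results) if matched}
```

The set comprehension in the test would then become find_matching_paths(glob.glob("**/*.mp4", recursive=True), is_av1_video), and the assertion stays the same.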

When I buy a new device with hardware AV1 decoding, I’ll delete this test. Until then, it’s a quick and easy way to find and re-encode any videos that won’t play on my iPhone.