Skip to main content

How to tally the attributes of the top N rows in a SQLite table

Use a WITH clause to do a nested query for the top N rows in the table, then do a tally over that result.

There’s a photos table in the Flickr Commons Explorer database that has two columns: owner_id and count_views. I wanted to get the 10,000 photos with the most views, and then see who owned each of those photos.

Getting the 10,000 photos with the most views is a pretty simple query:

SELECT *
FROM photos
ORDER BY count_views DESC
LIMIT 10000;

I thought maybe I could just add a GROUP BY clause to this query, something like this:

-- This doesn't work!
SELECT owner_id, COUNT(owner_id)
FROM photos
GROUP BY owner_id
ORDER BY count_views DESC
LIMIT 10000;

but this query actually counts the number of photos from each member.

I did a bit of reading, and I need to run the first query inside a WITH clause, and then do the tally over the result:

WITH most_viewed_photos AS (
    SELECT owner_id
    FROM photos
    ORDER BY count_views DESC
    LIMIT 10000
)
SELECT owner_id, COUNT(*) as photos_in_top_10k
FROM most_viewed_photos
GROUP BY owner_id
ORDER BY photos_in_top_10k DESC;

This query gives me what I want!

Then I added the {real,user}name value from the members table, which makes it a bit easier to read the result:

WITH most_viewed_photos AS (
    SELECT owner_id
    FROM photos
    ORDER BY count_views DESC
    LIMIT 10000
)
SELECT
    COALESCE(m.realname, m.username) as member_name,
    COUNT(*) as photos_in_top_10k
FROM most_viewed_photos p
JOIN members m ON p.owner_id = m.id
GROUP BY member_name
ORDER BY photos_in_top_10k DESC;

After I wrote this, I was reminded I did some very similar SQLite tallying last year, where I did a nested query and tallied over the results.