Podcast Data for Fun and (not) Profit

Author

Lukas Burk

Updated

May 4, 6:23 (UTC)

This site is an automatically generated collection of plots and tables visualizing some metadata from my favorite podcasts/podcast networks.

Included:

  • The Incomparable
  • Accidental Tech Podcast (ATP)
  • Relay FM

Data includes episode runtimes, release dates, and hosts (for The Incomparable, guests as well).

The data sources are:

  • The Incomparable: Home-brewed stats.txt files Jason has set up for me to parse (thanks Jason <3) and some web-scraping of the individual shows’ archive pages.
  • Relay FM: The individual show RSS feeds and some mild web-scraping.
  • ATP: The website.
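
The scraping code itself is not shown on this page. As a rough sketch of how a show's RSS feed could be turned into one row per episode, the example below parses a small made-up feed. It assumes the xml2 package, and the feed contents, the helper names (parse_pub_date, duration_to_seconds) and the column names are all hypothetical rather than taken from the site's actual pipeline.

```r
library(xml2)

# Made-up feed used purely for illustration (not real episode data)
feed_xml <- '<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 2</title>
      <pubDate>Thu, 01 May 2025 12:00:00 GMT</pubDate>
      <itunes:duration>45:30</itunes:duration>
    </item>
    <item>
      <title>Episode 1</title>
      <pubDate>Thu, 24 Apr 2025 12:00:00 GMT</pubDate>
      <itunes:duration>01:02:03</itunes:duration>
    </item>
  </channel>
</rss>'

# Hypothetical helper: "Thu, 01 May 2025 12:00:00 GMT" -> Date.
# Month names are matched against base R's English month.abb, so the
# result does not depend on the session locale.
parse_pub_date <- function(x) {
  m <- regmatches(x, regexec("([0-9]{1,2}) ([A-Za-z]{3}) ([0-9]{4})", x))
  as.Date(vapply(m, function(p) {
    sprintf("%s-%02d-%02d", p[4], match(p[3], month.abb), as.integer(p[2]))
  }, character(1)))
}

# Hypothetical helper: "45:30" or "01:02:03" -> duration in seconds
duration_to_seconds <- function(x) {
  vapply(strsplit(x, ":", fixed = TRUE), function(p) {
    p <- as.numeric(p)
    sum(p * 60^(rev(seq_along(p)) - 1))
  }, numeric(1))
}

feed  <- read_xml(feed_xml)
ns    <- xml_ns(feed)              # needed for the itunes: prefix
items <- xml_find_all(feed, "//item")

episodes <- data.frame(
  show     = xml_text(xml_find_first(feed, "//channel/title")),
  title    = xml_text(xml_find_first(items, "./title")),
  date     = parse_pub_date(xml_text(xml_find_first(items, "./pubDate"))),
  duration = duration_to_seconds(
    xml_text(xml_find_first(items, "./itunes:duration", ns))
  )
)
```

Real feeds vary in how they report durations (plain seconds, M:S, H:M:S), so a production version would need more defensive parsing than this.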

Data collection and re-creation of the site and plots happen on Sunday morning (UTC).

Note

This project is in a perpetual cycle of maintenance and “ain’t nobody got time for that”-ness, which mostly shows in many of the plots not looking right.
The reason is that whenever I tweak the plot dimensions so that the spacing leaves all the text readable, it only takes a few months for multiple new podcasts to appear.
Since data collection and plot creation are automated and updated daily, the plots then have to fit more data, and because I haven’t figured out a good way to auto-adjust the plot width accordingly, they look squished again after a few months.
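
One possible direction, sketched here in base R and not something the site currently does: derive the figure size from the number of shows in the data. Because the plots on this page use coord_flip(), shows end up on the vertical axis, so the sketch scales the height. The helper name plot_height, its default values and the toy data are hypothetical and would need tuning against the real plots.

```r
# Hypothetical helper: figure height in inches, growing with the number of
# shows so that each show keeps roughly the same vertical space.
plot_height <- function(n_shows, per_show = 0.25, base = 2, max_height = 40) {
  min(base + per_show * n_shows, max_height)
}

# Toy stand-in for the `podcasts` data (made-up rows)
podcasts_toy <- data.frame(
  show    = c("Show A", "Show A", "Show B", "Show C"),
  network = c("Net 1", "Net 1", "Net 1", "Net 2")
)

n_shows <- length(unique(podcasts_toy$show))
height  <- plot_height(n_shows)  # 2 + 0.25 * 3 = 2.75 inches
```

The resulting number could then be passed on wherever the plot size is set, for example as the height argument of ggplot2::ggsave() or as a computed fig.height chunk option in knitr. The cap in max_height is only there to keep a sudden jump in the number of shows from producing an absurdly tall image.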

Current data

The last time the data was updated and the date of the latest episode within each dataset.

Code
tibble::tribble(
  ~source, ~file_time, ~latest_episode,
  "relay.fm", file.info("data/relay_episodes.rds")[["mtime"]], max(relay_episodes$date),
  "The Incomparable", file.info("data/incomparable_episodes.rds")[["mtime"]], max(incomparable_episodes$date),
  "ATP", file.info("data/atp.rds")[["mtime"]], max(atp$date)
) |>
  arrange(desc(latest_episode)) |>
  knitr::kable(
    col.names = c("Podcast/Network", "File Time", "Latest Episode")
  ) |>
  kableExtra::kable_styling()
Podcast/Network     File Time              Latest Episode
relay.fm            2025-05-02 06:54:22    2025-05-01
The Incomparable    2025-05-02 06:54:22    2025-05-01
ATP                 2025-05-02 06:54:22    2025-05-01

All Shows

Here’s a quick overview of the podcast data I’ve collected so far.

Code
podcasts |>
  group_by(network) |>
  summarize(
    episodes = n(),
    firstep = as.character(as.Date(min(date, na.rm = TRUE))),
    lastep = as.character(as.Date(max(date, na.rm = TRUE))),
    shows = length(unique(show)),
    epspershow = round(episodes / shows, 1),
    .groups = "drop"
  ) |>
  set_names(c(
    "Network", "Episodes", "First Episode",
    "Last Episode", "# of Shows", "Avg Episodes per Show"
  )) |>
  reactable(
    rownames = FALSE 
  )

Timeline

From first to last published episode.

Code
# column: screen-inset-shaded
podcasts |>
  group_by(show) |>
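  mutate(
    # First and latest episode date per show; na.rm guards against missing dates
    d_min = min(date, na.rm = TRUE),
    d_max = max(date, na.rm = TRUE)
  ) |>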
  ggplot(aes(x = reorder(show, d_min), color = network)) +
  geom_errorbar(aes(ymin = d_min, ymax = d_max), width = 1, linewidth = 1) +
  scale_x_discrete(position = "top") +
  scale_color_manual(values = network_colors) +
  coord_flip() +
  theme(legend.position = "top") +
  labs(
    title = "All Podcasts Timeline",
    subtitle = "The Incomparable, Relay FM, ATP",
    x = "", y = "First to Latest Episode Date",
    color = NULL,
    caption = caption
  )

Duration

Code
# column: screen-inset-shaded

podcasts |>
  filter(!is.na(duration)) |>
  ggplot(aes(x = reorder(show, duration, FUN = mean), y = duration, fill = network)) +
  geom_boxplot() +
  coord_flip() +
  scale_fill_manual(values = network_colors) +
  labs(
    title = "Podcasts by Duration", subtitle = "By Network",
    x = "", y = "Duration (H:M:S)", fill = NULL,
    caption = caption
  )

 