Fully Automated Luxury Commentary

What is it?

Automatically generate commentary for football matches, using clips from Gary Bloom’s Sega Worldwide Soccer 97 and Statsbomb’s event data.

Why is it?

The idea that you can generate commentary purely from event data is relatively mundane (it has been implemented in basically every soccer video game for the past 20+ years), but it has always seemed fun to me. The increasing desire to package football up as a clean entertainment product, along with the addition of synthetic crowd noise to closed-doors games, brought it back to the forefront of my mind over this past summer.

Alongside this, the various releases of Statsbomb’s free event data finally made it possible to make a widely accessible auto-commentary tool, rather than something only for the small number of people with access to official data streams.

How does it work?

At a high level, the implementation is as follows:

  • For a given match, we can get the event data from Statsbomb as a list of events
  • We also have a set of audio clips from Sega Worldwide Soccer 97
  • Each audio clip can be linked to a set of conditions that must be true for the clip to be relevant. For example:
    • Clip X is a goal
    • Clip Y is a pass and the pass is successful and the pass ends inside the penalty area
    • (etc…)
  • For each event in the list:
    • We can check the audio clips to find any that are relevant. In other words, clips where the event in question passes all the linked conditions
    • And then pick one at random
  • Once we have the clips, we can stitch them together at the appropriate timestamp to create a single synthetic audio track
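
As a rough illustration of the steps above, here is a minimal sketch in Python. It is not the code from the repository: the `Clip` class, the condition functions, and the event fields (`type`, `successful`, `ends_in_box`, `seconds`) are all simplified, hypothetical stand-ins rather than the real Statsbomb schema. It only builds a list of (timestamp, clip) pairs; the actual audio stitching is left out.

```python
import random
from dataclasses import dataclass
from typing import Callable

# An event is a plain dict here. The field names are simplified
# placeholders, NOT the real Statsbomb event schema.
Event = dict
Condition = Callable[[Event], bool]


@dataclass
class Clip:
    name: str                    # hypothetical clip identifier
    conditions: list[Condition]  # every condition must hold for a match

    def matches(self, event: Event) -> bool:
        # A clip is relevant only if the event passes all linked conditions.
        return all(condition(event) for condition in self.conditions)


# Hypothetical conditions, mirroring the examples in the list above.
def is_goal(event: Event) -> bool:
    return event.get("type") == "goal"


def is_pass(event: Event) -> bool:
    return event.get("type") == "pass"


def is_successful(event: Event) -> bool:
    return bool(event.get("successful"))


def ends_in_box(event: Event) -> bool:
    return bool(event.get("ends_in_box"))


def build_timeline(events, clips, rng):
    """Return (timestamp, clip name) pairs, one per event that has a match."""
    timeline = []
    for event in events:
        relevant = [clip for clip in clips if clip.matches(event)]
        if relevant:
            # Pick one of the relevant clips at random.
            timeline.append((event["seconds"], rng.choice(relevant).name))
    return timeline


clips = [
    Clip("clip_x", [is_goal]),
    Clip("clip_y", [is_pass, is_successful, ends_in_box]),
]
events = [
    {"seconds": 12.0, "type": "pass", "successful": True, "ends_in_box": True},
    {"seconds": 15.5, "type": "goal"},
    {"seconds": 40.0, "type": "foul"},  # no clip matches, so it is skipped
]
timeline = build_timeline(events, clips, random.Random(0))
```

Keeping each condition as a small function means a new clip can be described by listing the conditions it needs, without touching the matching loop. The resulting timeline is what you would then hand to an audio library to place each clip at its timestamp.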

If you want a lower-level look, you can find the code on GitHub.

There are loads of potential improvements that could be made both to the code and to the tool, but I’ve been sitting on this for a few months and running out of motivation. I figured releasing something was better than having it languish on my computer at home.

Is it really fully automated?

Not entirely. I had to bring the clips, footage and crowd noise together in a video editor to make the video at the top. Mostly, this name sounds better than “pypundit” or “AI McCoist”, which were all I had before.

Can I try it out, too?

Sure! The code is all on GitHub, so you can clone it, run it, and submit issues or improvements.