The first time I secretly stayed up past my bedtime, I got caught in under twenty minutes. I was laughing just a little too hard at larger-than-life host Jay Leno on The Tonight Show. Mind you, this wasn’t some expert sleuthing by my parents: they could have heard me from a mile away.

Children today, however, no longer have to stay up late to watch late-night television. Hosts such as Jimmy Fallon, Conan O'Brien, and Jimmy Kimmel all have YouTube channels that post the games, monologues, and interviews that you may have missed the previous night.

The fact that these clips live on YouTube bodes well for us, too: it means we can download them and surface some data to play with. 📊

Talk Show Hosts' Share of Talk

Here’s the idea: One common critique of late-night hosts is that they may talk more than their guests during interviews. Well, with a bit of deep learning magic and some everyday data science, we can check how much merit such criticism holds.

The answer? Not much. It turns out that, at their best, talk show hosts speak about as much as their guests. Often, their guests speak more. Here are the stats per host, starting with Stephen Colbert.

Here's Conan O'Brien.

Next up, we've got the Jimmies, starting with Mr. Fallon.

And then there's Jimmy Kimmel.

And finally we have Seth Meyers.

To tie the world’s smallest bow on the data displayed in the charts above, here are some summary stats on the share of talking each late-night television host took up in their interviews. 🎙️

Host | Average Share of Words Spoken by Host
Stephen Colbert | 54.4%
Conan O'Brien | 46%
Jimmy Fallon | 51.3%
Jimmy Kimmel | 45.7%
Seth Meyers | 38.7%

We also found that, across the 40 interviews we indexed for this project, talk show hosts spoke more than their guests only 40% of the time. This is consistent with the golden rule of interviewing: even when the conversation is casual, try to let your guest get most of the airtime.

Wanna produce these results yourself? Let’s get to coding!

How We Calculated Late-Night Talk Time (With Code!)

Step 1: Download the YouTube Videos

In true software engineer fashion, I will be reusing some old code. Specifically, I’ll be reusing the code from this blog. Long story short, this code snippet lets us take a list of YouTube links and download their audio locally using the youtube_dl package.

import youtube_dl

# The videos we want to download, as a list of YouTube URLs.
vids = ['URL to desired video here', ...]


ydl_opts = {
    'format': 'bestaudio/best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
    # change this to change where you download it to
    'outtmpl': './kimmel/audio/%(title)s.mp3',
}

with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(vids)

In order to evaluate each talk show host at their best, we are going to download their most-viewed, one-on-one interviews from each of their channels. Since we’re focusing on late-night television, the hosts we’re going to be analyzing are Stephen Colbert, Jimmy Fallon, Jimmy Kimmel, Seth Meyers, and Conan O’Brien.

Step 2: Use Deepgram to Transcribe Audio

Again, we can reuse some prior code! Phew. The code snippet below takes the audio files we just downloaded and turns them into a cute little JSON dump that you can then parse.

Note that the diarize=True parameter is what separates the speakers in the video. Specifically, by setting the diarize parameter to true, we tell the AI to distinguish and enumerate each of the speakers in the video. “Speaker 0” is the first unique person to talk in the video. “Speaker 1” is the second unique person to talk in the video. And so on.

If you’d like a more in-depth breakdown of the syntax, check out our Python SDK! Or, if Python isn’t your weapon of choice, check out our other packages here.

from deepgram import Deepgram
import asyncio, json, os


dg_key = 'Your key goes here'
dg = Deepgram(dg_key)


options = {
    "diarize": True,
    "punctuate": True,
    "paragraphs": True,
    "numerals": True,
    "model": 'general',
    "tier": 'enhanced'
}


async def main():
   podcasts = os.listdir("./kimmel/audio")
   for podcast in podcasts:
       print("Currently processing:", podcast)
       with open(f"kimmel/audio/{podcast}", "rb") as audio:
           source = {"buffer": audio, "mimetype":'audio/mp3'}
           res = await dg.transcription.prerecorded(source, options)
           with open(f"kimmel/transcripts/{podcast[:-4]}.json", "w") as transcript:
               json.dump(res, transcript)
   return


asyncio.run(main())
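Curious what the diarized output actually looks like before you label anyone? Here’s a minimal, optional sketch, assuming your transcripts landed in kimmel/transcripts as in the code above (the filename below is just a placeholder). It prints the first few words alongside the speaker number Deepgram assigned to each:

import json

# Optional sanity check: open one of the JSON transcripts produced above and
# print the first 20 words with the speaker number diarization assigned to each.
# "example_interview.json" is a placeholder; use any file you actually generated.
with open("kimmel/transcripts/example_interview.json", "r") as f:
    transcript = json.load(f)

words = transcript["results"]["channels"][0]["alternatives"][0]["words"]
for word in words[:20]:
    print(f'Speaker {word["speaker"]}: {word["word"]}')

You should see something like Speaker 0 and Speaker 1 alternating, which is exactly what Step 3 below turns into readable names.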

Step 3: Label Each of the Speakers

While we could work with labels like “Speaker 0” and “Speaker 1,” the data becomes much easier to use once it’s human readable. So we have the following code to help us give “Speaker 0” and “Speaker 1” actual human names.

import json
import os


# create transcripts
def create_transcripts():
   print('running create_transcripts')
   for filename in os.listdir("kimmel/transcripts"):
       with open(f"kimmel/transcripts/{filename}", "r") as file:
           transcript = json.load(file)
       paragraphs = transcript["results"]["channels"][0]["alternatives"][0]["paragraphs"]
       with open(f"kimmel/pretty_scripts/{filename[:-5]}.txt", "w") as f:
            f.write(paragraphs['transcript'])
'''
This function gives you the ability to label your speakers by name.
When diarizing, the Deepgram API will label the speakers as
Speaker 0, Speaker 1, Speaker 2, etc.


When this function is run, you'll see one line from the transcript
for each individual speaker that the API identified during diarization.


You will then label the speaker of that line with the name that you desire.
'''
def assign_speakers():
    for filename in os.listdir("kimmel/pretty_scripts"):
        print(f"Current File: {filename}")
        with open(f"kimmel/pretty_scripts/{filename}", "r") as f:
            lines = f.readlines()
        spoken = []
        names = []
        for line in lines:
            if line.startswith("Speaker "):
                # Only ask once per speaker label (e.g. "Speaker 0").
                if line[0:9] in spoken:
                    continue
                print(line)
                name = input("Who is the Speaker?")
                if len(name) <= 1:
                    continue
                spoken.append(line[:9])
                names.append(name)
        print(spoken)
        print(names)
        # readlines() keeps each line's newline, so join with an empty string
        # to avoid doubling up the line breaks.
        filedata = "".join(lines)
        for speaker, name in zip(spoken, names):
            filedata = filedata.replace(speaker, name)
        with open(f"kimmel/pretty_scripts/{filename}", "w") as f:
            f.write(filedata)
create_transcripts()
assign_speakers()
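For example, if you type host when prompted for Speaker 0 and guest when prompted for Speaker 1, a line like “Speaker 0: Welcome to the show!” becomes “host: Welcome to the show!” in the rewritten transcript. Those lowercase host: and guest: labels are exactly what the word-counting code in Step 4 looks for.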

Oh, yeah. You may be wondering, “What about the audience? What about that classic late-night show laugh track? How do we label or parse those?”

The way we’ve configured our parameters in Step 2, especially that "tier": 'enhanced' line, leads the Deepgram model to ignore background noise such as the audience. As a result, we focus only on the words spoken by the host and guest. No pesky distractions 😉

Step 4: Run Those Numbers!

The formula looks like this: count the number of words spoken by the host and the guest combined, then calculate the percentage of those words that the host spoke: % host words = # host words / (# host words + # guest words).
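For example, if a host spoke 1,200 words and their guest spoke 1,800, the host’s share would be 1,200 / (1,200 + 1,800) = 40%.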

If we take that formula and place it into a block of code that parses a transcription line-by-line, it’d look something like this:

import os

HOST = 'host:'
GUEST = 'guest:'
# Extra speaker labels (not used in the calculation below).
others = ['1:', '2:', '3:', '4:', '5:', '6:', '7:', '8:', '9:', '0:']
host_percentages = {}


# Calculate the % of words spoken by the host in each transcript.
def calculate_percentages():
    # Change these paths as you need.
    for filename in os.listdir("kimmel/pretty_scripts"):
        with open(f"kimmel/pretty_scripts/{filename}", "r") as file:
            host_words, guest_words = 0, 0
            current_speaker = ''
            for line in file:
                words = line.split()
                if not words:
                    continue
                if words[0] == HOST:
                    current_speaker = HOST
                    host_words += len(words) - 1  # exclude the "host:" label
                elif words[0] == GUEST:
                    current_speaker = GUEST
                    guest_words += len(words) - 1  # exclude the "guest:" label
                elif current_speaker == HOST:
                    host_words += len(words)
                else:
                    guest_words += len(words)
            host_percentage = host_words / (host_words + guest_words)
            host_percentages[filename] = host_percentage

    # Change this filename and path as you need.
    with open("kimmel/word_percentages.txt", 'w') as results:
        for title, percentage in host_percentages.items():
            results.write(title[:-4] + '@' + str(percentage))
            results.write('\n')

    average_percentage = sum(host_percentages.values()) / len(host_percentages)
    print('average percentage', average_percentage)


calculate_percentages()
print('percentages per video:', host_percentages)

And if we do this for every single one of the Jimmy Kimmel videos we downloaded, we can see, on average, how much of the interview consists of Jimmy Kimmel’s words.
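And if you want to double-check that earlier stat about how often hosts out-talk their guests, here’s a small sketch that reads the word_percentages.txt file written above (one title@share line per video) and counts how often the host’s share tops 50%. The path is the Kimmel one from the code; point it at whichever host’s results you’ve generated.

# Count how often the host spoke more than the guest (share > 0.5), using the
# title@share lines that calculate_percentages() wrote out above.
with open("kimmel/word_percentages.txt", "r") as results:
    shares = [float(line.rsplit('@', 1)[-1]) for line in results if line.strip()]

host_led = sum(1 for share in shares if share > 0.5)
print(f"Host out-talked the guest in {host_led} of {len(shares)} interviews "
      f"({host_led / len(shares):.0%}).")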

(P.S. Jimmy Kimmel, if you’re reading this, big fan 😸)

Parting Words

If you made it this far, you get a couple bonus facts from the data 😉

  1. Bill Skarsgård spoke about 68% of the words in both of his interviews, one with Conan O’Brien and one with Jimmy Kimmel.

  2. Jennifer Lawrence spoke 32.5% of the words in her interview with Jimmy Fallon, but 75.1% of the words in her interview with Seth Meyers. Perhaps this result arose because she has a crush on Seth. (P.S. The “crush” video is the one we used during analysis!)

  3. Michelle Obama spoke around 58% of the words in her interview with Stephen Colbert, while her husband spoke only 36.8% of the words with Jimmy Kimmel.

  4. The guest who talked the most was Mark Hamill on Seth Meyers—speaking an astounding 81.04% of the words in the interview.

  5. The guest who talked the least was Dwayne Johnson on Jimmy Fallon—owning only 17% of the words in that interview. He may have been busy eating his first piece of candy since 1989… 

Hope this analysis was as fun for you as it was for me! And if you’re a talk show host reading this: If you need a techie guest sometime soon, my schedule’s open 😇

Author’s Note: This article is not meant to criticize these late-night hosts. In fact, I love and watch all of these hosts frequently. Every single one of them makes me laugh. This blog is simply a quick speech-analysis project.

Author’s Note 2: Certain interviews led to some outlier percentages. For example, Jimmy Fallon has an interview with Morgan Freeman on helium. A good amount of the time, Freeman is silent—playfully angry at the helium gag. Whether this bit was planned or not is anyone’s guess. But the fact remains that Freeman was deliberately wordless most of the time. Other interviews, such as Stephen Colbert’s selfie with Sarah McDaniel or Jimmy Kimmel talking Gordon Ramsay through eating Girl Scout cookies, are outliers for similar reasons. These outlier interviews were not visualized.

If you have any feedback about this post, or anything else around Deepgram, we'd love to hear from you. Please let us know in our GitHub discussions.
