How to Test Automatic Speech Recognition (ASR) Providers For Your Business

Selecting the best ASR option for your company is an important decision. While the bulk of this article explains how to test ASR accuracy effectively, the first step in any important buying decision is identifying your priorities:

  • What do I want?
  • What do I need?
  • What doesn’t matter?

Typical considerations include weighing each provider's strengths and weaknesses in:

  • Cost
  • Accuracy
  • Speed
  • Scalability
  • Ability to support custom models and vocabulary
  • Multi-channel support
  • Speaker separation
  • Deep search
  • And more

Getting a sense of what features your company might need before starting talks with providers will help you avoid the common trap of relying purely on accuracy rate. Otherwise, you’ll likely find yourself having this conversation:

Buyer: “We’ve been looking at a couple of ASR providers... What’s your accuracy rate?”
ASR Provider: “Fantastic. On an academic data set that is publicly known, we claim a 95% accuracy rate.”
Buyer: “That sounds great! But how does that relate to our audio data?”
ASR Provider: “Trust me, we’ll do great on that too!”
Buyer: “Hmm…”

Hiding behind the numbers

For a long time, ASR companies have avoided doing real comparisons on company-specific audio data by focusing marketing dollars and sales narratives on impressive outcomes from public datasets.

By distracting companies from the fact that gamed success statistics don’t translate to real-world applications, ASR providers have been able to talk companies into buying a car without test driving it first.

So, what is the best way to actually test drive and walk away with a great deal?

Getting the truth

With the goal of getting the truth while investing as little effort as possible, here are guidelines for testing speech recognition providers in an apples-to-apples accuracy comparison:

  1. Select 50 randomly sampled audio files that are representative of the audio your company encounters.

    Do:

    • Use meeting recordings if your goal is to transcribe meetings
    • Use voicemails if the goal is to transcribe voicemails
    • Use audio with accents if your audio will have speakers with accents

    Don't:

    • Record yourself talking into your computer
    • Use a random YouTube video
    • Test out your favorite podcast or broadcast audio
    • Use a song
  2. Pay humans to transcribe one minute from each of these files. This effort should cost $100 or less and will serve as the truth of all truths for all the ASR providers you'll be comparing.
    (You can easily find transcriptionists using Rev.com or Upwork)

  3. Send the same 50 one-minute clips to each of the speech vendors that you are considering to test the output of their APIs.
    (Take note of what each provider deems an acceptable audio format and how it fits into your list of considerations from earlier.)

  4. Receive the text outputs and normalize them to account for the formatting “choices” that each ASR company makes in its out-of-the-box transcripts.

    • How are phone numbers transcribed?
      • 905-678-1234?
      • nine zero five six seven eight one two three four?
      • 9 0 5 6 7 8 1 2 3 4?
    • Are outputs punctuated and capitalized?
  5. Do a Word Error Rate (WER) comparison on the files. Don’t just look at the number; look at where the output was wrong and why. This includes which words were incorrectly added, omitted, or simply misinterpreted.

  6. Make a visual representation of what was wrong.

    • Who is getting the important words right vs. wrong?
    • Whose outputs are the most legible?
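
Steps 4 and 5 can be sketched in a few lines of Python. What follows is a minimal illustration, not any provider's scoring tool: it assumes English transcripts, the function names (normalize, align, wer) and the digit-word table are made up for this example, and the normalization rules are only one reasonable set of choices. WER is computed in the usual way, as substitutions plus deletions plus insertions, divided by the number of words in the human reference transcript.

```python
from __future__ import annotations

import re

# Assumed mapping so spelled-out digits compare equal to numerals.
# It also turns "one" in "no one" into "1", which is harmless as long as
# the reference and every provider's output go through the same function.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation, and split numbers into single digits."""
    text = text.lower().replace("’", "'")
    # Space out every digit so "905-678-1234" becomes ten separate tokens.
    text = re.sub(r"(\d)", r" \1 ", text)
    # Keep only ASCII letters, digits, apostrophes, and whitespace (English-only assumption).
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    return [DIGIT_WORDS.get(tok, tok) for tok in text.split()]


def align(ref: list[str], hyp: list[str]) -> list[tuple[str, str, str]]:
    """Word-level Levenshtein alignment as (op, ref_word, hyp_word) tuples."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion (word missing from output)
                dist[i][j - 1] + 1,         # insertion (extra word in output)
            )
    # Walk back from the bottom-right corner to recover the operations.
    ops = []
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], ""))
            i -= 1
        else:
            ops.append(("ins", "", hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def wer(reference: str, hypothesis: str) -> tuple[float, list[tuple[str, str, str]]]:
    """Return (word error rate, list of errors) for one clip."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    errors = [op for op in align(ref, hyp) if op[0] != "ok"]
    return len(errors) / max(len(ref), 1), errors


human = "Please call me back at 905-678-1234, thanks."
machine = "please call be back at nine zero five six seven eight one two three four"
rate, errors = wer(human, machine)
print(rate)    # 0.125 (2 errors over 16 reference words)
print(errors)  # [('sub', 'me', 'be'), ('del', 'thanks', '')]
```

Because the phone number is normalized the same way on both sides, the provider is not penalized for spelling out the digits; the only errors counted are the ones a reader would actually care about. The returned error list is also a natural input for step 6: counting which words each provider most often substitutes or drops across all 50 clips gives you something to chart. When you report a single number per provider, consider dividing the total errors by the total reference words across all clips rather than averaging per-clip rates, so very short clips don't dominate the result.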

Make your move

At this point, you will know where each ASR provider stands from an accuracy perspective on audio representative of your use case. Next, consider the pricing structure and any capabilities you might need beyond baseline accuracy.

With a good handle on where each competitor stands in terms of accuracy, you can confidently go into pricing conversations and make better decisions for your business.

If you're ready to test out Deepgram's ASR solution, contact chris@deepgram.com.