Gladia pricing: what do real-time vs async transcription cost per hour, and what’s included in the free tier?


Gladia’s pricing model is built to answer a simple question: how much does it really cost to power production-grade real-time and async transcription, and what can you test for free before committing? If you’re comparing engines or estimating per-hour costs for a voice product, this FAQ breaks down what’s included in Gladia’s free tier, how billing works, and what to expect when you scale.

Quick Answer: Gladia offers a free tier with up to 10 hours of transcription per month so you can test both real-time and async APIs at no cost. Paid usage is billed via transparent pay‑as‑you‑go or subscriptions; exact per‑hour prices for real-time vs async are listed on Gladia’s Pricing page and can vary by volume and plan.

Frequently Asked Questions

How much do real-time vs async transcription cost per hour?

Short Answer: Gladia charges per hour of audio processed, with published rates on its Pricing page. Real-time and async (batch) transcription share a single API; your effective per‑hour price depends on your plan, volume, and any enterprise agreement.

Expanded Explanation:
Gladia prices transcription on actual audio duration, not user seats or vague “credits.” You pay for the hours you process over REST or WebSocket, whether that’s real-time streaming from a call or async batch jobs from stored media. This keeps cost estimation straightforward for teams running voice assistants, note-takers, or contact center analytics.

Because pricing can vary by volume, commitment, and enterprise terms, this FAQ does not quote a fixed rate. Gladia’s Pricing page lists the current per‑hour rates for core transcription, plus any add‑ons you choose (diarization, NER, summarization, etc.). If you process high volume (hundreds or thousands of hours per day), you can negotiate enterprise pricing that reflects your concurrency and workload profile.

Key Takeaways:

  • Pricing is based on audio hours processed via the API (real‑time or async).
  • Exact per‑hour rates live on the public Pricing page and can be tailored at enterprise scale.
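To make the duration-based model concrete, here is a minimal sketch of the arithmetic: sum your audio durations, convert to hours, and multiply by a per-hour rate. The rate below is a hypothetical placeholder, not a Gladia price, and the function names are illustrative only; substitute the current figure from the Pricing page.

```python
from decimal import Decimal

# Hypothetical placeholder, NOT a real Gladia price. Replace it with the
# current per-hour rate from the Pricing page before forecasting.
PLACEHOLDER_RATE_PER_HOUR = Decimal("0.50")


def audio_hours(durations_seconds):
    """Total audio duration in hours, given per-call or per-file durations in seconds."""
    return Decimal(sum(durations_seconds)) / Decimal(3600)


def estimate_cost(durations_seconds, rate_per_hour=PLACEHOLDER_RATE_PER_HOUR):
    """Duration-based estimate: hours processed times the per-hour rate, rounded to cents."""
    return (audio_hours(durations_seconds) * rate_per_hour).quantize(Decimal("0.01"))


calls = [1800, 2700, 900]    # three calls of 30, 45, and 15 minutes
print(audio_hours(calls))    # 1.5
print(estimate_cost(calls))  # 0.75 at the placeholder rate
```

Using Decimal rather than float keeps the cents exact when you sum many small jobs.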

How does Gladia’s pricing work in practice?

Short Answer: You choose between pay‑as‑you‑go and subscription billing (monthly or annual), monitor usage in your dashboard, and pay for the audio you transcribe beyond the free tier.

Expanded Explanation:
Gladia is designed to be easy to drop into your stack and just run. That extends to billing. You can start on pay‑as‑you‑go—ideal for pilots and early launches—then move to a subscription plan once your volume stabilizes. Both models use the same core concept: you’re billed on transcription usage, not on arbitrary “premium” gates.

You see how many hours you’ve used (and what features you’re calling) in your account, and you can switch plans or cancel directly from the dashboard. For teams that need predictable spend—e.g., B2B SaaS note-takers or CCaaS platforms reselling transcription—annual subscriptions are often the best fit, pairing committed volume with discounted rates.

Steps:

  1. Start on the free tier to test APIs with up to 10 hours/month at no cost.
  2. Pick a billing model: pay‑as‑you‑go for flexible usage, or subscription (monthly/annual) for predictable volume.
  3. Monitor and adjust: track hours and features used, and upgrade, downgrade, or cancel directly when your workload changes.
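The steps above can be sketched as a small monthly estimate: subtract the free allowance, then bill the remainder. This assumes the 10 free hours are deducted before paid usage starts, which is how this FAQ describes it; confirm how the allowance interacts with your specific plan. The rate passed in is a hypothetical placeholder, not a published price.

```python
FREE_TIER_HOURS = 10.0  # monthly free allowance stated in this FAQ


def billable_hours(hours_used, free_hours=FREE_TIER_HOURS):
    """Hours that fall beyond the free allowance (never negative)."""
    return max(0.0, hours_used - free_hours)


def monthly_cost(hours_used, rate_per_hour):
    """Estimated monthly spend: billable hours times a per-hour rate (assumed flat)."""
    return round(billable_hours(hours_used) * rate_per_hour, 2)


# rate_per_hour=0.5 is a placeholder for illustration only.
print(billable_hours(8))        # 0.0  -> still inside the free tier
print(billable_hours(25))       # 15.0
print(monthly_cost(25, 0.5))    # 7.5
```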

Is there a pricing difference between real-time and async transcription?

Short Answer: Real-time and async share a single API and the same usage‑based billing logic. Any difference in the listed per‑hour rate appears on the Pricing page; beyond that, your effective cost depends on volume, add‑ons, and plan structure.

Expanded Explanation:
From an integration standpoint, Gladia treats real-time streaming and batch transcription as two faces of the same engine. You use REST for async, WebSockets for streaming, but the billing logic is the same: hours in, features toggled on, hours billed. There isn’t a separate “real-time product” with surprise premiums for low‑latency or multi‑language support.

What can change your effective per‑hour cost is:

  • Whether you’re on pure pay‑as‑you‑go or a committed subscription.
  • Which add‑ons you enable (e.g., diarization, NER, summarization).
  • How much volume you commit to in an enterprise plan.

This is intentional: you don’t have to architect around two completely different billing models for live calls vs. recorded sessions. It’s just one backbone.
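One way to reason about the first and third factors is to compare the effective per-hour rate of pay-as-you-go against a committed plan. The sketch below assumes a simple plan shape that is not taken from Gladia's terms: a committed plan charges a discounted rate on at least the committed hours, even if you use fewer. All rates and volumes are hypothetical, and add-on charges are not modeled.

```python
def payg_cost(hours, payg_rate):
    """Pay-as-you-go: pay only for the hours used."""
    return hours * payg_rate


def committed_cost(hours, committed_hours, discounted_rate):
    """Assumed plan shape: the commitment is charged even if usage falls short."""
    return max(hours, committed_hours) * discounted_rate


def effective_rate(total_cost, hours):
    """What each hour actually cost once commitments are accounted for."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_cost / hours


def break_even_hours(committed_hours, payg_rate, discounted_rate):
    """Usage level above which the committed plan is cheaper than pay-as-you-go."""
    return committed_hours * discounted_rate / payg_rate


# Hypothetical numbers for illustration only.
PAYG_RATE, DISCOUNTED_RATE, COMMITTED_HOURS = 1.00, 0.75, 100

for hours in (60, 90):
    payg = effective_rate(payg_cost(hours, PAYG_RATE), hours)
    committed = effective_rate(
        committed_cost(hours, COMMITTED_HOURS, DISCOUNTED_RATE), hours
    )
    print(f"{hours} h: pay-as-you-go {payg:.2f}/h, committed {committed:.2f}/h")

print(break_even_hours(COMMITTED_HOURS, PAYG_RATE, DISCOUNTED_RATE))  # 75.0
```

Under these assumed numbers, a commitment only pays off once monthly usage passes the break-even point, which is why pay-as-you-go suits pilots and a commitment suits stable volume.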

Comparison Snapshot:

  • Option A: Real-time (WebSocket) – Same engine, optimized for <300 ms latency with partial transcripts in <100 ms; billed on streamed audio duration.
  • Option B: Async (REST batch) – Same engine, optimized for offline/queueable workloads; billed on file duration processed.
  • Best for: Teams that want one pricing model spanning live calls, meetings, and offline media—no separate “real-time surcharge” mental model to maintain.
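If you want to estimate streamed audio duration on your side before the invoice arrives, the duration of raw PCM audio follows directly from the byte count. This is a sketch of that standard arithmetic, assuming uncompressed PCM; the helper names are illustrative, and how multi-channel audio is counted for billing is something to confirm with Gladia rather than infer from this code.

```python
def pcm_seconds(num_bytes, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Duration of raw PCM audio: bytes divided by bytes per second."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    if bytes_per_second <= 0:
        raise ValueError("sample rate, sample width, and channels must be positive")
    return num_bytes / bytes_per_second


def pcm_hours(num_bytes, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Same duration expressed in hours, the unit used for billing estimates."""
    return pcm_seconds(num_bytes, sample_rate_hz, bytes_per_sample, channels) / 3600


# One hour of 8 kHz, 16-bit, mono telephony audio is 57,600,000 bytes.
streamed_bytes = 8000 * 2 * 1 * 3600
print(pcm_seconds(streamed_bytes, 8000))  # 3600.0
print(pcm_hours(streamed_bytes, 8000))    # 1.0
```

For async jobs the equivalent figure is simply the duration of each file you submit.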

What exactly is included in Gladia’s free tier?

Short Answer: The free tier gives you up to 10 hours of transcription per month, covering both real-time and async APIs, so you can test core STT and key add-ons without paying.

Expanded Explanation:
The free tier is there for serious evaluation, not just a toy demo. You can sign up, get API keys, and process up to 10 hours of audio each month—enough to wire Gladia into a test environment, a staging voice bot, or a prototype meeting assistant. You can validate accuracy on your own audio: noisy calls, overlapping speakers, mixed languages, 8 kHz telephony, the works.

Within that limit, you can exercise:

  • Real-time streaming over WebSocket.
  • Async batch over REST.
  • Multilingual transcription and automatic language detection/switching.
  • The add‑on layer (e.g., diarization, word‑level timestamps, NER) as documented in the API.

When you exceed 10 hours in a month, you can either move into pay‑as‑you‑go billing or talk to the team about a subscription or enterprise plan.

What You Need:

  • A Gladia account with free tier access (sign‑up on gladia.io).
  • An integration path (REST or WebSocket) wired into your app, plus a small set of audio samples representative of your real workloads.

How should I think about Gladia pricing strategically for my voice product?

Short Answer: Treat Gladia as the STT backbone: budget for transcription hours that reliably power your downstream workflows (notes, summaries, CRM sync) rather than optimizing for the lowest per‑hour price or the best word error rate (WER) on a clean demo set.

Expanded Explanation:
Most voice platforms fail where transcription fails: wrong names, wrong numbers, mis‑attributed speakers. That’s not just an accuracy problem; it’s a trust and revenue problem when your summaries, CRM enrichment, and automations are built on top of those transcripts. Choosing an STT provider is less about “what’s the absolute cheapest per hour?” and more about “what’s the predictable cost of not dropping critical entities in real conditions?”

Gladia leans into this with:

  • An open benchmark for speech‑to‑text, evaluated on 7 datasets and 500+ hours of audio.
  • Explicit focus on noisy, real‑world conditions (accents, crosstalk, interruptions, 8 kHz telephony).
  • A single API for transcription + diarization + timestamps + NER + summarization, so you don’t stack multiple vendors and hidden integration costs.

The result: you can estimate cost per hour once and use it across your entire voice surface area—contact center, note‑taking, media indexing—without redesigning your architecture or re‑negotiating contracts for each new use case.

Why It Matters:

  • Predictable economics: One API, one pricing model, spanning real-time and async, helps you forecast margin on voice products and avoid surprise variance.
  • Workflow integrity: Paying slightly more per hour for stable, benchmarked performance can be cheaper than rebuilding broken automations after a bad transcript corrupts your customer data.

Quick Recap

Gladia prices transcription on a single metric: hours of audio processed via one API that covers both real-time streaming and async batch. You can try it free with up to 10 hours of usage each month, enough to run realistic evaluations on your own noisy calls and multilingual meetings. Beyond that, you choose between pay‑as‑you‑go or subscription (monthly/annual), with enterprise options for high volume and multi‑product deployments. Exact per‑hour rates for real-time and async workloads are listed on the public Pricing page and vary with plan and add‑ons, but the model stays the same: one backbone, one billing logic.
