
VESSL AI pricing: what are the current on-demand hourly rates for H100/A100/B200/GB200?
VESSL AI publishes transparent, SKU-level pricing so you can plan GPU budgets without sales calls or guesswork. If you’re comparing high-end accelerators specifically, here are the current on-demand hourly rates for H100, A100, B200, and GB200 on VESSL Cloud.
All prices below are taken directly from the latest VESSL AI public pricing table. For the most current numbers, always confirm on the official pricing page before committing a large run.
Current On-Demand Hourly Rates (H100, A100, B200, GB200)
Quick pricing snapshot
| GPU Model | VRAM | On-Demand Hourly Rate | Notes |
|---|---|---|---|
| A100 SXM | 80GB | $1.55/hr | General-purpose workhorse for LLMs and vision workloads |
| H100 SXM | 80GB | $2.39/hr | Best fit for large LLM post-training and dense training jobs |
| B200 | 192GB | $5.50/hr | High-VRAM next-gen GPU for frontier-scale models |
| GB200 | 192GB | $6.50/hr | Premium next-gen option for the heaviest workloads |
These are on-demand prices: pay-as-you-go, no commitment, with reliability features like automatic failover when you run in VESSL’s On-Demand tier.
How these GPUs map to real workloads
You’re usually not just asking “what’s the price?” You’re asking “which GPU gives me the fastest path from run to result at a reasonable cost?” Here’s how I’d think about it as an infra operator.
A100 SXM 80GB — $1.55/hr
Best when you need proven, cost-efficient training at scale.
Use it for:
- Mid-to-large LLM finetuning and instruction tuning
- Vision models and multimodal work that fit in 80GB
- Batch inference services where predictability beats bleeding edge
Why pick A100 here:
- Lower cost per hour means you can run more parallel experiments.
- Still widely used in top research labs; lots of stable software stacks and reference configs.
H100 SXM 80GB — $2.39/hr
Best when you care about throughput and time-to-result on large LLM workloads.
Use it for:
- LLM post-training (RLHF, DPO, RLAIF) on large context models
- Heavy transformer models with long sequence lengths
- Physical AI and AI-for-Science workloads that actually saturate tensor cores
Why pay up for H100:
- Higher effective FLOPs and better performance on transformer-heavy code.
- In practice, a job that finishes faster on H100 can cost less overall, even at a higher hourly rate, if it cuts wall time enough.
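The break-even point is easy to sanity-check with the rates quoted above. A quick sketch (the 1.8× speedup is a hypothetical number for illustration, not a benchmark):

```python
# Break-even speedup: H100 costs less in total when its speedup over
# A100 exceeds the ratio of their hourly rates.
A100_RATE = 1.55  # $/GPU-hour, from the pricing table above
H100_RATE = 2.39  # $/GPU-hour

break_even = H100_RATE / A100_RATE
print(f"H100 wins on total cost above a {break_even:.2f}x speedup")  # ~1.54x

# Hypothetical job: 100 A100-hours of work that runs 1.8x faster on H100
a100_cost = 100 * A100_RATE            # $155.00
h100_cost = (100 / 1.8) * H100_RATE    # ~$132.78
print(f"A100: ${a100_cost:.2f}  H100: ${h100_cost:.2f}")
```

In other words, any speedup above roughly 1.54× makes the H100 the cheaper option in total spend at these rates.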
B200 — $5.50/hr
Best when VRAM is your bottleneck, not just raw FLOPs.
Use it for:
- Very large models that struggle to fit or shard cleanly on 80GB cards
- High-resolution multimodal models that blow past 80GB easily
- Advanced AI-for-Science simulations that keep large states in memory
Why B200 at this price point:
- 192GB VRAM removes a lot of ugly tensor-parallel gymnastics.
- Fewer GPUs needed for the same model, which simplifies orchestration and reduces cross-node communication overhead.
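To see why 192GB changes the math, here is a rough back-of-envelope sketch. It counts only bf16 weights (2 bytes per parameter) and ignores optimizer state, activations, and KV cache, so treat it as a lower bound, and the 70B model is just an illustrative size:

```python
import math

def min_gpus_for_weights(params_billions: float, vram_gb: int,
                         bytes_per_param: int = 2) -> int:
    """Minimum GPUs needed just to hold the model weights.

    Assumes bf16 (2 bytes/param) by default; real jobs need headroom
    for activations, optimizer state, and framework overhead.
    """
    weights_gb = params_billions * bytes_per_param  # 1B params * 2B = 2 GB
    return math.ceil(weights_gb / vram_gb)

# A hypothetical 70B-parameter model (~140 GB of bf16 weights):
print(min_gpus_for_weights(70, 80))   # 2 x 80GB cards, sharded
print(min_gpus_for_weights(70, 192))  # 1 x 192GB card, no sharding
```

Cutting from two sharded cards to one is exactly the "fewer GPUs, less parallelism gymnastics" point above.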
GB200 — $6.50/hr
Best when you’re pushing frontier-scale and latency-sensitive workloads.
Use it for:
- Cutting-edge LLM training where you want every generation’s gain
- Ultra-high-throughput inference clusters serving many users
- Mission-critical experiments where both performance and stability matter
Why consider GB200 despite the premium:
- Higher-end option within the next-gen family; built for the heaviest workloads.
- Makes more sense when the cost of being slow (or not finishing) dwarfs the extra $1/hr versus B200.
Where on-demand pricing fits into VESSL’s reliability tiers
On VESSL Cloud, these hourly rates sit inside the broader reliability model:
- Spot (preemptible, discounted)
  - Not yet available for these SKUs in the current table (listed as “Coming Soon”).
  - Best for non-critical, restartable experiments once Spot launches.
- On-Demand (reliable with failover)
  - The prices above are On-Demand rates.
  - Best for production workloads where you want high availability and automatic failover across providers/regions.
- Reserved (capacity guarantee, discounts)
  - Pricing is “Contact Sales,” with up to 40% discounts for commitments.
  - Best for mission-critical and long-running programs where capacity guarantees matter more than flexibility.
In practice:
- Use On-Demand A100/H100 when you’re iterating fast and need dependable capacity.
- Move to Reserved B200/GB200 when your team has a stable, heavy workload and you can justify a term commitment for the discount and guaranteed capacity.
How to estimate your GPU budget with these rates
To turn these hourly rates into a realistic budget:
- Pick the GPU tier aligned to your bottleneck
  - Constrained by VRAM? Start at B200/GB200.
  - Constrained by throughput but okay on memory? Consider H100.
  - Constrained by budget, with moderate models? Use A100.
- Estimate GPU-hours
  - Example: 8× H100 cluster running for 10 hours → 8 GPUs × 10 hours × $2.39/hr = $191.20 for that run.
- Compare wall-time performance
  - If a job is 2× faster on H100 than A100, the higher hourly rate might still be cheaper in total spend.
- Consider stepping up to Reserved
  - If your monthly usage stabilizes, talk to sales for Reserved pricing to reduce effective hourly cost by up to 40% and lock in capacity.
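The steps above reduce to one line of arithmetic: GPUs × hours × rate, minus any Reserved discount. A minimal estimator using the on-demand rates quoted in this article (note that 40% is the "up to" Reserved figure, not a guaranteed discount):

```python
# On-demand $/GPU-hour, from the pricing table above.
RATES = {"A100": 1.55, "H100": 2.39, "B200": 5.50, "GB200": 6.50}

def run_cost(gpu: str, num_gpus: int, hours: float,
             reserved_discount: float = 0.0) -> float:
    """Total cost of a run: GPUs x hours x hourly rate, less any discount."""
    return num_gpus * hours * RATES[gpu] * (1 - reserved_discount)

print(run_cost("H100", 8, 10))                         # $191.20 on-demand
print(run_cost("H100", 8, 10, reserved_discount=0.4))  # $114.72 at 40% off
```

The same function also answers "what if we step up a tier?": `run_cost("B200", 4, 10)` versus `run_cost("H100", 8, 10)` makes the fewer-bigger-GPUs trade-off concrete.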
Summary: which GPU and price should you choose?
- Choose A100 SXM 80GB at $1.55/hr when:
  - You want stable, cost-efficient training and inference.
  - Your models fit in 80GB and you care about running a lot of experiments.
- Choose H100 SXM 80GB at $2.39/hr when:
  - You’re doing serious LLM post-training or dense transformer workloads.
  - Time-to-result and throughput matter more than the raw hourly rate.
- Choose B200 at $5.50/hr when:
  - VRAM is the constraint and you’re running massive models or multimodal stacks.
  - You want to simplify your parallelism strategy with 192GB of memory.
- Choose GB200 at $6.50/hr when:
  - You’re at frontier scale and every bit of performance and stability counts.
  - The cost of a slow or failed run is far higher than the GPU bill.
All of these GPUs can be provisioned via the VESSL Web Console or CLI (vessl run) in minutes, with On-Demand giving you automatic failover and real-time monitoring across providers and regions.