
VESSL AI reserved capacity: how do I request a 3-month+ commitment and estimate the discount?
Teams usually ask about Reserved capacity right at the pain point: you’ve finally found enough H100s, then a region hiccups or a quota change kills your run. Reserved on VESSL Cloud exists to remove that uncertainty: you lock in GPUs and discounts for 3+ months, and VESSL keeps that capacity alive behind a single control surface.
Below is how Reserved works, how to request a 3‑month (or longer) commitment, and how to roughly estimate your discount before you talk to sales.
Note: VESSL’s public pricing page lists base hourly rates and describes Reserved benefits. Exact discount levels depend on your GPU mix, term length, and volume, and are finalized with the sales team.
Reserved vs On-Demand vs Spot: when Reserved actually makes sense
Before you request a 3‑month+ Reserved commitment, sanity-check that you’re in the right tier:
- Spot
- Best for: large, non-urgent experiments; fault-tolerant batch jobs.
- Tradeoff: can be preempted, capacity not guaranteed.
- On-Demand
- Best for: production workloads that need reliability and automatic failover.
- Tradeoff: you pay list price. Capacity is highly available, but a large fleet run long-term at list price gets expensive.
- Reserved
- Best for: mission-critical AI where you must have GPUs and want predictable cost.
- You get:
- Capacity guarantee for agreed GPU classes (e.g., A100/H100/H200/B200/GB200/B300).
- Volume discounts off public hourly rates.
- Dedicated support, with SLAs and onboarding if needed.
- Terms starting at 3 months, scaling up from there.
If you expect to run the same or growing GPU footprint most days over 3+ months, Reserved is usually the right operating mode.
What “3‑month+ Reserved capacity” actually includes
When you sign a Reserved agreement on VESSL Cloud, you’re locking in:
- Guaranteed GPU capacity
- A specific quantity by SKU, e.g.:
- 16× H100 80GB
- 32× A100 80GB
- 4× B200 or GB200 for frontier training
- Capacity is protected for your workloads for the duration of the term.
- Discounted hourly pricing
- Volume discounts compared to public On-Demand rates.
- The discount percentage scales with:
- Term length (3 vs 6 vs 12+ months).
- Average concurrent GPU count.
- SKU mix (A100 vs H100 vs next-gen parts).
- Dedicated support
- Priority routing to the support and infrastructure team.
- Help with cluster design, migration, and “job wrangling” reduction (so you can run more fire-and-forget jobs).
- Enterprise-ready posture
- SOC 2 Type II and ISO 27001 compliance.
- SLA language, procurement-friendly contracts, and options for on-premise or custom integrations if needed.
Step-by-step: how to request a 3‑month+ Reserved commitment
You don’t reserve capacity from the UI with a single toggle yet; a human needs to confirm availability, discount, and term.
Here’s the workflow that typically leads to a 3‑month or longer agreement:
1. Define your workload and GPU needs
Have a concrete profile ready. The more specific, the faster you get a good quote.
Capture:
- Workload type
- LLM post-training (SFT, RLHF, DPO, LoRA).
- Physical AI (embodied agents, robotics).
- AI for Science (simulation-heavy, long jobs).
- GPU SKUs and concurrency
- Which GPU models: A100 80GB, H100 80GB, H200, B200, GB200, etc.
- Typical concurrent GPU count (e.g., “around 32 H100s in parallel”).
- Peak needs (e.g., “we spike to 64 H100s for 2–3 days each week”).
- Runtime pattern
- How many hours per day or month you expect the cluster to be busy.
- Expected growth (e.g., “we expect a 2× increase in 3–6 months”).
This is the input VESSL’s team uses to size and price a Reserved block.
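One way to keep this profile consistent across emails and calls is to write it down as a small structured record. The sketch below is only an illustration: the class and field names (`WorkloadProfile`, `GpuRequirement`, and so on) are hypothetical and are not part of any VESSL API or request form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GpuRequirement:
    """One GPU SKU line in the request (hypothetical structure)."""
    sku: str                 # e.g. "H100 80GB"
    typical_concurrent: int  # GPUs usually running in parallel
    peak_concurrent: int     # GPUs needed during spikes


@dataclass
class WorkloadProfile:
    """Workload summary to share with sales (field names are illustrative)."""
    workload_type: str     # e.g. "LLM post-training (SFT)"
    hours_per_day: float   # expected busy hours per day
    expected_growth: str   # free-text growth note
    gpus: List[GpuRequirement] = field(default_factory=list)

    def summary(self) -> str:
        """Render the profile as plain text for an email or form field."""
        lines = [
            f"Workload: {self.workload_type}",
            f"Runtime: ~{self.hours_per_day:g} h/day",
            f"Growth: {self.expected_growth}",
        ]
        for g in self.gpus:
            lines.append(
                f"GPU: {g.sku}, typical {g.typical_concurrent}, peak {g.peak_concurrent}"
            )
        return "\n".join(lines)


# Example values taken from the bullets above.
profile = WorkloadProfile(
    workload_type="LLM post-training (SFT)",
    hours_per_day=16,
    expected_growth="2x increase in 3-6 months",
    gpus=[GpuRequirement("H100 80GB", typical_concurrent=32, peak_concurrent=64)],
)
print(profile.summary())
```

The text summary is what you would paste into the sales request; the structured form simply makes it harder to forget the peak count or the growth note.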
2. Check current On-Demand pricing as your baseline
Head to the VESSL Cloud pricing page and:
- Identify the base hourly rate per GPU for your target SKUs.
- Note any differences across regions/providers you care about.
- Multiply by your estimated monthly GPU-hours (details in the next section) to form a baseline cost.
This baseline is what your Reserved discount will apply to.
3. Contact VESSL for Reserved capacity
To formally request a 3‑month+ Reserved commitment:
- Use one of these entry points:
- “Talk to Sales” button on vessl.ai.
- “Contact Sales” entry on the pricing page, usually under the Reserved tier.
- Provide:
- Your workload summary (from Step 1).
- Desired term length (3, 6, 12, or more months).
- Any SLA or compliance needs (uptime targets, data handling requirements).
- Preferred start date for capacity.
From here, the VESSL team will:
- Confirm that the requested GPU SKUs and quantities can be guaranteed.
- Propose:
- A discount band (e.g., “up to 25–40% depending on exact volume and term”).
- Options for multi-region coverage or automatic failover layers if needed.
4. Align on SLA, failover, and multi-cloud setup
Reserved isn’t just a discount; it’s also an operational commitment.
In the scoping call or email thread, you’ll usually nail down:
- Failover strategy
- How automatic failover interacts with your Reserved block.
- Whether you want backup capacity in another provider/region for resilience.
- Multi-Cluster design
- If you operate across multiple regions, how your GPUs appear as a unified view.
- How jobs should be scheduled for locality vs resilience.
- Support expectations
- Escalation paths and response targets.
- Onboarding help (e.g., migrating existing pipelines to `vessl run` and the Web Console).
Once both sides align, you’ll receive a proposal and contract that specify:
- SKUs, quantities, regions.
- Term length and start date.
- Discounted hourly rates.
- SLA language and support scope.
How to estimate your Reserved discount before talking to sales
VESSL’s documentation notes that Reserved plans include volume discounts and terms start at 3 months. The exact discount is custom, but you can approximate whether Reserved is worth pursuing using a simple framework.
1. Calculate your monthly GPU-hours
Start from workload reality:
Monthly GPU-hours = (Concurrent GPUs) × (Avg hours per day) × (Days per month)
Example:
- 32× H100 80GB
- 16 hours per day of active training
- 30 days per month
32 × 16 × 30 = 15,360 GPU-hours/month
If you expect growth, calculate both current and anticipated footprints (e.g., 32 → 64 GPUs in 3 months).
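The GPU-hours formula is easy to script so you can compare the current and anticipated footprints side by side. This is a minimal sketch; the function name and the 30-day default are assumptions made for illustration.

```python
def monthly_gpu_hours(concurrent_gpus: int, hours_per_day: float,
                      days_per_month: int = 30) -> float:
    """Monthly GPU-hours = concurrent GPUs x avg hours per day x days per month."""
    return concurrent_gpus * hours_per_day * days_per_month


# Current footprint from the example: 32 GPUs, 16 h/day, 30 days.
current = monthly_gpu_hours(32, 16)

# Anticipated footprint if the fleet doubles to 64 GPUs.
anticipated = monthly_gpu_hours(64, 16)

print(f"current:     {current:,.0f} GPU-hours/month")
print(f"anticipated: {anticipated:,.0f} GPU-hours/month")
```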
2. Use public On-Demand rates as your baseline
From the VESSL Cloud pricing page:
- Suppose On-Demand H100 80GB is listed at $X/hour (placeholder; check the actual page).
- Your monthly On-Demand estimate:
Monthly On-Demand cost ≈ Monthly GPU-hours × $X
Using the 15,360 GPU-hours from the example:
≈ 15,360 × $X
This is your “no Reserved” scenario.
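As a sketch of the baseline calculation, the snippet below multiplies GPU-hours by an hourly rate. The $4.00 figure is a placeholder for $X, not a quoted VESSL price; substitute the rate shown on the pricing page for your SKU and region.

```python
def monthly_on_demand_cost(gpu_hours: float, hourly_rate: float) -> float:
    """Baseline monthly cost with no Reserved discount applied."""
    return gpu_hours * hourly_rate


GPU_HOURS = 15_360            # from the 32 x 16 x 30 example
PLACEHOLDER_RATE_USD = 4.00   # stand-in for $X; NOT an actual list price

baseline = monthly_on_demand_cost(GPU_HOURS, PLACEHOLDER_RATE_USD)
print(f"On-Demand baseline: ${baseline:,.2f}/month")
```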
3. Apply a conservative discount band
VESSL’s Reserved documentation mentions volume discounts and terms starting at 3 months, with public messaging frequently referencing discounts up to around 40% for larger/longer commitments.
To estimate:
- For a minimal 3‑month commitment and moderate volume:
- Model a 10–20% discount as conservative.
- For higher volume (e.g., 50–100+ GPUs) or 6–12+ month terms:
- Model 20–40% as a planning range, then refine with sales.
Example ranges:
Conservative 3‑month scenario:
Effective rate ≈ 0.8–0.9 × On-Demand
Aggressive 12‑month, high-volume scenario:
Effective rate ≈ 0.6–0.8 × On-Demand
So if On-Demand H100 = $4/hour (example only):
- Conservative 20% discount:
- Reserved rate ≈ $3.20/hour
- Higher 40% discount:
- Reserved rate ≈ $2.40/hour
You can plug these into your GPU-hour estimate to get a band of potential savings.
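To turn the discount band into a savings band, something like the following works. It reuses the example-only $4/hour rate and the 15,360 GPU-hour month; the discount percentages are planning assumptions, not quoted terms, and the function names are illustrative.

```python
def reserved_rate(on_demand_rate: float, discount: float) -> float:
    """Effective hourly rate after a fractional discount (0.20 means 20%)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return on_demand_rate * (1 - discount)


def monthly_savings(gpu_hours: float, on_demand_rate: float, discount: float) -> float:
    """Monthly savings versus paying On-Demand for the same GPU-hours."""
    return gpu_hours * (on_demand_rate - reserved_rate(on_demand_rate, discount))


GPU_HOURS = 15_360       # 32 GPUs x 16 h/day x 30 days
ON_DEMAND_USD = 4.00     # example only, not a list price
TERM_MONTHS = 3          # minimum Reserved term

for label, discount in [("conservative", 0.20), ("higher", 0.40)]:
    rate = reserved_rate(ON_DEMAND_USD, discount)
    saved = monthly_savings(GPU_HOURS, ON_DEMAND_USD, discount)
    print(f"{label}: ${rate:.2f}/hour, "
          f"~${saved:,.0f}/month, ~${saved * TERM_MONTHS:,.0f} over the term")
```

Under these assumptions the band is roughly $12,288 to $24,576 per month. The real figure depends on the rate and discount you negotiate.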
4. Compare against operational risk and engineering time
Pure price is only half the story. Reserved also buys you:
- Capacity guarantees
- You don’t waste time chasing quota upgrades or hopping providers.
- Reduced “job wrangling”
- Less firefighting when capacity disappears mid-run.
- More fire-and-forget jobs with confidence they’ll stay scheduled.
- Multi-cloud resilience
- Combine guaranteed capacity with automatic failover and Multi-Cluster for provider/region resilience.
When your training runs are long and expensive, reducing retries and outages alone can justify a Reserved commitment—even before discounts.
Common questions about 3‑month+ Reserved capacity
Can I mix multiple GPU types in a single Reserved agreement?
Yes. In practice, most serious workloads mix SKUs:
- E.g., H100 or B200 for core training, A100/H200 for secondary jobs, cheaper GPUs for preprocessing or distillation.
- The VESSL team can structure your Reserved plan with a SKU basket, each with its own rate and guarantee.
What if I exceed my Reserved capacity?
When you burst above your Reserved baseline:
- You typically consume additional GPUs at standard On-Demand rates, still within the same unified control plane.
- You can later renegotiate your baseline if spikes become your new normal.
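A rough way to see how bursting affects your bill is to price the baseline and the overflow separately. The sketch below assumes Reserved GPU-hours are billed at the discounted rate and burst GPU-hours at the On-Demand rate; how a Reserved block is actually billed is set in your contract, and the rates here are the same example-only figures used earlier.

```python
def blended_monthly_cost(reserved_gpu_hours: float, burst_gpu_hours: float,
                         reserved_rate: float, on_demand_rate: float) -> float:
    """Reserved baseline at the discounted rate plus overflow at On-Demand."""
    return reserved_gpu_hours * reserved_rate + burst_gpu_hours * on_demand_rate


# Baseline: 32 GPUs x 16 h/day x 30 days at an assumed $3.20/hour.
reserved_hours = 32 * 16 * 30

# Burst: 32 extra GPUs, 16 h/day, about 12 spike days per month
# (roughly 3 days a week), at an assumed $4.00/hour On-Demand.
burst_hours = 32 * 16 * 12

total = blended_monthly_cost(reserved_hours, burst_hours, 3.20, 4.00)
burst_share = burst_hours / (reserved_hours + burst_hours)
effective_rate = total / (reserved_hours + burst_hours)

print(f"total: ${total:,.0f}/month")
print(f"burst share of GPU-hours: {burst_share:.0%}")
print(f"effective blended rate: ${effective_rate:.2f}/hour")
```

If the burst share stays high month after month, that is a signal to raise the Reserved baseline at renewal.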
What happens after the initial 3‑month term?
As you approach the end of the initial term, you can:
- Renew with the same or larger capacity, often at improved discounts if volume increases.
- Adjust GPU mix and counts to match your evolved workloads.
- Roll back to purely On-Demand if your needs become more sporadic.
Do Reserved discounts apply to Spot?
Reserved is about guaranteed capacity and predictable pricing, so it is paired with On-Demand–style reliability, not Spot’s preemptible model. You can still use Spot for overflow or experimental work, but Reserved discounts are negotiated for your guaranteed capacity footprint.
How Reserved works alongside Auto Failover and Multi-Cluster
One of VESSL Cloud’s core value props is turning fragmented, multi-cloud GPU pools into one control surface with automatic failover. Reserved capacity plugs into that:
- Auto Failover
- If a provider or region experiences issues, your workloads can be automatically rescheduled to healthy capacity within your guaranteed pool and configured backup options.
- Multi-Cluster
- You see a unified view across regions, even when your Reserved GPUs are physically distributed.
- You can keep data close to compute using Cluster Storage and Object Storage, while still falling back to other regions when needed.
For mission-critical workloads (continuous LLM serving, production post-training loops, or government/enterprise deployments), this model ensures your Reserved GPUs aren’t a single point of failure tied to one region.
When a 3‑month+ Reserved commitment is clearly worth it
You should strongly consider Reserved if:
- You’re running steady, high-utilization training:
- 16+ GPUs most days, for models that take weeks to train.
- Your workloads are production or mission-critical:
- Downtime means missed launch dates, SLA penalties, or lab schedule disruptions.
- You’re hitting quota issues, waitlists, or flaky capacity in single clouds.
- You want budget predictability:
- Finance wants a clear cost per GPU-hour and no surprises.
In that world, a 3‑month+ Reserved block on VESSL Cloud lets you:
- Lock in the GPUs.
- Lock in the discount.
- Rely on multi-cloud failover and unified monitoring instead of babysitting each provider.
Next step
If you already know roughly how many A100/H100/H200/B200/GB200/B300 GPUs you need and for how long, your next move is to validate the numbers with the team.
Use “Get Started” to explore On-Demand pricing, then use “Talk to Sales” to request a 3‑month+ Reserved capacity proposal, share your GPU-hour estimates, and get a concrete discount range for your workloads.