SambaNova SambaRack SN50: how do I request a quote and what facilities info (power/cooling) do you need?
AI Inference Acceleration


SambaRack SN50 is built for fast, large-scale agentic inference, so the buying process usually starts with two things in parallel: a technical scoping conversation and a facilities check to confirm your data center can support the required power and cooling. You can request a quote in a few clicks, but having basic rack, power, and cooling information ready will dramatically speed up sizing and pricing.

Quick Answer: To request a SambaRack SN50 quote, contact SambaNova through the “Get Started” / contact form and indicate you’re interested in SambaRack SN50 for inference workloads. A solutions team will follow up to validate your use case and facility readiness—expect to provide data on rack space, power delivery (circuits/voltage), and cooling capacity so they can right-size the configuration and deployment plan.

The Quick Overview

  • What It Is: SambaRack SN50 is SambaNova’s fifth-generation, rack-level inference system built around SN50 RDUs and SambaStack, optimized for fast, efficient agentic inference on the largest models.
  • Who It Is For: Infrastructure and platform teams running production LLM and agent workflows—especially where power, cooling, and cost per token are first-order constraints.
  • Core Problem Solved: It eliminates the “one model per node” bottleneck by allowing multiple frontier-scale models and agentic workflows to run efficiently on a single, integrated inference stack, while staying within data center power and cooling envelopes.

How It Works

From a buyer’s perspective, the SambaRack SN50 process has three stages: contact and scoping, technical and facilities qualification, then a tailored quote and deployment plan. Under the hood, the system itself combines SN50 RDUs, SambaStack, and SambaOrchestrator in a rack-ready form factor that’s designed for high-throughput, low-power inference.

  1. Initial Contact & Use-Case Scoping:
    You reach out via the SambaNova contact flow, specify interest in SambaRack SN50, and outline your workloads (models, concurrency, latency targets, data residency). SambaNova maps this to an initial configuration concept—how many racks, which models, and what kind of agentic workflows you’re targeting.

  2. Facilities & Architecture Qualification:
    A solutions architect works with your infra team to validate power, cooling, and rack constraints in your data center. In parallel, they refine the architecture: which models to host (e.g., gpt-oss-120b, DeepSeek-R1, Llama), what degree of model bundling you need, and how SambaOrchestrator will integrate with your existing monitoring and autoscaling.

  3. Quote, Sizing & Deployment Plan:
    With workload and facilities constraints clear, SambaNova prepares a quote for SambaRack SN50 systems plus the associated software stack (SambaStack, SambaOrchestrator) and support. The proposal includes configuration details, expected performance (tokens/sec), and deployment guidance aligned to your power and cooling limits.

Features & Benefits Breakdown

  • SN50 RDUs with Three-Tier Memory:
    What it does: Uses custom dataflow processing and a tiered memory architecture to keep models and prompts “hot” with minimal data movement.
    Primary benefit: High tokens-per-watt and low latency on frontier-scale models, even for complex agent loops.
  • Model Bundling on SambaStack:
    What it does: Hosts and switches between multiple large models (e.g., gpt-oss-120b, DeepSeek) on a single node.
    Primary benefit: Eliminates “one-model-per-node” workflows and reduces infrastructure sprawl for multi-model agents.
  • Rack-Ready, Data Center–Optimized Design:
    What it does: Ships as a fully integrated SambaRack SN50 system that drops into standards-compliant racks.
    Primary benefit: Faster time-to-production with known power/cooling profiles and a system tuned for inference.

Ideal Use Cases

  • Best for agentic inference at frontier scale: SambaRack SN50 is optimized for fast agentic inference on the largest models, such as gpt-oss-120b and DeepSeek, at a fraction of the cost, with independent benchmarks reporting throughput of up to 200 tokens/sec on DeepSeek-R1.
  • Best for multi-model production serving: Because SambaStack and SN50 RDUs support model bundling and efficient switching between models, you can consolidate what would have been many one-model GPU nodes into fewer, higher-utilization inference racks.

How to Request a SambaRack SN50 Quote

You don’t need a fully finalized architecture to start the quote process, but it helps to come in with your workloads and basic facilities data. Here’s the streamlined path:

  1. Go to the SambaNova contact page
    Use the “Get Started” link on the SambaNova website.

  2. Specify SambaRack SN50 and your workload
    In the form:

    • Indicate you’re interested in SambaRack SN50 for on-prem or co-lo deployment.
    • Briefly describe your target workloads:
      • Models (e.g., Llama, DeepSeek-R1, gpt-oss-120b, internal checkpoints).
      • Expected concurrency and QPS.
      • Latency or SLA targets (e.g., agentic loops under X seconds).
      • Data residency / sovereign AI constraints, if any.
  3. Provide basic facilities context
    You don’t need exact CFD modeling, but your infra team should be able to share:

    • Whether this is an enterprise DC, co-lo, or sovereign facility.
    • Available rack positions and power per rack.
    • Cooling type (hot/cold aisle, liquid vs. air).
  4. Engage in a scoping call
    SambaNova will schedule a technical call to:

    • Validate workloads and model mix.
    • Discuss on-prem vs. hosted or hybrid (e.g., combining SambaRack on-site with SambaCloud).
    • Walk through power, cooling, and network integration.
  5. Receive a tailored quote and deployment plan
    Based on your input, you’ll receive:

    • A proposed number of SambaRack SN50 systems and associated licensing/support.
    • Estimated performance for your workloads.
    • Facilities requirements mapped to your environment (circuits, BTU/hr or kW cooling per rack, etc.).
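When filling in latency targets (step 2), a rough feasibility check helps: a single model call's wall-clock time is approximately time-to-first-token plus output tokens divided by decode throughput, and an agentic loop multiplies that by the number of calls. A minimal sketch; the throughput and timing figures below are hypothetical planning placeholders, not SN50 specifications:

```python
def response_time_s(output_tokens: int, decode_tps: float, ttft_s: float = 0.5) -> float:
    """Rough single-call latency: time-to-first-token plus decode time.

    decode_tps and ttft_s are hypothetical planning inputs, not measured
    SN50 figures; substitute numbers from your own benchmarks.
    """
    return ttft_s + output_tokens / decode_tps

# An agent loop making 3 model calls of ~400 output tokens each,
# assuming 200 tokens/sec decode and 0.5 s time-to-first-token per call:
loop_latency = 3 * response_time_s(400, 200.0)
print(f"{loop_latency:.1f} s")  # 7.5 s
```

A back-of-envelope number like this is enough to tell you whether an "agentic loops under X seconds" SLA is even plausible before the scoping call.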

Facilities Information SambaNova Typically Needs (Power & Cooling)

The SN50 line is built for fast agentic inference on large models, which means it’s dense and high performance. SambaNova will align its recommendations to your reality, but this is the type of information they will ask for to finalize a quote and deployment plan.

1. Rack & Space Details

  • Available rack units (U):
    How many contiguous U you can dedicate per SambaRack SN50.
  • Rack depth and weight limits:
    Confirmation that your racks and floors support the physical dimensions and weight profile of a fully populated inference system.
  • Rack count and growth plan:
    Whether you are planning for a single rack, a small cluster, or a multi-rack footprint, and what 12–24 month expansion could look like.

2. Power Requirements

SambaNova systems are optimized for inference efficiency, but you still need enough dedicated power to feed frontier-scale models at high utilization.

Expect to discuss:

  • Power per rack (kW):
    • Maximum deliverable kW per rack in your DC.
    • Typical/average draw assumptions you’re comfortable operating at.
  • Circuit configuration:
    • Number of circuits per rack and their capacity (e.g., 2× 30A or 60A feeds).
    • Redundancy model (A/B feeds, UPS-backed circuits, etc.).
  • Voltage and phase:
    • Standard data center voltages supported (e.g., 208V or 230/240V, single or three-phase).
    • Any regional constraints or special PDU requirements.
  • Power budget policy:
    • Whether you size off nameplate or typical draw.
    • Any hard caps you must stay under per rack or per row.
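The circuit-level inputs above determine a usable power budget. A common rule of thumb in North American facilities is the NEC continuous-load derating, which caps sustained draw at 80% of breaker rating. A small sketch of the arithmetic, using illustrative feed values rather than SN50 requirements:

```python
import math

def usable_kw(volts: float, amps: float, three_phase: bool = False,
              n_circuits: int = 1, derate: float = 0.8) -> float:
    """Usable continuous kW for a set of identical power feeds.

    derate=0.8 reflects the common NEC continuous-load rule; adjust it
    to match your facility's own power budget policy.
    """
    phase_factor = math.sqrt(3) if three_phase else 1.0
    watts = phase_factor * volts * amps
    return n_circuits * derate * watts / 1000

# Example: 2x 30A three-phase 208V feeds per rack (illustrative values)
print(round(usable_kw(208, 30, three_phase=True, n_circuits=2), 1))  # 17.3
```

Running your actual circuit data through a check like this tells you the sustained kW per rack you can honestly commit to, which is exactly the number the solutions architect needs.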

For reference and comparison when scoping:

  • SambaRack SN40L-16, the prior-generation system, is optimized for low-power inference, averaging roughly 10 kW while running many models simultaneously.
  • SambaRack SN50 is tuned for fast agentic inference on the largest models at a fraction of the cost, prioritizing performance and tokens-per-watt at frontier scale rather than minimal power alone. SambaNova will translate that into concrete kW guidance once it knows your configuration and utilization targets.

3. Cooling Requirements

Cooling is where dense, high-throughput inference can run into trouble if it isn’t planned up front.

Be ready to provide:

  • Cooling capacity per rack (kW or BTU/hr):
    What your facility can sustain for continuous loads, not just peaks.
  • Cooling topology:
    • Hot/cold aisle containment, in-row cooling, or overhead/underfloor distribution.
    • Air-only vs. any liquid-assisted or rear-door heat exchangers.
  • Temperature and humidity ranges:
    Data center environmental envelopes and any constraints on inlet temperature.
  • Density policies:
    Maximum target kW/rack your operations team is comfortable with and whether high-density zones exist for AI workloads.
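Since inference hardware converts essentially all of its input power to heat, the cooling load per rack tracks the sustained power draw, and 1 kW of IT load is roughly 3,412 BTU/hr of heat. A quick conversion sketch for sanity-checking the per-rack numbers above; the example loads are placeholders, not SN50 figures:

```python
BTU_PER_KW_HR = 3412.14  # 1 kW of IT load rejects ~3,412 BTU/hr of heat

def heat_btu_per_hr(it_load_kw: float) -> float:
    """Heat a rack rejects at a given sustained IT load, in BTU/hr."""
    return it_load_kw * BTU_PER_KW_HR

def cooling_ok(it_load_kw: float, rack_cooling_kw: float) -> bool:
    """True if sustained IT load fits within the rack's cooling capacity."""
    return it_load_kw <= rack_cooling_kw

# Example: 15 kW sustained draw in a rack rated for 20 kW of cooling
print(round(heat_btu_per_hr(15)))  # 51182
print(cooling_ok(15, 20))          # True
```

Whether your facility quotes cooling in kW or BTU/hr, the check is the same: sustained load, not peak, must fit under the rack's continuous cooling capacity.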

SambaNova will map expected thermal output of the proposed SambaRack SN50 configuration into your environment, and may recommend:

  • Concentrating SN50 into designated high-density rows.
  • Adjusting containment or adding in-row capacity.
  • Staggering deployment phases to validate thermal behavior under load.

4. Network & Integration Considerations

While not directly power/cooling, you’ll typically be asked for:

  • Available network fabric:
    Bandwidth and topology between SambaRack SN50, your app tiers, and storage.
  • Latency expectations:
    Round-trip latency budgets for inference calls from your services to the rack.
  • Security zones:
    Whether SambaRack lives in a separate security segment, and how that affects control-plane access (for SambaOrchestrator) and metrics export.

Limitations & Considerations

  • Facility constraints may cap configuration density:
    If your data center has tight kW-per-rack or cooling limits, SambaNova may propose fewer SN50 systems per rack, more racks, or a hybrid approach with SambaCloud. Early, detailed facilities data helps avoid redesign later.
  • Model and workload clarity drives accurate sizing:
    Vague “we might run some LLMs” requirements make it harder to quote precisely. Providing concrete models, context lengths, and concurrency targets yields a tighter performance and cost estimate.

Pricing & Plans

SambaRack SN50 pricing is tailored to your configuration, workloads, and deployment model. It is not a one-size SKU; cost reflects:

  • Number of racks and SN50 RDUs.
  • Software stack and orchestration (SambaStack, SambaOrchestrator).
  • Support level and any integration or migration assistance.
  • Whether you pair on-prem SambaRack with SambaCloud usage for burst or additional multi-region coverage.

To initiate pricing:

  • Use the Get Started path and request SambaRack SN50 for on-prem / co-lo deployment.
  • Expect SambaNova to align the proposal to your tokens-per-second, tokens-per-watt, and TCO targets, often contrasting against GPU-based baselines.

Examples:

  • Dedicated SambaRack SN50 deployment: Best for enterprises and sovereign AI teams needing full on-prem control, high throughput on frontier models, and tight integration with existing data center operations.
  • Hybrid SambaRack SN50 + SambaCloud: Best for teams needing on-prem capacity (compliance, data gravity) plus flexible cloud-based burst or additional region coverage, all via OpenAI-compatible APIs.

Frequently Asked Questions

How do I start the SambaRack SN50 quote process if I only know my models, not my exact power budget?

Short Answer: Start with your models and concurrency assumptions; SambaNova will help back-solve the power and cooling plan, but you should still loop in your facilities team early.

Details:
If you know you want to run, for example, DeepSeek-R1 and gpt-oss-120b at specific QPS and latency targets, that’s enough to begin scoping. On the initial call, SambaNova will translate these into expected tokens/sec and utilization patterns, then estimate the density you’d need per rack. Your facilities team can then confirm whether your current power and cooling can support that density or whether you should adjust configuration (fewer systems per rack, more racks, or phased deployment). The outcome is a quote that fits both your workload and your physical constraints.
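The back-solving described above is simple arithmetic: required aggregate throughput is QPS times average output tokens per request, and the system count is that divided by effective per-system throughput. A sketch with hypothetical planning numbers; the per-system throughput here is a placeholder, not an SN50 spec:

```python
import math

def systems_needed(qps: float, avg_output_tokens: float,
                   tokens_per_sec_per_system: float,
                   headroom: float = 0.7) -> int:
    """Systems needed to sustain a workload, with a utilization headroom.

    tokens_per_sec_per_system is a hypothetical figure; replace it with
    the throughput SambaNova quotes for your actual model mix.
    """
    required_tps = qps * avg_output_tokens
    effective_tps = tokens_per_sec_per_system * headroom
    return math.ceil(required_tps / effective_tps)

# Example: 50 QPS of ~400-token responses, assuming 5,000 tok/s per system
print(systems_needed(50, 400, 5000))  # 6
```

From a system count like this, your facilities team can immediately check whether the implied kW and cooling per rack fits the data center, or whether the deployment should spread across more racks.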

What if my data center can’t support very high power density per rack?

Short Answer: SambaNova can design a lower-density SambaRack SN50 footprint, use more racks, or combine on-prem with SambaCloud to respect your limits.

Details:
Not every facility is built for high-density AI clusters. If your per-rack kW ceiling is modest, SambaNova can:

  • Spread SN50 systems across more racks at lower per-rack draw.
  • Use configuration options that align with your power policies.
  • Recommend a hybrid architecture where core, latency-sensitive inference runs on-prem on SambaRack SN50, while overflow or non-critical workloads run on SambaCloud.

This still gives you the benefits of the SN50 architecture—dataflow processing, three-tier memory, model bundling—without overloading your power and cooling envelope.

Summary

Requesting a SambaRack SN50 quote is straightforward: contact SambaNova, state your interest in SN50, and share your workloads plus basic rack, power, and cooling information. From there, SambaNova’s team will design a configuration that leverages SN50 RDUs, SambaStack, and SambaOrchestrator to deliver high-throughput, low-power agentic inference on frontier-scale models—while staying within your data center’s physical constraints. The more precise your facilities and workload data, the more accurate and actionable your quote and deployment plan will be.

Next Step

Get Started