How do I contact together.ai sales or schedule a call for enterprise pricing, dedicated capacity planning, and procurement?



Most teams reach out to together.ai sales when they’re ready to move from experimentation to production—usually because they need enterprise pricing, guaranteed capacity, or help sizing Dedicated Inference and GPU Clusters. You can do all of that through the Together “Contact Sales” flow in a few minutes.

Quick Answer: To contact together.ai sales or schedule a call, submit the Contact Sales form on the Together website. From there, the sales and solutions engineering team will follow up to discuss enterprise pricing, dedicated capacity planning, procurement, and any custom models or infrastructure you need.


The Quick Overview

  • What It Is: A direct line to together.ai’s sales and solutions experts to design an enterprise plan, price out dedicated capacity, and align procurement with your AI roadmap.
  • Who It Is For: Organizations moving beyond casual experimentation—teams with production SLOs, security requirements, or large/steady workloads across inference, fine-tuning, and GPU clusters.
  • Core Problem Solved: You get clear pricing and architecture guidance for Dedicated Model Inference, Dedicated Container Inference, Serverless Inference, Batch Inference, GPU Clusters, and Model Shaping, without guessing at capacity or cost.

How It Works

At a high level, contacting together.ai sales is a structured intake process designed to map your workloads to the right AI Native Cloud configuration—serverless vs dedicated vs clusters—and then align that to enterprise pricing and procurement.

  1. Submit the Contact Sales Form:

    • Go to: https://www.together.ai/contact-sales
    • Provide basic details:
      • First name, last name
      • Company email
      • Company location (North America, EMEA, APAC, LATAM)
      • Which products you’re interested in (Inference, Fine-Tuning, GPU Clusters, Code Sandbox, or Other)
      • Stage of your AI project (Just experimenting → Already in production)
      • Any extra implementation details or requirements
    • This routes your request to the right sales and solutions engineering pod.
  2. Discovery & Capacity Planning Call:

    • A Together expert will contact you (typically by email first) to:
      • Clarify workloads: models, modalities (text, image, video, code, voice), context lengths, and latency targets
      • Choose deployment modes: Serverless Inference vs Dedicated Model Inference vs Dedicated Container Inference vs GPU Clusters
      • Size capacity: number of GPUs, batch sizes, traffic patterns (steady vs bursty), batch vs real-time
    • This is where they’ll walk through:
      • Kernel/runtime optimizations (Together Kernel Collection, ATLAS, CPD)
      • How to hit your time-to-first-token and tokens/sec SLOs
      • Data and model ownership, isolation, and compliance (SOC 2 Type II, encryption in transit/at rest, tenant-level isolation)
  3. Custom Plan, Pricing, and Procurement Alignment:

    • Based on the discovery, you’ll receive:
      • An enterprise quote with SKU-level pricing for:
        • Serverless and Batch Inference
        • Dedicated Model or Container Inference
        • GPU Clusters (self-serve vs 1,000+ GPU bespoke infrastructure)
        • Fine-tuning / Model Shaping services
      • Recommendations for:
        • Region/availability zone (25+ cities across USA and Europe)
        • Commit levels vs no-commit options
        • How to split workloads (e.g., steady traffic on dedicated endpoints; bursty workloads on serverless)
    • From there, your procurement team can work through:
      • MSAs and security review
      • PO / invoicing setup
      • Trial or pilot phases if needed
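The capacity-sizing step in the discovery call largely comes down to simple arithmetic: target throughput, per-GPU throughput, and headroom. The sketch below is a back-of-envelope illustration only; the per-GPU throughput and headroom numbers are assumptions, not Together benchmarks, and real sizing should come from the solutions engineering team.

```python
import math

def gpus_needed(peak_tokens_per_sec: float,
                per_gpu_tokens_per_sec: float,
                headroom: float = 0.3) -> int:
    """Rough GPU count for a dedicated endpoint to meet a throughput SLO.

    peak_tokens_per_sec: your measured or projected peak output rate.
    per_gpu_tokens_per_sec: throughput of one GPU serving your chosen
        model (an assumption here; get real figures during discovery).
    headroom: spare-capacity fraction for bursts and failover.
    """
    required = peak_tokens_per_sec * (1 + headroom)
    return math.ceil(required / per_gpu_tokens_per_sec)

# Example: 50k tokens/sec at peak, ~4k tokens/sec per GPU (illustrative)
print(gpus_needed(50_000, 4_000))  # → 17
```

Bringing even a rough estimate like this to the call makes the pricing conversation much faster.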

Features & Benefits Breakdown

| Core Feature | What It Does | Primary Benefit |
| --- | --- | --- |
| Contact Sales Intake Form | Captures your company details, products of interest, and project stage. | Routes you to the right experts quickly with minimal back-and-forth. |
| Enterprise Architecture Session | Maps your workloads to Serverless, Dedicated Inference, or GPU Clusters. | Ensures you get the best price-performance for your specific traffic. |
| Custom Pricing & Capacity Plan | Produces a tailored quote for inference, fine-tuning, and clusters. | Gives predictable costs, clear SLOs, and procurement-ready numbers. |

Ideal Use Cases

  • Best for enterprise pricing and volume commitments: Because you can align long-term usage—like steady production inference or large GPU Cluster reservations—with discounted pricing and clear SLOs.
  • Best for dedicated capacity planning and custom infrastructure: Because the Together team can design bespoke deployments (e.g., 1,000+ GPUs in specific regions, dedicated endpoints for latency-sensitive workloads, or generative media pipelines via Dedicated Container Inference).
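The steady-vs-bursty split described above can be reduced to a simple heuristic: sustained traffic justifies a dedicated endpoint, while spiky traffic is usually cheaper on serverless. The threshold below is a toy assumption for illustration, not a Together guideline.

```python
def pick_deployment(avg_rps: float, peak_rps: float) -> str:
    """Toy routing heuristic: compare peak to average traffic.

    A burstiness ratio under 2x (an assumed threshold) suggests steady
    traffic suited to a dedicated endpoint; above it, serverless
    absorbs the spikes without paying for idle capacity.
    """
    burstiness = peak_rps / max(avg_rps, 1e-9)
    return "dedicated" if burstiness < 2.0 else "serverless"

print(pick_deployment(avg_rps=80, peak_rps=100))  # steady → dedicated
print(pick_deployment(avg_rps=5, peak_rps=120))   # spiky → serverless
```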

Limitations & Considerations

  • Not a support channel for existing accounts:
    If you already have Together products or subscriptions and need troubleshooting or billing help, use Contact Support instead of the sales form. The Contact Sales route is optimized for new projects, expansions, and enterprise planning.

  • Best for multi-stakeholder or production-grade projects:
    For casual experimentation, you can often start directly with Together Sandbox, serverless endpoints, or self-serve GPU Clusters. Sales engagement adds the most value when you have defined workloads, SLOs, security needs, or procurement processes.


Pricing & Plans

together.ai does not publish a single, one-size-fits-all “enterprise price” because cost depends on:

  • Deployment mode (Serverless Inference vs Batch vs Dedicated Model Inference vs Dedicated Container Inference vs GPU Clusters)
  • Models (open-source vs partner models, context length, modalities)
  • Traffic patterns (steady vs bursty, real-time vs batch)
  • Scale (from a few million tokens/day up to 30+ billion tokens in batch, or 1,000+ GPUs)
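Because cost scales with those factors, a quick token-volume estimate is worth preparing before the call. The sketch below is purely illustrative arithmetic; the per-million-token price is an assumed placeholder, since actual Together pricing varies by model and is quoted through the Contact Sales flow.

```python
def monthly_token_cost(tokens_per_day: float,
                       usd_per_million_tokens: float,
                       days: int = 30) -> float:
    """Back-of-envelope monthly serverless spend.

    usd_per_million_tokens is an assumption for illustration only;
    get real, model-specific pricing from the sales team.
    """
    return tokens_per_day * days / 1_000_000 * usd_per_million_tokens

# Example: 100M tokens/day at an assumed $0.60 per 1M tokens
print(round(monthly_token_cost(100_000_000, 0.60), 2))  # → 1800.0
```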

The Contact Sales path is how you get:

  • An enterprise trial: To validate performance and economics with your own workloads.
  • Custom model or solution design: For fine-tuned models, generative media pipelines, or non-standard runtimes.
  • Factory-scale infrastructure: For 1,000+ GPU deployments in USA and Europe, aligned to your data location and compliance needs.

Examples of how plans tend to map:

  • Growth / Engineering-Led Plan: Best for teams needing OpenAI-compatible integration with a mix of serverless and a few dedicated endpoints, plus optional fine-tuning.
  • Enterprise / Platform Plan: Best for organizations needing strict SLOs, multi-region Dedicated Inference, large GPU Clusters, and tight integration with procurement and security (SOC 2 Type II, tenant-level isolation, data ownership requirements).

The exact naming and price points evolve, so the Contact Sales form is the canonical way to get current pricing.
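The "OpenAI-compatible integration" mentioned under the Growth plan means requests follow the standard chat-completions schema. The sketch below builds such a request body without making any network call; the model name is a hypothetical placeholder, and you should confirm the current base URL and model IDs in Together's API documentation or with your sales contact.

```python
import json

# Assumed endpoint for illustration; verify against Together's API docs.
BASE_URL = "https://api.together.xyz/v1/chat/completions"

def chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

# "example-org/example-model" is a placeholder, not a real model ID.
body = chat_request("example-org/example-model", "Hello!")
print(json.dumps(body, indent=2))
```

Because the schema matches OpenAI's, existing client code typically only needs a base-URL and API-key change to point at Together.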


Frequently Asked Questions

How do I actually contact together.ai sales today?

Short Answer: Fill out the Contact Sales form at https://www.together.ai/contact-sales with your company details, products of interest, and project stage.

Details:
On the Contact Sales page, you’ll be asked for:

  • First name, last name
  • Company email
  • Company location (North America, EMEA, APAC, LATAM)
  • Which products you’re interested in:
    • Inference
    • Fine-Tuning
    • GPU Clusters
    • Code Sandbox
    • Don’t Know / Other
  • Stage of your AI project:
    • Just experimenting
    • Building a prototype
    • In active development
    • Near launch
    • Already in production
  • Optional free-text details about your use case

Once submitted, the team will reach out to schedule a call or share next steps, which typically includes a tailored walkthrough and early architecture/pricing discussion.


Can I discuss dedicated capacity, region selection, and procurement in the same call?

Short Answer: Yes. The sales and solutions team can cover capacity sizing, regional deployment, pricing, and procurement flows in one engagement.

Details:
During the discovery and follow-up sessions, you can:

  • Plan dedicated capacity

    • Choose between:
      • Dedicated Model Inference for predictable or steady traffic and latency-sensitive applications.
      • Dedicated Container Inference for generative media, custom runtimes, or complex pipelines.
      • GPU Clusters for training, large-scale batch inference, or custom workloads.
    • Discuss target throughput, tokens/sec, and concurrency.
  • Select regions and availability zones

    • Together can deploy across 25+ cities, with:
      • 2GW+ in the USA (600MW near-term)
      • 150MW+ in Europe (UK, Spain, France, Portugal, Iceland, and others)
    • You can align deployments with data residency and latency requirements.
  • Align with procurement and security

    • Work through SOC 2 Type II attestation, data and model ownership language (“Your data and models remain fully under your ownership”), encryption in transit/at rest, and tenant-level isolation.
    • Coordinate MSAs, NDAs, and PO-based billing, including volume commitments where appropriate.

Summary

To contact together.ai sales for enterprise pricing, dedicated capacity planning, and procurement, the path is straightforward: submit the Contact Sales form on the Together website with your company details, product interests, and project stage. From there, the sales and solutions engineering team will help you:

  • Map workloads to the right deployment modes (Serverless Inference, Batch Inference, Dedicated Model or Container Inference, GPU Clusters).
  • Size and price dedicated capacity, from single endpoints to 1,000+ GPU factories.
  • Align with procurement, security, and compliance requirements while preserving data and model ownership.

This is the fastest way to go from “we’re experimenting” to “we have a clear, production-ready AI Native Cloud plan with predictable costs.”


Next Step

Ready to start the conversation? Submit the Contact Sales form at https://www.together.ai/contact-sales.