
How do we configure Coder autostop and resource quotas to control dev compute costs?
Most platform teams feel dev compute costs in their cloud bill long before finance does. With Coder, the two levers that matter most are autostop (idle shutdown) and resource quotas (hard limits on what any workspace can consume). Configure those correctly and you’ll cut waste without slowing developers down.
Quick Answer: Use Coder’s workspace autostop to shut down idle workspaces, and cap workspace size and count with per-template and per-user resource quotas defined in Terraform-based templates and organization policies. Together, these keep CPU, RAM, and GPU usage predictable and turn off unused environments automatically.
Frequently Asked Questions
How does Coder autostop help control dev compute costs?
Short Answer: Autostop shuts down idle Coder workspaces after a configured period, so you’re not paying for CPU, RAM, or GPUs when nobody is using them.
Expanded Explanation:
Coder workspaces run on your infrastructure (Kubernetes or VMs in AWS, Azure, GCP, or on‑prem). If you leave them running 24/7, your cloud bill will reflect that, especially for GPU-backed or large-CPU nodes. Autostop adds a governance layer: you define idle timeouts and stop behavior so workspaces automatically power down when a developer or AI agent isn’t actively using them.
You still keep workspace state and data inside your infrastructure; autostop stops only the compute, and anything the template keeps on persistent storage survives the stop. When a developer reconnects through VS Code Remote, JetBrains Gateway, a browser IDE, or an AI-first editor like Cursor or Windsurf, Coder starts the workspace again, with startup time depending on the template and the underlying infrastructure. The result is a “local machine” feel with cloud costs that track actual usage, not wishful thinking.
Key Takeaways:
- Autostop converts always-on workspaces into “pay for active usage” environments.
- You tune idle timers by template or policy to match real workflows (e.g., longer for GPU training, shorter for general dev).
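To make the “pay for active usage” point concrete, here is a minimal arithmetic sketch in Python. The hourly rate, the eight active hours, the one-hour idle timeout, and the 22 workdays are all illustrative assumptions, not Coder defaults or real cloud prices.

```python
HOURS_PER_MONTH = 730  # average hours in a calendar month


def monthly_cost(hourly_rate: float, running_hours: float) -> float:
    """Cost of one workspace that runs for `running_hours` in a month."""
    return round(hourly_rate * running_hours, 2)


def autostop_running_hours(active_hours_per_day: float,
                           idle_timeout_hours: float,
                           workdays: int = 22) -> float:
    """Billed hours per month when the workspace stops after each day's idle window.

    Assumes one working session per workday, followed by a single idle
    window before autostop fires.
    """
    return (active_hours_per_day + idle_timeout_hours) * workdays


# Hypothetical $0.40/hour workspace, left running all month.
always_on = monthly_cost(hourly_rate=0.40, running_hours=HOURS_PER_MONTH)

# Same workspace with 8 active hours per workday and a 1-hour idle timeout.
with_autostop = monthly_cost(
    hourly_rate=0.40,
    running_hours=autostop_running_hours(active_hours_per_day=8, idle_timeout_hours=1),
)

savings_pct = round(100 * (1 - with_autostop / always_on), 1)
print(always_on, with_autostop, savings_pct)
```

Under these assumptions the always-on workspace costs 292.00 per month and the autostopped one 79.20, roughly a 73% reduction. Plug in your own rates and usage patterns before quoting a number to finance.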
How do we configure Coder autostop policies step-by-step?
Short Answer: Define autostop settings at the template and/or organization level, then roll those templates out so every workspace is created with predictable idle-stop behavior.
Expanded Explanation:
Coder defines workspaces through Terraform templates. That means autostop doesn’t have to be a one-off toggle in someone’s UI profile; you can set it per template and then review, version, and standardize it along with the rest of the workspace definition. As a platform engineer, you decide which templates get aggressive idle-stop (e.g., bursty CI-like dev environments) and which get more relaxed timeouts (e.g., long-running debugging or ML jobs).
You can combine template-level settings with organization policies: for example, “no workspace can have an idle timeout longer than 12 hours,” or “GPU templates require shorter autostop windows.” This keeps teams from bypassing controls while still letting you offer sensible defaults per use case.
Steps:
- Decide idle-stop targets by template type
  - General dev: 30–60 minutes idle timeout.
  - Heavy backend / integration: 2–4 hours.
  - GPU/ML: 1–2 hours with stricter quotas.
- Encode autostop in Terraform templates
  - Add autostop/idle timeout fields to your Coder workspace templates as Terraform variables or fixed values.
  - Commit these templates to your Git-backed template repo for review.
- Apply and roll out templates
  - Update your Coder deployment to point to the new templates.
  - Migrate existing workspaces or encourage recreation from the new templates so all environments inherit the autostop behavior.
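The policy logic behind these steps can be sketched in a few lines of Python. This models the decision only and does not call any Coder API: the template names, the timeout values (picked from the ranges in the first step), and the 12-hour organization ceiling are all hypothetical.

```python
from datetime import timedelta

# Hypothetical organization-wide ceiling on idle timeouts.
ORG_MAX_IDLE = timedelta(hours=12)

# Hypothetical template names mapped to idle-timeout targets.
TEMPLATE_IDLE_TIMEOUTS = {
    "general-dev": timedelta(minutes=45),
    "backend-integration": timedelta(hours=3),
    "gpu-ml": timedelta(hours=2),
}


def effective_idle_timeout(requested: timedelta,
                           org_max: timedelta = ORG_MAX_IDLE) -> timedelta:
    """Clamp a template's requested idle timeout to the organization ceiling."""
    if requested <= timedelta(0):
        raise ValueError("idle timeout must be positive")
    return min(requested, org_max)


def should_stop(idle_for: timedelta, idle_timeout: timedelta) -> bool:
    """True once a workspace has been idle for at least its timeout."""
    return idle_for >= idle_timeout


# A template asking for 24 hours is capped at the 12-hour ceiling.
print(effective_idle_timeout(timedelta(hours=24)))

# A general-dev workspace idle for 50 minutes is due to stop.
print(should_stop(timedelta(minutes=50), TEMPLATE_IDLE_TIMEOUTS["general-dev"]))
```

Keeping the targets in one reviewed table, with the ceiling applied in a single place, is the same idea as versioning the settings with your templates: a change to any timeout shows up in code review.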
What’s the difference between autostop and resource quotas in Coder?
Short Answer: Autostop controls when compute runs (turning off idle workspaces), while resource quotas control how much compute a workspace is allowed to use.
Expanded Explanation:
Think of autostop as time-based governance and quotas as size-based governance. Autostop keeps idle workspaces from burning money overnight or over weekends. Quotas keep any single developer or AI agent from spinning up oversized instances or hoarding GPUs beyond what you intended.
In Coder, both are driven through Terraform-defined templates and policies. Platform teams build “golden path” templates for small, medium, and GPU-heavy workspaces with fixed or parameterized CPU/RAM/GPU limits, then cap those with org-level quotas. Developers self-serve from these options in seconds, but they can’t escape the boundaries you’ve defined.
Comparison Snapshot:
- Option A: Autostop
  - Controls runtime based on inactivity.
  - Best lever for eliminating idle spend.
- Option B: Resource quotas
  - Caps CPU, RAM, GPU, and sometimes workspace count.
  - Best lever for preventing overprovisioning and runaway cost per developer.
- Best for:
  - Use autostop to cut waste from idle environments.
  - Use quotas to enforce maximum shape/size and keep costs predictable per user/team.
How do we implement Coder resource quotas to limit compute usage?
Short Answer: Standardize workspace sizes in Terraform templates, then enforce per-user or per-team quotas on CPU, RAM, GPU, and workspace counts at the organization level.
Expanded Explanation:
Coder isn’t your cloud IaC or your CI/CD system; it’s the control plane for dev workspaces. That’s where quotas belong. You model workspace shapes as code—“2 vCPU / 4 GiB,” “8 vCPU / 32 GiB with 1 GPU,” and so on—and publish them as templates. Developers and approved AI agents can provision these in seconds, but they can’t exceed the limits baked into the template and governing policies.
From there, you set quotas that limit the total number of running workspaces and aggregate resources per user or group. This is where you prevent “one power user” from spinning up five GPU rigs and leaving them on all week. Combined with autostop, quota policies give you a predictable upper bound on dev infrastructure spend, even in large organizations.
What You Need:
- Terraform-based Coder templates
  - Encode CPU, RAM, GPU, and storage sizes as explicit values or constrained variables.
  - Maintain separate templates for general dev, high-memory, and GPU workloads.
- Org-level governance rules
  - Policies that cap per-user workspace counts and enforce maximum allowable instance size.
  - A review loop with security/finance to tune these caps as you observe real usage.
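As a rough model of the quota check described here, the Python sketch below tests whether a new workspace fits within a user’s aggregate limits. The `Shape` and `Quota` types, their fields, and the example limits are hypothetical. Coder’s own quota mechanism may express budgets differently (for example, as a cost allowance per group), so treat this as an illustration of the rule, not of Coder’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A workspace size as published by a template (hypothetical fields)."""
    vcpu: int
    ram_gib: int
    gpus: int = 0


@dataclass(frozen=True)
class Quota:
    """Per-user ceilings on running workspaces and aggregate resources."""
    max_workspaces: int
    max_vcpu: int
    max_ram_gib: int
    max_gpus: int


def fits_quota(running: list[Shape], requested: Shape, quota: Quota) -> bool:
    """Return True if starting `requested` keeps the user within `quota`."""
    shapes = running + [requested]
    return (
        len(shapes) <= quota.max_workspaces
        and sum(s.vcpu for s in shapes) <= quota.max_vcpu
        and sum(s.ram_gib for s in shapes) <= quota.max_ram_gib
        and sum(s.gpus for s in shapes) <= quota.max_gpus
    )


# Example shapes matching the sizes mentioned in the text.
SMALL = Shape(vcpu=2, ram_gib=4)
GPU = Shape(vcpu=8, ram_gib=32, gpus=1)
USER_QUOTA = Quota(max_workspaces=3, max_vcpu=16, max_ram_gib=64, max_gpus=1)

print(fits_quota([SMALL], GPU, USER_QUOTA))  # one small plus one GPU workspace
print(fits_quota([GPU], GPU, USER_QUOTA))    # a second GPU workspace
```

With these limits, a user running one small workspace can add a GPU workspace, but a second GPU workspace is rejected because it would exceed the one-GPU ceiling. That is the “one power user” case from above.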
How should we combine autostop and quotas for sustainable cost control?
Short Answer: Use quotas to set hard ceilings on what can be provisioned, and autostop to make sure those resources run only when actively used. Together they keep development flexible and your bill bounded.
Expanded Explanation:
From an operator’s perspective, cost control is about guardrails, not individual heroics. Coder gives you two powerful guardrails that live entirely inside your infrastructure: resource quotas define the maximum blast radius; autostop ensures that even within that radius, you pay only for active work.
Onboarding speed stays high—developers still self-serve workspaces in seconds from your Terraform templates, choosing their IDE and OS while keeping code and data off local laptops. Security and platform teams get centralized control through the coderd control plane, OIDC SSO, and RBAC, plus the ability to route audit logs into your SIEM. In regulated environments (including air-gapped and classified deployments), this means you can scale remote development and AI-assisted coding without blowing your budget or losing control of compute, access, or context.
Why It Matters:
- Predictable spend at scale
  - Autostop removes silent idle waste.
  - Quotas cap the maximum resource footprint per user, team, and template.
- Governed velocity
  - Developers gain fast, reproducible workspaces with their preferred tools.
  - Platform and security teams retain full ownership over where workspaces run, how big they are, and how long they stay on.
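To see how the two guardrails combine into a spending ceiling, here is a back-of-the-envelope estimate in Python. Every input is a hypothetical assumption: the user count, the maximum hourly cost a quota allows per user, and the expectation that autostop keeps workspaces to about 12 running hours a day. Because autostop is idle-based, a workspace that stays active keeps running, so this is a planning estimate and not a guarantee.

```python
def monthly_spend_ceiling(users: int,
                          max_hourly_cost_per_user: float,
                          max_running_hours_per_day: float,
                          days: int = 30) -> float:
    """Estimated worst-case monthly spend.

    Quotas bound the hourly cost per user; autostop is assumed to bound
    how many hours per day that cost is incurred.
    """
    if not 0 <= max_running_hours_per_day <= 24:
        raise ValueError("max_running_hours_per_day must be between 0 and 24")
    return round(users * max_hourly_cost_per_user * max_running_hours_per_day * days, 2)


# Hypothetical inputs: 50 developers, each capped at $1.50/hour of compute.
bounded = monthly_spend_ceiling(50, 1.50, 12)            # quotas plus autostop
unbounded_runtime = monthly_spend_ceiling(50, 1.50, 24)  # quotas only, always on
print(bounded, unbounded_runtime)
```

With these inputs the ceiling drops from 54,000 to 27,000 per month. Quotas set the hourly rate in that product and autostop sets the hours, which is why you want both.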
Quick Recap
Coder runs as a self-hosted, open source remote development control plane on your infrastructure, not as a SaaS. To control dev compute costs, you configure two things as code: autostop (to shut down idle workspaces automatically) and resource quotas (to limit CPU, RAM, GPU, and workspace counts). You express both in Terraform-backed templates and org policies, then let developers and AI coding agents self-serve governed workspaces in seconds. The outcome is faster onboarding, less “works on my machine” drift, and a cloud bill that tracks real usage instead of idle capacity.