
Cloudflare Workers pricing: how do I estimate monthly cost from requests and CPU time?
Most teams looking at Cloudflare Workers pricing want the same thing: a simple way to turn “X requests and Y ms of CPU per request” into “≈$Z per month.” You can get very close using a few basic assumptions and Cloudflare’s billing model for requests and CPU time.
Below, I’ll walk through how Workers pricing works, how to build a quick cost model, and how to pressure-test that estimate before you ship something at scale.
Note: Cloudflare’s exact prices and limits can change. Always cross-check the live Cloudflare Workers pricing page before making hard budget commitments. Treat this guide as the how, not as the authoritative source for current rates.
The Quick Overview
- What It Is: Cloudflare Workers pricing is based on a combination of request count, CPU time (active compute time per request, which excludes time spent waiting on I/O), and any additional resources you use (KV, D1, R2, Queues, etc.), with generous free or included allowances depending on plan.
- Who It Is For: Developers, SREs, and architects deploying serverless apps, APIs, and AI-powered workloads on the Cloudflare connectivity cloud and needing predictable monthly cost estimates.
- Core Problem Solved: It gives you a way to convert operational metrics you already track—requests, latency, CPU usage—into a clear monthly spend projection so you can avoid surprise invoices at scale.
How It Works
Cloudflare Workers runs your code on Cloudflare’s global edge network. When a request hits your Worker:
- It counts as a billable request (after free/included quotas).
- The runtime tracks how much CPU time your Worker uses to execute that request.
- Additional services (Workers KV, D1, R2, Queues, etc.) may add their own usage-based charges.
From a cost-estimation perspective, you care about three primary numbers:
- Monthly requests (total hits to your Worker).
- Average CPU time per request (e.g., 5 ms, 20 ms, 50 ms).
- Your plan tier (Free, pay‑as‑you‑go, or Enterprise).
You then apply the per‑million‑request and per‑CPU‑time rates for your plan, subtract free/included quotas, and sum everything. I’ll show you how to do that in a repeatable way so you can plug in your own traffic and latency numbers.
1. Understand the basic billing units
At a high level, you’ll see Workers pricing expressed in these units:
- Requests
  - Counted per invocation of a Worker.
  - Billed in blocks (e.g., per 1 million requests) after free/included usage.
- CPU time
  - The active CPU time your Worker consumes per request. This is not wall-clock time: time spent waiting on network calls or other I/O doesn’t count.
  - Often accounted per ms or per block of CPU time (e.g., per million CPU‑ms).
  - There is usually a per‑request cap; if you hit it, the Worker is terminated. That protects both you and the platform from runaway functions.
- Other resources (optional)
  - Workers KV: billed on reads/writes and storage size.
  - D1 / databases: billed on reads/writes and storage tiers.
  - R2: billed for storage plus operations (R2 notably charges no egress fees).
  - Queues, Durable Objects, AI inference, etc.: each has its own unit pricing.
For “requests and CPU time” specifically, you can think in this simple formula:
Estimated monthly Workers runtime cost ≈ Cost of requests + Cost of CPU time
You then optionally add:
+ storage/database/queue/AI costs (if used)
2. Choose your pricing context
Cloudflare offers:
- Free tier: Best for low-volume apps, prototypes, and personal projects. Includes free quotas of requests and CPU time.
- Pay‑as‑you‑go plans: Often where serious usage starts—clear per‑unit pricing once you exceed included quotas.
- Enterprise: Custom pricing and 100% uptime SLA for the connectivity cloud, with negotiated terms for Workers usage and discounted bulk rates.
For cost estimation at scale, assume you’re on either:
- A standard paid plan (per‑unit pricing), or
- An Enterprise contract (custom, but still based on similar usage dimensions, just with committed volumes and discounts).
Your job is to:
- Use the publicly listed per‑request and per‑CPU rates as a starting point, or
- If you’re on Enterprise, plug in the rates in your contract.
Step-by-step: Estimating monthly cost from requests and CPU
Let’s walk through a generic approach you can adapt to your numbers and plan.
Step 1 – Estimate monthly request volume
Use real traffic projections if you have them. Otherwise, start with scenarios:
- Low traffic example: 2 million requests/month
- Mid traffic example: 50 million requests/month
- High traffic example: 500 million+ requests/month
If you have daily stats, multiply by ~30 for a monthly estimate:
Monthly requests ≈ Daily requests × 30
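If you only have daily stats, the conversion is trivial but worth pinning down; a quick sketch (the 30-day multiplier is an approximation, not a billing convention):

```python
def monthly_requests(daily_requests: int, days: int = 30) -> int:
    """Approximate monthly request volume from a daily average."""
    return daily_requests * days

# ~1.7M requests/day lands near the 50M/month mid-traffic scenario
print(monthly_requests(1_700_000))  # 51000000
```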
Step 2 – Estimate average CPU time per request
This is where real measurements matter. Use:
- Cloudflare dashboard → Workers analytics to check:
- Average CPU time
- P50/P95 CPU time per request
- Or run load tests to approximate CPU usage.
For a rough rule-of-thumb:
- Simple cache-fronting / header rewrite / redirect: often ~1–5 ms CPU
- Typical API request (validation + DB call): often ~5–20 ms CPU
- Compute-heavy (crypto, large JSON transforms, AI orchestration): 20–50+ ms CPU
Let’s say your Worker consistently uses 10 ms CPU per request on average.
Step 3 – Compute total monthly CPU time
Convert CPU per request to total CPU time:
Total CPU time (ms) = Monthly requests × Avg CPU time per request (ms)
Total CPU time (seconds) = Total CPU time (ms) / 1000
Example:
- Monthly requests = 50,000,000
- Avg CPU per request = 10 ms
Total CPU time (ms) = 50,000,000 × 10 = 500,000,000 ms
Total CPU time (seconds) = 500,000,000 / 1000 = 500,000 seconds
That 500,000 seconds is what you’ll apply against whatever CPU billing unit your plan uses.
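Steps 1–3 can be scripted so you can swap in your own measurements; a minimal sketch using the 50M-request, 10 ms example from above:

```python
def total_cpu_time(monthly_requests: int, avg_cpu_ms: float) -> tuple[float, float]:
    """Return (total CPU ms, total CPU seconds) for a month of traffic."""
    total_ms = monthly_requests * avg_cpu_ms
    return total_ms, total_ms / 1000

ms, seconds = total_cpu_time(50_000_000, 10)
print(ms, seconds)  # 500000000 500000.0
```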
Step 4 – Apply plan-specific request pricing
Check your plan’s price per 1M requests. The pattern looks like:
Billable requests = max(0, Monthly requests – included/free requests)
Request cost = (Billable requests / 1,000,000) × Price_per_1M
Example numbers (you must plug in the actual rate from the pricing page):
- Monthly requests: 50M
- Included: 10M
- Price per 1M (beyond included): $X
Billable requests = 50M – 10M = 40M
Request cost = (40M / 1,000,000) × $X = 40 × $X
If $X were $0.30, you’d be at $12 for the requests component alone.
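The quota subtraction is worth encoding so it survives plan changes; a sketch where the $0.30 figure is the same illustrative placeholder as in the text, not a quoted Cloudflare rate:

```python
def request_cost(monthly_requests: int, included: int, price_per_million: float) -> float:
    """Cost of requests beyond the plan's included/free quota."""
    billable = max(0, monthly_requests - included)
    return billable / 1_000_000 * price_per_million

# 50M requests, 10M included, illustrative $0.30 per 1M beyond that
print(round(request_cost(50_000_000, 10_000_000, 0.30), 2))  # 12.0
```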
Step 5 – Apply CPU time pricing
Next, apply your CPU time pricing. Depending on the plan, this may be expressed as:
- Per million CPU‑ms, or
- Per GB‑second / CPU‑second equivalent.
The general formula:
Billable CPU time = max(0, Total CPU time – included/free CPU time)
CPU cost = Billable CPU time × Rate_per_CPU_unit
Using our example:
- Total CPU time = 500,000 seconds
- Included CPU time = C seconds (from your plan)
- Rate per CPU second (or equivalent) = $Y
Billable CPU time = max(0, 500,000 – C)
CPU cost ≈ Billable_CPU_time × $Y
You plug in:
- C = included CPU budget
- Y = your plan’s per‑CPU‑time rate.
If your plan charges per million CPU‑ms:
Total CPU time (ms) = 500,000,000 ms
Billable CPU = max(0, 500,000,000 – included_ms)
CPU cost = (Billable_CPU / 1,000,000) × Price_per_M_CPU_ms
The exact Y value is what you grab from the live pricing page or your Enterprise order form.
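The per-million-CPU-ms variant looks like this; the included budget and $0.02 rate below are hypothetical placeholders to be replaced with your plan's actual figures:

```python
def cpu_cost(total_cpu_ms: float, included_ms: float, price_per_million_ms: float) -> float:
    """Cost of CPU time beyond the plan's included budget, billed per 1M CPU-ms."""
    billable = max(0.0, total_cpu_ms - included_ms)
    return billable / 1_000_000 * price_per_million_ms

# 500M CPU-ms used, with a hypothetical 30M included and $0.02 per 1M CPU-ms
print(round(cpu_cost(500_000_000, 30_000_000, 0.02), 2))  # 9.4
```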
Step 6 – Combine request and CPU cost
Once you’ve got both:
Estimated runtime cost (Workers) ≈ Request cost + CPU cost
Then add any optional services you’re using (KV, D1, R2, Queues, etc.):
Total estimated cost ≈ Workers runtime cost + Data store cost + Storage cost + Queue/AI cost
This gives you a defensible monthly estimate based on the two metrics you actually track: number of requests and CPU time.
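Putting the whole walkthrough together, the model fits in one function; all quotas and rates below are placeholders to take from your own plan, and `addons` stands in for separately estimated KV/D1/R2/Queues costs:

```python
def workers_monthly_cost(
    requests: int, avg_cpu_ms: float,
    included_requests: int, price_per_m_requests: float,
    included_cpu_ms: float, price_per_m_cpu_ms: float,
    addons: float = 0.0,  # KV/D1/R2/Queues/AI, estimated separately
) -> float:
    """Estimated monthly Workers spend: request cost + CPU cost + add-ons."""
    billable_req = max(0, requests - included_requests)
    total_cpu_ms = requests * avg_cpu_ms
    billable_cpu = max(0.0, total_cpu_ms - included_cpu_ms)
    return (billable_req / 1_000_000 * price_per_m_requests
            + billable_cpu / 1_000_000 * price_per_m_cpu_ms
            + addons)

# 50M requests at 10 ms CPU each, with hypothetical quotas and rates
print(round(workers_monthly_cost(50_000_000, 10, 10_000_000, 0.30,
                                 30_000_000, 0.02), 2))  # 21.4
```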
Features & Benefits Breakdown
Here’s how the Workers pricing model helps you plan and control cost:
| Core Feature | What It Does | Primary Benefit |
|---|---|---|
| Usage-based Workers runtime | Bills on request count and CPU time used by your Workers | Aligns cost with actual usage; light workloads stay cheap or free, heavier workloads pay proportionally |
| Plan-based included quotas | Provides free or included allocations of requests and CPU time depending on plan | Lets you run prototypes and low-volume services with minimal or no runtime cost |
| Integrated platform pricing | Workers lives inside Cloudflare’s connectivity cloud (Application Services, Cloudflare One, Network Services, Developer Platform) | As you adopt more services—WAF, Zero Trust, R2, KV—you centralize spend and leverage shared network effects |
Ideal Use Cases
- Best for API backends and edge logic: Because you can precisely project cost from incoming traffic and measured CPU per request, you don’t need to overprovision servers or guess at VM sizing.
- Best for AI orchestration layers at the edge: Because CPU time shows you exactly how much each agentic workflow costs to run, you can tune prompts, caching, and branching logic to keep inference orchestration affordable.
Limitations & Considerations
- CPU-heavy workloads can get expensive: If you’re doing CPU‑intensive tasks (large compressions, custom crypto, heavy JSON transforms, AI token streaming logic), monitor your average CPU time per request. Consider offloading long‑running work to Queues, scheduled Workers (Cron Triggers), or external services where needed.
- Free/included limits are not infinite: Free and lower-tier plans are great for dev/test, but production-scale or mission‑critical workloads should assume a paid or Enterprise plan with explicit quotas and SLAs; factor that into your estimate rather than assuming “free forever.”
Pricing & Plans
For the most accurate numbers, refer to the Cloudflare plans page and the Workers section. In practice, teams typically fall into two buckets:
- Developer / pay‑as‑you‑go: You pay per 1M requests and per unit of CPU time beyond free quotas. Ideal if you’re just starting, expect variable traffic, or haven’t yet committed to a large contract. Best for teams that need flexible, usage-based pricing while experimenting with Workers, APIs, AI orchestration, and internal tools.
- Enterprise: You negotiate predictable Workers pricing as part of a broader connectivity cloud contract covering security (WAF, DDoS, bot management), Zero Trust (Cloudflare One), and network services (Magic Transit, Magic WAN). Best for organizations running global, high-traffic, mission-critical workloads that demand a 100% uptime SLA, predictable spend, premium support, and integrated security and networking on the same platform.
For detailed SLA information, Enterprise customers can reference the Enterprise support SLA and Business customers can review the Business plan SLA.
Frequently Asked Questions
How do I quickly sanity-check my Workers cost estimate?
Short Answer: Multiply your monthly requests by your average CPU time per request, convert to total CPU, then apply your plan’s per‑request and per‑CPU rates.
Details:
A simple back-of-the-envelope:
- Requests:
  - Monthly requests: R
  - Included: R_free
  - Price per 1M: P_req
  - Request cost: max(0, R – R_free) / 1,000,000 × P_req
- CPU:
  - Avg CPU per request (ms): T
  - Total CPU ms: R × T
  - Included CPU (ms): T_free
  - Price per 1M CPU ms: P_cpu
  - CPU cost: max(0, R×T – T_free) / 1,000,000 × P_cpu
Then:
Total ≈ Request cost + CPU cost
Plug in your actual plan numbers for R_free, T_free, P_req, and P_cpu.
How can I reduce my Workers bill if CPU time looks high?
Short Answer: Reduce CPU per request by short‑circuiting work, caching aggressively, and shifting long-running tasks to more appropriate services.
Details:
Treat CPU time the way a security architect treats attack surface: measure it, then shrink it. The exposure here is financial rather than operational. To lower cost without losing security or performance:
- Short‑circuit early: Return from the Worker as soon as you can—for example, cache hits, quick redirects, or access control checks before heavy logic.
- Use caching wisely:
- Cache responses at the edge when possible.
- For AI and APIs, cache stable responses (e.g., reference data, static prompts).
- Avoid blocking external calls when not needed:
- Batch external API calls where feasible.
- Use Queues or event-driven designs for non‑interactive work that doesn’t need to block the response.
- Profile your code:
- Use Workers analytics to identify routes or functions with higher CPU time.
- Optimize those hot paths first—this is usually where you get the biggest savings.
- Right-size logic per request:
- Push heavy data processing, training, or batch jobs into purpose‑built systems, then let Workers handle orchestration and policy enforcement at the edge.
Each millisecond saved per request scales linearly with your request volume, so even small improvements can translate into large cost reductions at tens or hundreds of millions of requests per month.
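That linear scaling is easy to quantify: once you are past the included CPU budget, the monthly saving is just requests × ms saved × the per-unit rate. A sketch using a hypothetical $0.02 per million CPU-ms:

```python
def monthly_savings(requests: int, ms_saved: float, price_per_m_cpu_ms: float) -> float:
    """Monthly CPU-cost reduction from shaving ms off every billable request."""
    return requests * ms_saved / 1_000_000 * price_per_m_cpu_ms

# Shaving 2 ms off 500M requests/month at a hypothetical $0.02 per 1M CPU-ms
print(round(monthly_savings(500_000_000, 2, 0.02), 2))  # 20.0
```

Note this only applies to CPU time above your included quota; savings inside the free budget don't reduce the bill.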
Summary
To estimate Cloudflare Workers pricing from requests and CPU time, you don’t need a perfect model—you need a transparent one:
- Estimate monthly requests.
- Measure or approximate average CPU time per request.
- Calculate total CPU ms/seconds.
- Apply your plan’s per‑request and per‑CPU rates, subtracting free/included quotas.
- Add any storage/database/queue/AI costs on top.
This ties your monthly bill directly to concrete runtime behavior you can observe and optimize. And because Workers runs on Cloudflare’s connectivity cloud, you’re not just paying for compute—you’re paying for security (WAF, DDoS, bot protection), Zero Trust access, and global performance that all ride the same edge network.