
Cloudflare Workers pricing: how do I estimate monthly cost from requests and CPU time?
Most teams looking at Cloudflare Workers pricing want the same thing: a simple way to turn “X requests and Y ms of CPU per request” into “≈$Z per month.” You can get very close using a few basic assumptions and Cloudflare’s billing model for requests and CPU time.
Below, I’ll walk through how Workers pricing works, how to build a quick cost model, and how to pressure-test that estimate before you ship something at scale.
Note: Cloudflare’s exact prices and limits can change. Always cross-check the live Cloudflare Workers pricing page before making hard budget commitments. Treat this guide as the how, not as the authoritative source for current rates.
The Quick Overview
- What It Is: Cloudflare Workers pricing is based on a combination of request count, CPU time (active compute time per request, which excludes time spent waiting on I/O), and any additional resources you use (KV, D1, R2, Queues, etc.), with generous free or included allowances depending on plan.
- Who It Is For: Developers, SREs, and architects deploying serverless apps, APIs, and AI-powered workloads on the Cloudflare connectivity cloud and needing predictable monthly cost estimates.
- Core Problem Solved: It gives you a way to convert operational metrics you already track—requests, latency, CPU usage—into a clear monthly spend projection so you can avoid surprise invoices at scale.
How It Works
Cloudflare Workers runs your code on Cloudflare’s global edge network. When a request hits your Worker:
- It counts as a billable request (after free/included quotas).
- The runtime tracks how much CPU time your Worker uses to execute that request.
- Additional services (Workers KV, D1, R2, Queues, etc.) may add their own usage-based charges.
From a cost-estimation perspective, you care about three primary numbers:
- Monthly requests (total hits to your Worker).
- Average CPU time per request (e.g., 5 ms, 20 ms, 50 ms).
- Your plan tier (Free, pay‑as‑you‑go, or Enterprise).
You then apply the per‑million‑request and per‑CPU‑time rates for your plan, subtract free/included quotas, and sum everything. I’ll show you how to do that in a repeatable way so you can plug in your own traffic and latency numbers.
1. Understand the basic billing units
At a high level, you’ll see Workers pricing expressed in these units:
- Requests
  - Counted per invocation of a Worker.
  - Billed in blocks (e.g., per 1 million requests) after free/included usage.
- CPU time
  - The active CPU time your Worker consumes per request. This is not wall-clock time: time spent waiting on network calls or other I/O doesn’t count.
  - Often accounted per ms or per block of CPU time (e.g., per million CPU‑ms).
  - There is usually a per‑request cap; if you hit it, the Worker is terminated. That protects both you and the platform from runaway functions.
- Other resources (optional)
  - Workers KV: billed on reads/writes and storage size.
  - D1 / databases: billed on reads/writes and storage tiers.
  - R2: billed for storage plus operations (R2 notably charges no egress fees).
  - Queues, Durable Objects, AI inference, etc.: each has its own unit pricing.
For “requests and CPU time” specifically, you can think in this simple formula:
Estimated monthly Workers runtime cost ≈ Cost of requests + Cost of CPU time
You then optionally add:
+ storage/database/queue/AI costs (if used)
2. Choose your pricing context
Cloudflare offers:
- Free tier: Best for low-volume apps, prototypes, and personal projects. Includes free quotas of requests and CPU time.
- Pay‑as‑you‑go plans: Often where serious usage starts—clear per‑unit pricing once you exceed included quotas.
- Enterprise: Custom pricing and 100% uptime SLA for the connectivity cloud, with negotiated terms for Workers usage and discounted bulk rates.
For cost estimation at scale, assume you’re on either:
- A standard paid plan (per‑unit pricing), or
- An Enterprise contract (custom, but still based on similar usage dimensions, just with committed volumes and discounts).
Your job is to:
- Use the publicly listed per‑request and per‑CPU rates as a starting point, or
- If you’re on Enterprise, plug in the rates in your contract.
Step-by-step: Estimating monthly cost from requests and CPU
Let’s walk through a generic approach you can adapt to your numbers and plan.
Step 1 – Estimate monthly request volume
Use real traffic projections if you have them. Otherwise, start with scenarios:
- Low traffic example: 2 million requests/month
- Mid traffic example: 50 million requests/month
- High traffic example: 500 million+ requests/month
If you have daily stats, multiply by ~30 for a monthly estimate:
Monthly requests ≈ Daily requests × 30
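If you only have daily stats, the conversion is trivial but worth pinning down; a quick sketch (the 30-day multiplier is an approximation, not a billing convention):

```python
def monthly_requests(daily_requests: int, days: int = 30) -> int:
    """Approximate monthly request volume from a daily average."""
    return daily_requests * days

# ~1.7M requests/day lands near the 50M/month mid-traffic scenario
print(monthly_requests(1_700_000))  # 51000000
```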
Step 2 – Estimate average CPU time per request
This is where real measurements matter. Use:
- Cloudflare dashboard → Workers analytics to check:
- Average CPU time
- P50/P95 CPU time per request
- Or run load tests to approximate CPU usage.
For a rough rule-of-thumb:
- Simple cache-fronting / header rewrite / redirect: often ~1–5 ms CPU
- Typical API request (validation + DB call): often ~5–20 ms CPU
- Compute-heavy (crypto, large JSON transforms, AI orchestration): 20–50+ ms CPU
Let’s say your Worker consistently uses 10 ms CPU per request on average.
Step 3 – Compute total monthly CPU time
Convert CPU per request to total CPU time:
Total CPU time (ms) = Monthly requests × Avg CPU time per request (ms)
Total CPU time (seconds) = Total CPU time (ms) / 1000
Example:
- Monthly requests = 50,000,000
- Avg CPU per request = 10 ms
Total CPU time (ms) = 50,000,000 × 10 = 500,000,000 ms
Total CPU time (seconds) = 500,000,000 / 1000 = 500,000 seconds
That 500,000 seconds is what you’ll apply against whatever CPU billing unit your plan uses.
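Steps 1–3 can be scripted so you can swap in your own measurements; a minimal sketch using the 50M-request, 10 ms example from above:

```python
def total_cpu_time(monthly_requests: int, avg_cpu_ms: float) -> tuple[float, float]:
    """Return (total CPU ms, total CPU seconds) for a month of traffic."""
    total_ms = monthly_requests * avg_cpu_ms
    return total_ms, total_ms / 1000

ms, seconds = total_cpu_time(50_000_000, 10)
print(ms, seconds)  # 500000000 500000.0
```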
Step 4 – Apply plan-specific request pricing
Check your plan’s price per 1M requests. The pattern looks like:
Billable requests = max(0, Monthly requests – included/free requests)
Request cost = (Billable requests / 1,000,000) × Price_per_1M
Example numbers (you must plug in the actual rate from the pricing page):
- Monthly requests: 50M
- Included: 10M
- Price per 1M (beyond included): $X
Billable requests = 50M – 10M = 40M
Request cost = (40M / 1,000,000) × $X = 40 × $X
If $X were $0.30, you’d be at $12 for the requests component alone.
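The quota subtraction is worth encoding so it survives plan changes; a sketch where the $0.30 figure is the same illustrative placeholder as in the text, not a quoted Cloudflare rate:

```python
def request_cost(monthly_requests: int, included: int, price_per_million: float) -> float:
    """Cost of requests beyond the plan's included/free quota."""
    billable = max(0, monthly_requests - included)
    return billable / 1_000_000 * price_per_million

# 50M requests, 10M included, illustrative $0.30 per 1M beyond that
print(round(request_cost(50_000_000, 10_000_000, 0.30), 2))  # 12.0
```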
Step 5 – Apply CPU time pricing
Next, apply your CPU time pricing. Depending on the plan, this may be expressed as:
- Per million CPU‑ms, or
- Per GB‑second / CPU‑second equivalent.
The general formula:
Billable CPU time = max(0, Total CPU time – included/free CPU time)
CPU cost = Billable CPU time × Rate_per_CPU_unit
Using our example:
- Total CPU time = 500,000 seconds
- Included CPU time = C seconds (from your plan)
- Rate per CPU second (or equivalent) = $Y
Billable CPU time = max(0, 500,000 – C)
CPU cost ≈ Billable_CPU_time × $Y
You plug in:
- C = included CPU budget
- Y = your plan’s per‑CPU‑time rate.
If your plan charges per million CPU‑ms:
Total CPU time (ms) = 500,000,000 ms
Billable CPU = max(0, 500,000,000 – included_ms)
CPU cost = (Billable_CPU / 1,000,000) × Price_per_M_CPU_ms
The exact Y value is what you grab from the live pricing page or your Enterprise order form.
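The per-million-CPU-ms variant looks like this; the included budget and $0.02 rate below are hypothetical placeholders to be replaced with your plan's actual figures:

```python
def cpu_cost(total_cpu_ms: float, included_ms: float, price_per_million_ms: float) -> float:
    """Cost of CPU time beyond the plan's included budget, billed per 1M CPU-ms."""
    billable = max(0.0, total_cpu_ms - included_ms)
    return billable / 1_000_000 * price_per_million_ms

# 500M CPU-ms used, with a hypothetical 30M included and $0.02 per 1M CPU-ms
print(round(cpu_cost(500_000_000, 30_000_000, 0.02), 2))  # 9.4
```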
Step 6 – Combine request and CPU cost
Once you’ve got both:
Estimated runtime cost (Workers) ≈ Request cost + CPU cost
Then add any optional services you’re using (KV, D1, R2, Queues, etc.):
Total estimated cost ≈ Workers runtime cost + Data store cost + Storage cost + Queue/AI cost
This gives you a defensible monthly estimate based on the two metrics you actually track: number of requests and CPU time.
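Putting the whole walkthrough together, the model fits in one function; all quotas and rates below are placeholders to take from your own plan, and `addons` stands in for separately estimated KV/D1/R2/Queues costs:

```python
def workers_monthly_cost(
    requests: int, avg_cpu_ms: float,
    included_requests: int, price_per_m_requests: float,
    included_cpu_ms: float, price_per_m_cpu_ms: float,
    addons: float = 0.0,  # KV/D1/R2/Queues/AI, estimated separately
) -> float:
    """Estimated monthly Workers spend: request cost + CPU cost + add-ons."""
    billable_req = max(0, requests - included_requests)
    total_cpu_ms = requests * avg_cpu_ms
    billable_cpu = max(0.0, total_cpu_ms - included_cpu_ms)
    return (billable_req / 1_000_000 * price_per_m_requests
            + billable_cpu / 1_000_000 * price_per_m_cpu_ms
            + addons)

# 50M requests at 10 ms CPU each, with hypothetical quotas and rates
print(round(workers_monthly_cost(50_000_000, 10, 10_000_000, 0.30,
                                 30_000_000, 0.02), 2))  # 21.4
```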
Features & Benefits Breakdown
Here’s how the Workers pricing model helps you plan and control cost:
| Core Feature | What It Does | Primary Benefit |
|---|---|---|
| Usage-based Workers runtime | Bills on request count and CPU time used by your Workers | Aligns cost with actual usage; light workloads stay cheap or free, heavier workloads pay proportionally |
| Plan-based included quotas | Provides free or included allocations of requests and CPU time depending on plan | Lets you run prototypes and low-volume services with minimal or no runtime cost |
| Integrated platform pricing | Workers lives inside Cloudflare’s connectivity cloud (Application Services, Cloudflare One, Network Services, Developer Platform) | As you adopt more services—WAF, Zero Trust, R2, KV—you centralize spend and leverage shared network effects |
Ideal Use Cases
- Best for API backends and edge logic: Because you can precisely project cost from incoming traffic and measured CPU per request, you don’t need to overprovision servers or guess at VM sizing.
- Best for AI orchestration layers at the edge: Because CPU time shows you exactly how much each agentic workflow costs to run, you can tune prompts, caching, and branching logic to keep inference orchestration affordable.
Limitations & Considerations
- CPU-heavy workloads can get expensive: If you’re doing CPU‑intensive tasks (large compressions, custom crypto, heavy JSON transforms, AI token streaming logic), monitor your average CPU time per request. Consider offloading long‑running work to Queues, scheduled Workers (Cron Triggers), or external services where needed.
- Free/included limits are not infinite: Free and lower-tier plans are great for dev/test, but production-scale or mission‑critical workloads should assume a paid or Enterprise plan with explicit quotas and SLAs; factor that into your estimate rather than assuming “free forever.”
Pricing & Plans
For the most accurate numbers, refer to the Cloudflare plans page and the Workers section. In practice, teams typically fall into two buckets:
- Developer / pay‑as‑you‑go: You pay per 1M requests and per unit of CPU time beyond free quotas. Ideal if you’re just starting, expect variable traffic, or haven’t yet committed to a large contract. Best for teams that need flexible, usage-based pricing while experimenting with Workers, APIs, AI orchestration, and internal tools.
- Enterprise: You negotiate predictable Workers pricing as part of a broader connectivity cloud contract covering security (WAF, DDoS, bot management), Zero Trust (Cloudflare One), and network services (Magic Transit, Magic WAN). Best for organizations running global, high-traffic, mission-critical workloads that demand a 100% uptime SLA, predictable spend, premium support, and integrated security and networking on the same platform.
For detailed SLA information, Enterprise customers can reference the Enterprise support SLA and Business customers can review the Business plan SLA.
Frequently Asked Questions
How do I quickly sanity-check my Workers cost estimate?
Short Answer: Multiply your monthly requests by your average CPU time per request, convert to total CPU, then apply your plan’s per‑request and per‑CPU rates.
Details:
A simple back-of-the-envelope:
- Requests:
  - Monthly requests: R
  - Included: R_free
  - Price per 1M: P_req
  - Request cost: max(0, R – R_free) / 1,000,000 × P_req
- CPU:
  - Avg CPU per request (ms): T
  - Total CPU ms: R × T
  - Included CPU (ms): T_free
  - Price per 1M CPU ms: P_cpu
  - CPU cost: max(0, R×T – T_free) / 1,000,000 × P_cpu
Then:
Total ≈ Request cost + CPU cost
Plug in your actual plan numbers for R_free, T_free, P_req, and P_cpu.
How can I reduce my Workers bill if CPU time looks high?
Short Answer: Reduce CPU per request by short‑circuiting work, caching aggressively, and shifting long-running tasks to more appropriate services.
Details:
Treat CPU time the way a security architect treats attack surface: measure it, then shrink it. The exposure here is financial rather than operational. To lower cost without losing security or performance:
- Short‑circuit early: Return from the Worker as soon as you can—for example, cache hits, quick redirects, or access control checks before heavy logic.
- Use caching wisely:
- Cache responses at the edge when possible.
- For AI and APIs, cache stable responses (e.g., reference data, static prompts).
- Avoid blocking external calls when not needed:
- Batch external API calls where feasible.
- Use Queues or event-driven designs for non‑interactive work that doesn’t need to block the response.
- Profile your code:
- Use Workers analytics to identify routes or functions with higher CPU time.
- Optimize those hot paths first—this is usually where you get the biggest savings.
- Right-size logic per request:
- Push heavy data processing, training, or batch jobs into purpose‑built systems, then let Workers handle orchestration and policy enforcement at the edge.
Each millisecond saved per request scales linearly with your request volume, so even small improvements can translate into large cost reductions at tens or hundreds of millions of requests per month.
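That linear scaling is easy to quantify: once you are past the included CPU budget, the monthly saving is just requests × ms saved × the per-unit rate. A sketch using a hypothetical $0.02 per million CPU-ms:

```python
def monthly_savings(requests: int, ms_saved: float, price_per_m_cpu_ms: float) -> float:
    """Monthly CPU-cost reduction from shaving ms off every billable request."""
    return requests * ms_saved / 1_000_000 * price_per_m_cpu_ms

# Shaving 2 ms off 500M requests/month at a hypothetical $0.02 per 1M CPU-ms
print(round(monthly_savings(500_000_000, 2, 0.02), 2))  # 20.0
```

Note this only applies to CPU time above your included quota; savings inside the free budget don't reduce the bill.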
Summary
To estimate Cloudflare Workers pricing from requests and CPU time, you don’t need a perfect model—you need a transparent one:
- Estimate monthly requests.
- Measure or approximate average CPU time per request.
- Calculate total CPU ms/seconds.
- Apply your plan’s per‑request and per‑CPU rates, subtracting free/included quotas.
- Add any storage/database/queue/AI costs on top.
This ties your monthly bill directly to concrete runtime behavior you can observe and optimize. And because Workers runs on Cloudflare’s connectivity cloud, you’re not just paying for compute—you’re paying for security (WAF, DDoS, bot protection), Zero Trust access, and global performance that all ride the same edge network.