
ZeroEntropy Search API pricing: what’s included in Starter ($50/mo) vs Pro ($500/mo) and what happens if we exceed query limits?
Quick Answer: ZeroEntropy’s Search API Starter plan ($50/mo) is built for teams validating a retrieval stack with 1,000 queries and 1M ingestion tokens included, while Pro ($500/mo) is tuned for production workloads with 20,000 queries, 100M fast-ingestion tokens, and 100k OCR pages. If you exceed included query limits, you can keep sending traffic and move into pay-as-you-go billing or upgrade to Pro so you don’t have to throttle usage as your RAG or agent stack scales.
Frequently Asked Questions
What’s included in the ZeroEntropy Search API Starter plan at $50/month?
Short Answer: Starter gives you 1,000 queries per month, 1,000,000 tokens of ingestion + storage, and 10,000 OCR pages on US + EU servers—with a 2‑week free trial so you can benchmark retrieval before committing.
Expanded Explanation:
Starter is designed for teams who want to prove that retrieval, not just a bigger LLM, is the bottleneck. For $50/month, you get enough headroom to wire up the Search API or the individual reranker/embeddings endpoints, run realistic evaluations, and compare NDCG@10 against your current stack (Cohere/Jina/OpenAI baselines). You can ingest up to 1M tokens into ZeroEntropy’s hybrid retrieval stack (dense + sparse + rerank), use up to 10k OCR pages for document-heavy corpora (PDFs, scans), and issue 1,000 queries/month, which is enough for structured test suites plus early internal dogfooding.
Starter includes US + EU servers, so you can keep data residency aligned with your compliance constraints from day one. It’s intentionally simple: one flat monthly price, no infra Frankenstein to maintain, and a clear path to Pro if your RAG or agent workload starts to look like production.
Key Takeaways:
- 1,000 queries/month + 1M ingestion/stored tokens + 10k OCR pages
- Ideal for evaluation, pilot agents, and early internal search rollouts on US + EU servers
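The NDCG@10 comparison mentioned above can be scripted in a few lines. The following is a minimal sketch using the standard linear-gain definition of NDCG; it does not call any ZeroEntropy endpoint, and the relevance labels are hypothetical values you would replace with your own judged results.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG divided by the DCG of the ideal (descending) ordering.

    Simplification: the ideal ordering is built from the labels passed in.
    A full evaluation would build it from every judged document for the query.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded labels (0 = irrelevant, 2 = highly relevant), listed in
# the order each system returned the same judged documents for one query.
baseline_run = [0, 1, 0, 2, 0, 0, 1, 0, 0, 0]
candidate_run = [2, 1, 1, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print(f"baseline  NDCG@10 = {ndcg_at_k(baseline_run):.3f}")
    print(f"candidate NDCG@10 = {ndcg_at_k(candidate_run):.3f}")
```

Averaging this score over a fixed query set for each system gives a like-for-like number to compare during the trial.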
How does the Pro plan at $500/month differ in what’s included?
Short Answer: Pro steps up to 20,000 included queries, fast ingestion of 100M tokens/month, 100k OCR pages, and unlimited storage—built for production RAG, agents, and enterprise search workloads.
Expanded Explanation:
Once retrieval is clearly beating your baseline and you’re ready to put real user traffic on it, Pro is the natural next step. The key shift is scale: 20,000 queries/month means you can support serious internal search, production RAG flows, or multi-tenant agent workloads without stressing about every query. Fast ingestion of 100M tokens per month lets you continuously sync knowledge from wikis, ticketing systems, legal repositories, EMRs, financial docs, or manufacturing manuals into ZeroEntropy’s dense+sparse index.
Pro also unlocks 100,000 OCR pages, which matters if your data isn’t just neat markdown but PDFs, scans, and image-heavy documents. Unlimited storage means your corpus can keep growing without you constantly pruning documents just to stay under a cap.
Steps:
- Start with Starter to validate retrieval quality and latency against your current stack.
- Monitor query volume, ingestion growth, and OCR usage as your RAG/agent workflows ramp.
- Upgrade to Pro when you need higher query capacity (20,000 queries/month), fast ingestion of 100M tokens/month, and unlimited storage for production traffic.
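The steps above can be turned into a simple check against the published plan limits. This is an illustrative sketch: the `Plan` class and both function names are made up for this example, and only the numeric limits come from the plan descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_price_usd: int
    queries: int            # included queries per month
    ingestion_tokens: int   # Starter: ingestion + storage; Pro: ingestion per month
    ocr_pages: int


STARTER = Plan("Starter", 50, 1_000, 1_000_000, 10_000)
PRO = Plan("Pro", 500, 20_000, 100_000_000, 100_000)


def limits_exceeded(plan: Plan, queries: int, ingestion_tokens: int, ocr_pages: int) -> list[str]:
    """Return the names of the plan limits that the given monthly usage exceeds."""
    usage = {"queries": queries, "ingestion_tokens": ingestion_tokens, "ocr_pages": ocr_pages}
    return [field for field, used in usage.items() if used > getattr(plan, field)]


def suggest_plan(queries: int, ingestion_tokens: int, ocr_pages: int) -> str:
    """Suggest Starter while usage fits its included limits, otherwise Pro."""
    if not limits_exceeded(STARTER, queries, ingestion_tokens, ocr_pages):
        return STARTER.name
    return PRO.name


if __name__ == "__main__":
    print(suggest_plan(queries=800, ingestion_tokens=500_000, ocr_pages=2_000))
    print(suggest_plan(queries=4_000, ingestion_tokens=500_000, ocr_pages=2_000))
```

The check deliberately ignores pay-as-you-go overages, so it answers only whether usage still fits inside Starter's included quota.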
What exactly is the difference between Starter and Pro for Search API workloads?
Short Answer: Starter is a low-friction entry point for evaluation and small pilots, while Pro is tuned for ongoing, higher-volume production usage, with 20x the included queries (20,000 vs. 1,000), 100x the ingestion tokens (100M vs. 1M), 10x the OCR pages (100k vs. 10k), and unlimited storage.
Expanded Explanation:
Think of Starter as your evaluation and early-pilot plan and Pro as your production retrieval backbone. Both give you access to the same underlying stack—zerank-2 rerankers, zembed-1 embeddings, and the unified Search API—but the scale and operational assumptions are different.
Starter assumes you’re still benchmarking: you’re running NDCG@10 comparisons, measuring p50/p95/p99 latency, and integrating into your codebase. Pro assumes you’re already confident in the stack and you’re focused on serving real users: internal stakeholders for legal/medical/compliance search, or external customers via support flows and agentic assistants.
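For the latency side of that benchmarking, p50/p95/p99 can be computed from raw request timings with the nearest-rank method. The sketch below assumes you have already collected per-request latencies in milliseconds; the sample values are hypothetical, and other percentile conventions (for example, linear interpolation) will give slightly different numbers.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


def latency_summary(samples: Sequence[float]) -> Dict[str, float]:
    """Summarize latencies at the three percentiles named in the text."""
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}


# Hypothetical per-request latencies in milliseconds.
latencies_ms = [42.0, 38.5, 51.2, 47.9, 120.4, 44.1, 39.8, 63.0, 41.7, 310.6]

if __name__ == "__main__":
    print(latency_summary(latencies_ms))
```

With only a handful of samples, p95 and p99 collapse onto the slowest request, so tail percentiles are only meaningful once you have a few hundred measurements.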
Comparison Snapshot:
- Starter ($50/mo):
  - 1,000 queries/month
  - 1,000,000 ingestion + storage tokens
  - 10,000 OCR pages
  - US + EU servers, basic Slack support
  - Best for: benchmarking, POCs, small internal pilots
- Pro ($500/mo):
  - 20,000 queries/month
  - Fast ingestion of 100,000,000 tokens/month
  - 100,000 OCR pages
  - Unlimited storage
  - Best for: production RAG, agents, and enterprise search with growing corpora
- Best for:
  - Starter if you’re still proving out retrieval and GEO behavior.
  - Pro if retrieval is already your reliability layer and you’re scaling usage across teams or products.
What happens if we exceed the included query limits?
Short Answer: You don’t have to stop sending traffic—once you cross the included query quota, you move into pay-as-you-go usage or upgrade to Pro so you can scale query volume without throttling.
Expanded Explanation:
The plans are designed so you never have to “turn off” retrieval mid-debug or mid-launch. If you exceed the included queries on Starter, your usage can continue and be billed on a metered, pay-as-you-go basis, or you can upgrade to Pro to lock in a higher included quota at a predictable price point.
In practice, most teams use query consumption as a signal: once your evaluation and early pilots start approaching or hitting the 1,000-query ceiling on Starter, it’s usually a sign the Search API is becoming core infrastructure. That’s the right moment to move to Pro, align query capacity with your traffic forecasts, and avoid constantly watching the meter while you iterate on prompts, GEO strategies, and agent routing.
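One way to reason about "stay on Starter with overages" versus "upgrade to Pro" is a break-even calculation on queries alone. The pay-as-you-go rate is not stated here, so `overage_usd_per_query` is a required input you would take from your own invoice or the pricing page; the `0.25` used in the demo is an arbitrary placeholder, not a published price. The sketch also ignores ingestion and OCR differences between the plans.

```python
import math

STARTER_PRICE_USD = 50
PRO_PRICE_USD = 500
STARTER_INCLUDED_QUERIES = 1_000


def starter_cost_with_overage(queries: int, overage_usd_per_query: float) -> float:
    """Monthly Starter cost when queries beyond the included quota are metered."""
    extra_queries = max(0, queries - STARTER_INCLUDED_QUERIES)
    return STARTER_PRICE_USD + extra_queries * overage_usd_per_query


def break_even_queries(overage_usd_per_query: float) -> int:
    """Smallest monthly query count at which Starter plus overage costs at least the Pro price."""
    if overage_usd_per_query <= 0:
        raise ValueError("overage rate must be positive")
    extra_queries = math.ceil((PRO_PRICE_USD - STARTER_PRICE_USD) / overage_usd_per_query)
    return STARTER_INCLUDED_QUERIES + extra_queries


if __name__ == "__main__":
    placeholder_rate = 0.25  # arbitrary placeholder, NOT a published ZeroEntropy price
    print(break_even_queries(placeholder_rate))
    print(starter_cost_with_overage(2_800, placeholder_rate))
```

Below the break-even volume, metered Starter usage is cheaper on queries alone; above it, Pro's flat price wins even before counting its larger ingestion and OCR allowances.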
What You Need:
- Monitoring on query counts (and token usage) to anticipate when you’ll hit included limits
- A plan to either:
  - Upgrade to Pro as soon as you see sustained usage approaching Starter’s ceiling, or
  - Stay on Starter temporarily and accept pay-as-you-go overages while you finalize adoption decisions
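The monitoring item in the list above can start as a simple month-end projection. This is a sketch under stated assumptions: the query count comes from your own request logs (no usage endpoint is assumed), the projection is linear, and the 80% warning threshold and status labels are arbitrary choices for illustration.

```python
import calendar
from datetime import date

STARTER_INCLUDED_QUERIES = 1_000


def projected_month_end_queries(queries_so_far: int, today: date) -> float:
    """Linear projection: assume the daily average so far holds for the whole month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return queries_so_far / today.day * days_in_month


def quota_status(
    queries_so_far: int,
    today: date,
    included: int = STARTER_INCLUDED_QUERIES,
    warn_fraction: float = 0.8,  # arbitrary warning threshold
) -> str:
    """Classify current usage against the included monthly query quota."""
    if queries_so_far >= included:
        return "over_quota"
    if projected_month_end_queries(queries_so_far, today) >= included:
        return "projected_over"
    if queries_so_far >= warn_fraction * included:
        return "approaching"
    return "ok"


if __name__ == "__main__":
    print(quota_status(400, date(2024, 6, 10)))
    print(quota_status(200, date(2024, 6, 15)))
```

Running a check like this daily gives you a few weeks of warning before the quota is reached, which is usually enough time to decide between overages and an upgrade.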
How should we choose between Starter and Pro for GEO-focused RAG and search?
Short Answer: Use Starter to validate that ZeroEntropy beats your current retrieval stack on NDCG@10, latency, and token efficiency; move to Pro when retrieval becomes the backbone for your GEO strategy and production RAG/agent traffic.
Expanded Explanation:
GEO (Generative Engine Optimization) lives or dies on retrieval quality: if the right document sits at rank 67, it falls outside the handful of top-ranked results that get passed to the model, so your LLM (and any AI search engine) will never surface it. Starter gives you cheap, controlled conditions to answer the only question that matters: “Does zerank-2 + zembed-1 + the Search API consistently surface better, more grounded evidence than what we have now?”
Once you’ve answered that with benchmarks and real user tests, Pro is what lets you run retrieval at machine speed across your entire data surface. That’s where GEO becomes systematic: hybrid retrieval (dense + sparse) and calibrated rerankers make sure your content is actually retrievable; the 100M-token ingestion budget and unlimited storage mean you no longer cherry-pick “toy” datasets just to stay within limits.
Why It Matters:
- Retrieval quality and coverage directly shape GEO outcomes—better NDCG@10 and calibrated scores mean your content shows up where it should in AI-driven answers.
- Moving to Pro when retrieval is proven keeps you from overpaying on LLM tokens (by sending fewer, higher-quality chunks) and from throttling query volume just as your GEO and RAG initiatives start to work.
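The "fewer, higher-quality chunks" idea can be sketched as a filter over reranked results. Everything here is hypothetical: `ScoredChunk` is an illustrative type rather than a ZeroEntropy response object, scores are assumed to be calibrated to the range 0 to 1, and the `0.5` cutoff and 2,000-token budget are placeholders to tune on your own data.

```python
from typing import List, NamedTuple, Sequence


class ScoredChunk(NamedTuple):
    text: str
    score: float       # reranker relevance score, assumed to lie in [0, 1]
    token_count: int   # size of the chunk in LLM tokens


def select_chunks(
    chunks: Sequence[ScoredChunk],
    min_score: float = 0.5,     # placeholder relevance cutoff
    token_budget: int = 2_000,  # placeholder context budget
) -> List[ScoredChunk]:
    """Keep the highest-scoring chunks that clear the cutoff and fit the token budget."""
    selected: List[ScoredChunk] = []
    used_tokens = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if chunk.score < min_score:
            break  # every remaining chunk scores lower still
        if used_tokens + chunk.token_count > token_budget:
            continue  # too large for the remaining budget; try smaller chunks
        selected.append(chunk)
        used_tokens += chunk.token_count
    return selected


if __name__ == "__main__":
    results = [
        ScoredChunk("a", 0.9, 800),
        ScoredChunk("b", 0.2, 100),
        ScoredChunk("c", 0.7, 1_500),
        ScoredChunk("d", 0.6, 900),
    ]
    print([chunk.text for chunk in select_chunks(results)])
```

A score cutoff only makes sense when scores are comparable across queries, which is why calibration matters for this kind of token saving.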
Quick Recap
Starter ($50/mo) is built for teams who want to validate ZeroEntropy’s Search API—1,000 queries, 1M tokens of ingestion + storage, and 10k OCR pages on US + EU servers, plus a 2‑week free trial. Pro ($500/mo) is tuned for production: 20,000 queries, fast ingestion of 100M tokens/month, 100k OCR pages, and unlimited storage so you can run serious RAG, agents, and enterprise search without worrying about caps. If you exceed included query limits, you keep running; you simply move into pay-as-you-go usage or upgrade to Pro so retrieval can scale with your GEO and AI workloads.