Parallel Task API: how do I run an async deep research/enrichment job and fetch the final JSON output?

Most teams hit the same wall when they try to bolt “deep research” onto an agent: a single tool call doesn’t return enough context, but multi-hop browsing chains are slow, expensive, and brittle. Parallel’s Task API exists to collapse that research pipeline into one asynchronous job that returns structured, evidence-backed JSON.

This guide walks through exactly how to:

  • Submit an async Task job for deep research or enrichment
  • Poll for status and handle latency bands
  • Retrieve the final JSON output (plus citations, rationale, and confidence via Basis)
  • Fit this into an agent/tooling setup where costs and behavior are predictable

Quick Answer

For production deep research and enrichment, the best overall choice is Task Core. If your priority is very low latency and cost for lightweight lookups, Task Lite/Base is often a stronger fit. For maximal accuracy on the hardest research problems, consider the Task Pro/Ultra tiers.


At-a-Glance Comparison

When you say “async deep research/enrichment,” you’re effectively choosing a processor tier in the Task API. Here’s how they line up for typical use:

| Rank | Option | Best For | Primary Strength | Watch Out For |
| --- | --- | --- | --- | --- |
| 1 | Task Core | Production-grade deep research & enrichment | Strong accuracy vs. cost across most workloads | Latency in the 5–20 min range |
| 2 | Task Lite/Base | Fast, inexpensive structured lookups/enrichment | Low CPM and faster completion | Shallower reasoning & narrower coverage |
| 3 | Task Pro/Ultra | Hard, open-ended, or niche research problems | Highest accuracy and cross-referencing depth | Higher CPM and longer tail latencies |

Comparison Criteria

We evaluated tiers against three dimensions that matter for agentic use:

  • Accuracy & Coverage: How well the processor uncovers, cross-references, and grounds facts needed for a schema.
  • Latency Band: Realistic time-to-completion under asynchronous execution (from seconds to ~30 minutes).
  • Cost Predictability (CPM): Cost per 1,000 tasks at each tier, so you can size workloads before a run instead of reverse-engineering token bills.
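Because CPM is a price per 1,000 tasks, sizing a batch is simple arithmetic. The sketch below shows the calculation; the rates in `EXAMPLE_CPM_USD` are made-up placeholders for illustration, not Parallel's pricing, and `estimate_batch_cost` is a hypothetical helper name.

```python
def estimate_batch_cost(num_tasks: int, cpm_usd: float) -> float:
    """Cost of a batch when pricing is quoted per 1,000 tasks (CPM)."""
    if num_tasks < 0 or cpm_usd < 0:
        raise ValueError("num_tasks and cpm_usd must be non-negative")
    return num_tasks * cpm_usd / 1000.0


# Placeholder rates for illustration only; look up the real CPM per tier
# before sizing a run.
EXAMPLE_CPM_USD = {"lite": 5.0, "core": 25.0, "ultra": 300.0}

if __name__ == "__main__":
    for tier, cpm in EXAMPLE_CPM_USD.items():
        cost = estimate_batch_cost(10_000, cpm)
        print(f"{tier}: 10,000 tasks at ${cpm:.2f} CPM -> ${cost:,.2f}")
```

With real rates plugged in, the same function lets you compare tiers for a 10k-record run before submitting anything.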

The Core Flow: From Task Creation to Final JSON

At a high level, using the Parallel Task API for async deep research/enrichment looks like this:

  1. Define your schema – what JSON fields you want back, including types and descriptions.
  2. Create a Task – POST to the Task API with your instruction, schema, and processor tier.
  3. Poll Task status – GET status until it reaches a terminal state (completed, failed, cancelled).
  4. Fetch the final JSON output – read the output block, including Basis fields (citations, rationale, confidence) for each atomic fact.
  5. (Optional) Handle web sources yourself – use cited URLs to cache, replay, or audit research.

Let’s break that down with concrete examples.


1. Designing a Task for Deep Research / Enrichment

Parallel’s Task API is designed for “programmatic research.” Instead of asking for a freeform answer, you ask it to populate a JSON structure.

Typical use cases:

  • Company or person enrichment against the open web
  • Competitive landscape reports
  • Market / technology landscape mapping
  • Regulatory or policy tracking with structured fields

Example JSON schema

Suppose you want to enrich a company with verified details:

{
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "description": "Canonical company name"
    },
    "website": {
      "type": "string",
      "description": "Primary company website URL"
    },
    "headquarters_location": {
      "type": "string",
      "description": "City and country of headquarters"
    },
    "founded_year": {
      "type": "integer",
      "description": "Year the company was founded"
    },
    "key_products": {
      "type": "array",
      "items": { "type": "string" },
      "description": "List of flagship products or services"
    }
  },
  "required": ["name", "website"]
}

You’ll send this schema to the Task API along with a natural-language instruction like:

“Research and return structured information about the company ‘Harvey’ that provides AI-native legal research tools. Use the schema to structure your findings and only include evidence-backed facts.”

The Task processor handles the rest: query planning, Search calls, Extract, cross-referencing, and Basis attribution.
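Since the schema is the contract for what comes back, it can be worth checking a returned object against it before anything downstream consumes it. The following is a minimal standard-library sketch that only checks required fields and top-level primitive types; it is not a full JSON Schema validator, and `check_output` is a hypothetical helper, not part of any Parallel SDK.

```python
from typing import Any

# Same shape as the example schema above, with descriptions omitted.
COMPANY_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "website": {"type": "string"},
        "headquarters_location": {"type": "string"},
        "founded_year": {"type": "integer"},
        "key_products": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "website"],
}

_PY_TYPES = {"string": str, "integer": int, "number": (int, float),
             "boolean": bool, "array": list, "object": dict}


def check_output(output: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shallow check passed."""
    problems = []
    for name in schema.get("required", []):
        if name not in output:
            problems.append(f"missing required field: {name}")
    properties = schema.get("properties", {})
    for name, value in output.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
            continue
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # type not covered by this sketch
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and spec["type"] in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return problems
```

For production use, a dedicated JSON Schema library would cover nested objects, array items, and formats that this sketch ignores.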


2. Creating an Async Task Job

The Task API is asynchronous by design: you submit a job, get back a task_id, and poll until completion.

Below is a generic pattern using curl. Adjust the endpoint to match the current Parallel docs (e.g., https://api.parallel.ai/v1/task):

curl -X POST "https://api.parallel.ai/v1/task" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PARALLEL_API_KEY" \
  -d '{
    "instruction": "Research and return structured information about the company \"Harvey\" that provides AI-native legal research tools. Use the schema to structure your findings and only include evidence-backed facts.",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string", "description": "Canonical company name" },
        "website": { "type": "string", "description": "Primary company website URL" },
        "headquarters_location": { "type": "string", "description": "City and country of headquarters" },
        "founded_year": { "type": "integer", "description": "Year the company was founded" },
        "key_products": {
          "type": "array",
          "items": { "type": "string" },
          "description": "List of flagship products or services"
        }
      },
      "required": ["name", "website"]
    },
    "processor": "core",
    "timeout_seconds": 1800,
    "web_search": true
  }'

The processor value can also be "lite", "base", "pro", or "ultra". timeout_seconds can go up to roughly 30 minutes for the deepest tiers, and web_search enables Parallel Search plus live crawling.
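The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl example: the endpoint URL and the body field names (`instruction`, `schema`, `processor`) are copied from that example and should be checked against the current Parallel docs, and `build_create_request` and `create_task` are hypothetical helper names.

```python
import json
import os
import urllib.request
from typing import Any, Optional

# Copied from the curl example; verify against the current docs.
API_URL = "https://api.parallel.ai/v1/task"


def build_create_request(instruction: str, schema: dict[str, Any],
                         processor: str = "core",
                         api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a Task."""
    key = api_key or os.environ.get("PARALLEL_API_KEY")
    if not key:
        raise ValueError("set PARALLEL_API_KEY or pass api_key")
    body = {"instruction": instruction, "schema": schema, "processor": processor}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {key}"},
        method="POST",
    )


def create_task(request: urllib.request.Request, timeout: float = 30.0) -> dict[str, Any]:
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the payload easy to inspect and unit-test without touching the network.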

Typical Task creation response

{
  "task_id": "task_01J123ABCDXYZ",
  "status": "queued",
  "processor": "core",
  "created_at": "2026-04-12T15:30:21Z",
  "estimated_completion_seconds": 900
}

Key fields:

  • task_id – your handle for polling and retrieving the output.
  • status – one of queued, running, completed, failed, cancelled.
  • estimated_completion_seconds – best-effort latency estimate based on tier and backlog.
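A small typed wrapper around the creation response makes the key fields explicit in code. The sketch below assumes the response shape shown in the example above; `TaskHandle` and `parse_task_handle` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional

TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})
KNOWN_STATUSES = TERMINAL_STATUSES | {"queued", "running"}


@dataclass(frozen=True)
class TaskHandle:
    task_id: str
    status: str
    estimated_completion_seconds: Optional[int] = None

    @property
    def is_terminal(self) -> bool:
        """True once the task can no longer change state."""
        return self.status in TERMINAL_STATUSES


def parse_task_handle(payload: dict[str, Any]) -> TaskHandle:
    """Validate the fields this guide relies on and wrap them in a TaskHandle."""
    task_id = payload.get("task_id")
    status = payload.get("status")
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response has no task_id")
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return TaskHandle(task_id, status, payload.get("estimated_completion_seconds"))
```

Failing fast on a missing task_id or an unknown status surfaces API changes at the boundary instead of deep inside a polling loop.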

3. Polling Task Status

Because processors can run from seconds (Lite/Base) up to ~30 minutes (Ultra), you need a polling loop or a background job.

Status endpoint pattern

curl -X GET "https://api.parallel.ai/v1/task/task_01J123ABCDXYZ" \
  -H "Authorization: Bearer $PARALLEL_API_KEY"

Example in-progress response:

{
  "task_id": "task_01J123ABCDXYZ",
  "status": "running",
  "processor": "core",
  "created_at": "2026-04-12T15:30:21Z",
  "started_at": "2026-04-12T15:31:02Z",
  "progress": {
    "phase": "researching",
    "percent_complete": 42
  }
}

Example completed response:

{
  "task_id": "task_01J123ABCDXYZ",
  "status": "completed",
  "processor": "core",
  "created_at": "2026-04-12T15:30:21Z",
  "started_at": "2026-04-12T15:31:02Z",
  "completed_at": "2026-04-12T15:38:44Z",
  "output": { ... },          // your JSON result
  "basis": { ... },           // citations, rationale, confidence
  "metrics": {
    "requests_used": 1,
    "latency_seconds": 482
  }
}

In an agent/tool context, a common pattern is:

  • Poll every 10–30 seconds for Lite/Base.
  • Poll every 1–2 minutes for Core/Pro/Ultra.
  • Apply a max wait based on your timeout_seconds or product SLA.
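The polling pattern above can be sketched as a loop with a per-tier interval and a hard deadline. The intervals below are picked from the ranges just listed and are assumptions to tune; `fetch_status` is a hypothetical callable you supply (for example, a function that GETs the status endpoint and returns the decoded JSON), so the loop itself stays independent of any HTTP client.

```python
import time
from typing import Any, Callable

TERMINAL = {"completed", "failed", "cancelled"}

# Seconds between polls, chosen from the ranges above; adjust to taste.
POLL_INTERVAL_SECONDS = {"lite": 15, "base": 15, "core": 60, "pro": 90, "ultra": 120}


def wait_for_task(fetch_status: Callable[[], dict[str, Any]],
                  processor: str = "core",
                  max_wait_seconds: float = 1800,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> dict[str, Any]:
    """Poll until the task reaches a terminal status or the deadline passes."""
    interval = POLL_INTERVAL_SECONDS.get(processor, 60)
    deadline = clock() + max_wait_seconds
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL:
            return task  # caller still has to check for failed/cancelled
        if clock() + interval > deadline:
            raise TimeoutError(f"task not finished within {max_wait_seconds}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters is a design choice that makes the loop testable without real waiting; in production the defaults are enough.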

4. Fetching the Final JSON Output

Once status is completed, you’ll find the enrichment or research result in the output block, with Basis metadata attached.

Example output structure

{
  "task_id": "task_01J123ABCDXYZ",
  "status": "completed",
  "output": {
    "name": "Harvey",
    "website": "https://www.harvey.ai/",
    "headquarters_location": "San Francisco, California, United States",
    "founded_year": 2021,
    "key_products": [
      "AI-native legal research platform",
      "Workflow automation tools for law firms"
    ]
  },
  "basis": {
    "fields": {
      "website": {
        "value": "https://www.harvey.ai/",
        "confidence": 0.99,
        "citations": [
          {
            "source_url": "https://www.harvey.ai/",
            "snippet": "Harvey is an AI-native platform for the legal profession...",
            "position": {
              "start_char": 0,
              "end_char": 120
            }
          }
        ],
        "rationale": "Homepage lists this as the primary company domain and matches branding from multiple third-party sources."
      },
      "founded_year": {
        "value": 2021,
        "confidence": 0.78,
        "citations": [
          {
            "source_url": "https://techcrunch.com/...",
            "snippet": "Harvey, founded in 2021, raised...",
            "position": {
              "start_char": 56,
              "end_char": 92
            }
          }
        ],
        "rationale": "Multiple sources report the same founding year; some are secondary coverage, so confidence is below 0.8."
      }
    }
  }
}
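For audit trails, the nested basis block is often easier to store as flat rows, one per citation. The sketch below assumes the response shape from the example above (`basis.fields.<name>.citations[]` with `source_url` and `snippet`); `citation_rows` is a hypothetical helper.

```python
from typing import Any


def citation_rows(task: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten basis.fields.*.citations into one row per (field, citation)."""
    rows = []
    fields = task.get("basis", {}).get("fields", {})
    for field_name, meta in fields.items():
        for citation in meta.get("citations", []):
            rows.append({
                "task_id": task.get("task_id"),
                "field": field_name,
                "confidence": meta.get("confidence"),
                "source_url": citation.get("source_url"),
                "snippet": citation.get("snippet"),
            })
    return rows
```

Each row can then be written to a log table or cache keyed by task_id and field.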

How to use Basis in your system

This is where Parallel’s Basis framework is critical:

  • Citations – you can log or cache source_url + snippet to support audit trails, human review, or replay.
  • Confidence – use numeric confidence to decide:
    • auto-accept above a threshold (e.g., ≥0.9),
    • flag 0.6–0.9 for human review,
    • reject below 0.6 and route to a fallback workflow.
  • Rationale – makes it easier for reviewers (or downstream agents) to understand why a field was set.

From an engineering standpoint, you’re not just getting “an answer”; you’re getting field-level evidence and calibrated confidence you can programmatically gate on.
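The confidence thresholds described above translate directly into a gating function. This is a minimal sketch using the example cutoffs (0.9 and 0.6), which you would tune for your own risk tolerance; treating a field with no Basis entry as "review" is an assumption of this sketch, and `gate_field` and `gate_output` are hypothetical names.

```python
from typing import Any


def gate_field(confidence: float, accept_at: float = 0.9, review_at: float = 0.6) -> str:
    """Map a confidence score to 'accept', 'review', or 'reject'."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= review_at:
        return "review"
    return "reject"


def gate_output(output: dict[str, Any], basis: dict[str, Any]) -> dict[str, str]:
    """Decide per output field what to do, based on its Basis confidence."""
    fields = basis.get("fields", {})
    decisions = {}
    for name in output:
        meta = fields.get(name)
        if meta is None:
            # Assumption: no Basis entry means no attached evidence,
            # so route the field to a human rather than auto-accepting.
            decisions[name] = "review"
        else:
            decisions[name] = gate_field(float(meta.get("confidence", 0.0)))
    return decisions
```

Applied to the example response above, `website` (0.99) would be accepted and `founded_year` (0.78) would be flagged for review.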


5. Choosing the Right Task Processor Tier

The Task API’s Processor architecture lets you match compute to task complexity. For async deep research and enrichment, the tradeoffs look like this:

1. Task Core (best overall)

Why it’s the top choice: Core generally sits on the Pareto frontier, with strong accuracy and coverage for most production workloads, mid-range latency, and predictable CPM.

What it does well:

  • Balanced depth: More than enough for company/person enrichment, market snapshots, and multi-field research.
  • Latency: Often in the 5–20 minute range for complex jobs, faster for simpler schemas.
  • Cost predictability: CPM is designed so you can size batch runs (e.g., 10k companies) before you start.

Watch out for:

  • If you need under-2-minute turnarounds for interactive flows, Core may be too slow; use Lite/Base instead.

Decision trigger: Use Core for your default async enrichment and research tier when you want strong accuracy and evidence without Ultra-level cost or tail latency.

2. Task Lite/Base (best for fast, cheap enrichment)

Why it fits: When you’re doing lightweight enrichment (e.g., “find website,” “high-level category,” “approximate size”), Lite/Base often give you enough signal at much lower latency and CPM.

What they do well:

  • Lower latency: Frequently in the tens-of-seconds range for simple schemas.
  • Economical at scale: Ideal for high-volume enrichment jobs where Ultra-grade depth on every field is not worth the extra cost.

Watch out for:

  • Shallower reasoning: For niche domains or complex cross-referencing, they can underperform Core/Pro.
  • Coverage: More likely to skip ambiguous fields rather than dig deeply.

Decision trigger: Use Lite/Base for bulk enrichment where you care more about throughput and cost per record than maximum research depth.

3. Task Pro/Ultra (best for hardest research problems)

Why it stands out: Pro and Ultra are designed for deep, open-ended research problems and high-stakes domains where you want maximum cross-referencing and reasoning depth.

What they do well:

  • Max depth: More aggressive querying and cross-referencing across the AI-native web index and live crawling.
  • Hard domains: Better performance on niche topics, long-horizon analyses, or heavily fragmented information.

Watch out for:

  • Higher CPM: These tiers cost more per task; reserve them for workloads that truly require the extra depth.
  • Longer latency: You’re trading speed for completeness; expect closer to the upper bound of the 5–30 minute range.

Decision trigger: Use Pro/Ultra when every missed fact is expensive, for example in regulated research, intensive due diligence, or benchmark runs where recall matters more than speed.
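The three decision triggers can be encoded as a small routing function. This sketch is one possible reading of the guidance above, not an official rule: the 2-minute and 30-minute cutoffs are taken from the latency figures in this section, "lite" and "pro" stand in for the Lite/Base and Pro/Ultra groups, and `choose_processor` is a hypothetical helper.

```python
def choose_processor(latency_budget_seconds: float, high_stakes: bool = False) -> str:
    """Pick a processor tier from a latency budget and a stakes flag."""
    # Interactive flows that need an answer in under ~2 minutes: Lite/Base.
    if latency_budget_seconds < 2 * 60:
        return "lite"
    # High-stakes work goes to Pro/Ultra only if the caller can tolerate
    # waits near the upper end of the 5-30 minute range.
    if high_stakes and latency_budget_seconds >= 30 * 60:
        return "pro"
    # Everything else: the default production tier.
    return "core"
```

Treating the latency budget as the hard constraint means a high-stakes job with a tight deadline falls back to a faster tier; you may prefer to raise an error in that case instead.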


6. Wiring Tasks into an Agent or Workflow

To make Task work inside an agent stack, treat it as a long-running tool:

  1. Tool call: Agent calls a “create_task” tool with instruction + schema + processor.
  2. Persist task_id: Store it in conversation or job state.
  3. Yield back: Return a status like “Research in progress, check back later.”
  4. Background poller / webhook: A service polls Task status or listens for completion.
  5. Completion: When status = completed, update the job and re-invoke the agent with the final output and basis as context.
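The five steps above amount to a small piece of job state keyed by task_id. Below is a minimal in-memory sketch of that state; in a real system `JobStore` would be backed by your database or queue, and all class and method names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

TERMINAL = {"completed", "failed", "cancelled"}


@dataclass
class ResearchJob:
    task_id: str
    conversation_id: str
    status: str = "queued"
    output: Optional[dict] = None
    basis: Optional[dict] = None


class JobStore:
    """In-memory stand-in for wherever conversation or job state lives."""

    def __init__(self) -> None:
        self._jobs: dict[str, ResearchJob] = {}

    def register(self, task_id: str, conversation_id: str) -> str:
        """Steps 2-3: persist the task_id and return a message for the agent."""
        self._jobs[task_id] = ResearchJob(task_id, conversation_id)
        return "Research in progress, check back later."

    def apply_update(self, task: dict[str, Any]) -> Optional[ResearchJob]:
        """Steps 4-5: record a polled or pushed payload.

        Returns the job exactly once, when it first reaches a terminal
        status, so the caller knows when to re-invoke the agent.
        """
        job = self._jobs.get(task.get("task_id", ""))
        if job is None or job.status in TERMINAL:
            return None  # unknown task, or already handled
        job.status = task.get("status", job.status)
        if job.status not in TERMINAL:
            return None
        job.output = task.get("output")
        job.basis = task.get("basis")
        return job
```

Returning the job only on the first terminal update guards against re-invoking the agent twice if a poller and a webhook both report completion.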

Because Parallel charges per request (CPM style), not by downstream tokens, your cost per Task job is known at creation. You can:

  • Cap processor tiers per workflow (e.g., Lite for free users, Core for Pro-tier customers).
  • Enforce maximum concurrent tasks or daily budgets.
  • Log per-task latency and CPM for clear SLOs.
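The caps listed above can be enforced with a small guard in front of task creation. This is a sketch under stated assumptions: the plan names and allowed tiers in `TIER_CAPS` are illustrative, the daily reset is left to the caller, and `TaskBudget` is a hypothetical class.

```python
from dataclasses import dataclass

# Illustrative plan-to-tier caps; define these to match your own product.
TIER_CAPS = {"free": {"lite"}, "paid": {"lite", "base", "core"}}


@dataclass
class TaskBudget:
    daily_limit: int
    max_concurrent: int
    used_today: int = 0
    in_flight: int = 0

    def try_start(self, plan: str, processor: str) -> bool:
        """Reserve a slot for a new task, or return False if a cap is hit."""
        if processor not in TIER_CAPS.get(plan, set()):
            return False  # tier not allowed on this plan
        if self.used_today >= self.daily_limit:
            return False  # daily budget exhausted
        if self.in_flight >= self.max_concurrent:
            return False  # too many tasks already running
        self.used_today += 1
        self.in_flight += 1
        return True

    def finish(self) -> None:
        """Release a concurrency slot when a task reaches a terminal status."""
        self.in_flight = max(0, self.in_flight - 1)
```

Because pricing is per request, `daily_limit` multiplied by the tier's CPM divided by 1,000 gives a hard ceiling on daily spend for that workflow.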

7. Methodology and Reliability Notes

Parallel positions Task as “evidence-based research” rather than opaque summarization:

  • AI-native web index + live crawling: Tasks run on Parallel’s own index and live fetch, not generic SERP scraping.
  • Basis framework: Every atomic field can carry citations, confidence, and rationale, so you can audit or reject outputs.
  • Processor benchmarking: Parallel publishes Task and FindAll benchmarks across datasets like DeepResearch Bench, WISER-Atomic, and RACER, comparing against providers like Exa, Tavily, Perplexity, OpenAI, and Anthropic.
  • Testing stance: Benchmarks typically constrain tools (e.g., only Search/Task) and fix a test window so performance is reproducible and not just a moving target of web recency.

This matters for async research: when your agent “waits” 10–20 minutes, you want to be sure that time is buying you cross-referenced facts with measurable accuracy, not just another long-form hallucination.


Final Verdict

For most production teams, the cleanest way to run async deep research or enrichment against the web is:

  • Use the Task API with Core as the default processor.
  • Define an explicit JSON schema and instruction so outputs are machine-usable.
  • Treat the Task job as a background research worker: submit, poll, and consume output + Basis fields when complete.
  • Reserve Lite/Base for fast, inexpensive bulk enrichment and Pro/Ultra for the hardest, high-stakes research jobs.

You end up with structured JSON, citations, and calibrated confidence per field, plus a clear per-request cost curve you can reason about before you launch a batch.

