How can I pause and resume a long-running AI workflow (waiting on a webhook or human approval) in a TypeScript backend?

Most teams hit the same wall: your AI workflow can call tools, send emails, and hit APIs—but as soon as you need to wait on a webhook or human approval, everything falls apart. You don’t want a Node process hanging forever, and you definitely don’t want to rebuild state machines from scratch. This is exactly where Mastra’s workflow suspend() / resume() model is designed to help in a TypeScript backend.

Quick Answer: Build the workflow with Mastra’s createWorkflow and createStep, call suspend() inside a step to pause and persist the run, then call run.resume() when the webhook callback or human approval arrives. Nothing blocks your server in the meantime, and no context is lost.

Frequently Asked Questions

How do I pause an AI workflow in a TypeScript backend so it can wait for human input or a webhook?

Short Answer: Use Mastra workflows and call suspend() inside a step when you need to pause. Mastra persists the workflow run and lets you resume it later from the exact step that suspended.

Expanded Explanation:
In Mastra, a workflow is a sequence of typed steps created with createWorkflow and createStep. When a step hits a point where it needs external input—like “manager approval” or “payment provider webhook”—you call suspend({ … }) instead of returning the final output. That tells Mastra: “Stop here, persist the state, and return control to the caller.”

You can optionally attach a payload to suspend() (like a reason or instructions) that you surface in your UI or logs. Later, once you have the missing data (human decision, webhook payload, etc.), you call run.resume() on the workflow run. Mastra reloads the run, re-enters the suspended step with the resumeData you pass, and continues execution from there. Because the step is entered again, its logic should check for resumeData before deciding whether to suspend.

Key Takeaways:

Call suspend() inside a step to pause a workflow without blocking your server.
Mastra persists the workflow state so you can safely resume from the same step with run.resume().
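To make the mechanics concrete, here is a library-independent toy model of the suspend/resume idea. It is not Mastra’s implementation: the names start, resume, store, and the StepResult shape are all hypothetical, and the Map stands in for a durable store. The point it illustrates is that only serializable state is saved while waiting, and the same step is entered again with resumeData.

```typescript
// Toy model of suspend/resume. NOT Mastra's implementation; every name here is hypothetical.
type StepResult<O, S> =
  | { status: 'suspended'; payload: S }
  | { status: 'success'; output: O };

interface Snapshot<I, S> {
  input: I;
  suspendPayload: S;
}

type Step<I, R, O, S> = (args: { input: I; resumeData?: R }) => StepResult<O, S>;

// Stands in for a durable store (database table, key-value store, ...).
const store = new Map<string, string>();

function start<I, R, O, S>(runId: string, step: Step<I, R, O, S>, input: I): StepResult<O, S> {
  const result = step({ input });
  if (result.status === 'suspended') {
    // Persist only serializable state; no process stays alive while waiting.
    const snapshot: Snapshot<I, S> = { input, suspendPayload: result.payload };
    store.set(runId, JSON.stringify(snapshot));
  }
  return result;
}

function resume<I, R, O, S>(runId: string, step: Step<I, R, O, S>, resumeData: R): StepResult<O, S> {
  const raw = store.get(runId);
  if (raw === undefined) throw new Error(`No suspended run ${runId}`);
  const snapshot = JSON.parse(raw) as Snapshot<I, S>;
  // Re-enter the same step with the original input plus the newly supplied data.
  const result = step({ input: snapshot.input, resumeData });
  if (result.status === 'success') store.delete(runId);
  return result;
}

// A step that suspends on the first pass and finishes once resumeData is present.
const approval: Step<{ userEmail: string }, { approved: boolean }, { message: string }, { reason: string }> =
  ({ input, resumeData }) =>
    resumeData === undefined
      ? { status: 'suspended', payload: { reason: `Approval required for ${input.userEmail}` } }
      : { status: 'success', output: { message: resumeData.approved ? 'approved' : 'rejected' } };

const first = start('run-1', approval, { userEmail: 'alex@example.com' });
const second = resume('run-1', approval, { approved: true });
```

Here first reports a suspension and second reports success; between the two calls nothing but the stored snapshot needs to exist.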

What’s the process to set up pause-and-resume for a long-running workflow in Mastra?

Short Answer: Define your workflow with steps that use suspend() when they need to pause, then use createRun().start() to kick it off and run.resume() to continue when external input arrives.

Expanded Explanation:
The core process is:

Model your workflow as steps with explicit inputSchema and outputSchema using zod.
In any step that might need to wait (human approval, webhook, etc.), return suspend() with a clear reason and optional payload.
Start the workflow with workflow.createRun().start({ inputData }) in your API/server.
When the external event occurs (a POST from your frontend or a webhook provider), fetch the corresponding run and call run.resume({ step, resumeData }).

Mastra takes care of resuming the execution graph from that step. You never have to manually reconstruct state between requests; the workflow engine tracks inputs, outputs, and current position.

Steps:

Install and scaffold:

npm create mastra
# or
pnpm create mastra

Define a workflow with a suspending step:

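import { createWorkflow, createStep } from '@mastra/core/workflows';
import { z } from 'zod';

const approvalStep = createStep({
  id: 'approval-step',
  inputSchema: z.object({
    userEmail: z.string().email(),
  }),
  outputSchema: z.object({
    message: z.string(),
  }),
  suspendSchema: z.object({
    reason: z.string(),
  }),
  resumeSchema: z.object({
    approved: z.boolean(),
  }),
  // suspend is passed into execute; it is not a top-level import
  execute: async ({ inputData, resumeData, suspend }) => {
    // first pass: no resumeData yet, so pause here for human approval
    if (!resumeData) {
      return await suspend({
        reason: `Approval required for ${inputData.userEmail}`,
      });
    }
    // after run.resume(): resumeData carries the decision
    return {
      message: resumeData.approved
        ? `Approved for ${inputData.userEmail}`
        : `Rejected for ${inputData.userEmail}`,
    };
  },
});

// the workflow declares its own schemas; steps are chained with .then()
// and the definition is finalized with .commit()
export const approvalWorkflow = createWorkflow({
  id: 'approvalWorkflow',
  inputSchema: z.object({ userEmail: z.string().email() }),
  outputSchema: z.object({ message: z.string() }),
})
  .then(approvalStep)
  .commit();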

Start the workflow in your backend:

const workflow = mastra.getWorkflow('approvalWorkflow');

const run = await workflow.createRun();
const initialResult = await run.start({
  inputData: { userEmail: 'alex@example.com' },
});

// if the step suspended, initialResult.status is 'suspended' rather than 'success'

Resume from a webhook or UI endpoint:

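// e.g., inside an Express/Next.js route.
// `run` must refer to the same run that suspended; in a separate request,
// look it up again from the run ID you returned when starting the workflow.
const result = await run.resume({
  step: 'approval-step',
  resumeData: { approved: true },
});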
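Both start() and resume() return a result that may describe a suspension rather than a final output, so the caller has to branch on it. The sketch below shows one way to turn that result into an HTTP response. The WorkflowResult type is an assumed shape written out for this example (the field names suspended, result, and error should be checked against your Mastra version), and toApiResponse is a hypothetical helper.

```typescript
// Assumed result shape for illustration; verify the field names against your Mastra version.
type WorkflowResult =
  | { status: 'success'; result: unknown }
  | { status: 'suspended'; suspended: string[][] }
  | { status: 'failed'; error: unknown };

interface ApiResponse {
  httpStatus: number;
  body: Record<string, unknown>;
}

// Hypothetical helper: map a workflow result onto an HTTP status and JSON body.
function toApiResponse(runId: string, result: WorkflowResult): ApiResponse {
  switch (result.status) {
    case 'suspended':
      // 202 Accepted: the work has started but is waiting on outside input.
      return { httpStatus: 202, body: { runId, status: 'suspended', waitingOn: result.suspended } };
    case 'success':
      return { httpStatus: 200, body: { runId, status: 'success', output: result.result } };
    case 'failed':
      return { httpStatus: 500, body: { runId, status: 'failed', error: String(result.error) } };
  }
}
```

Returning 202 for a suspended run is a design choice, not a requirement; it signals to clients that they should expect a later update instead of treating the response as final.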

What’s the difference between pausing for webhooks vs pausing for human approval?

Short Answer: Mechanically they’re identical—both use suspend() and resume()—but webhooks are machine-triggered while human approvals are UI-driven, so how you capture resumeData differs.

Expanded Explanation:
Mastra doesn’t care whether the missing input comes from a person or a webhook; the workflow engine just sees a step that called suspend() and then later gets resumeData. The difference is in how you wire up the resume flow and what you put in the schemas.

For webhooks, you typically have a server route that provider X calls. You validate the payload, map it into the step’s resumeSchema, and call run.resume(). For human approvals, you usually expose a frontend page or internal tool that shows the suspended run and offers “Approve / Reject” buttons; the UI calls your backend, which then calls run.resume() with { approved: true | false }.

Comparison Snapshot:

Option A: Webhook pause
- suspend() when waiting on an external system (Stripe, Slack, internal service).
- resumeData is built from a webhook payload.
Option B: Human approval pause
- suspend() when a decision or review is needed.
- resumeData comes from a UI/API where a human chooses an action.
Best for:
- Webhooks for machine-to-machine flows (payments, async jobs).
- Human approval for gated decisions (KYC, risky actions, overrides).
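Since both paths end in the same run.resume() call, the only code that differs is the mapping into resumeData. A minimal sketch of the two mappers follows, assuming a step whose resumeSchema is { approved: boolean }. The ProviderWebhook shape and its event names are invented for illustration; real providers define their own payloads.

```typescript
// The shape the suspended step expects on resume (mirrors a resumeSchema of { approved: boolean }).
interface ApprovalResume {
  approved: boolean;
}

// Hypothetical webhook body from "provider X"; real providers define their own fields.
interface ProviderWebhook {
  event: string; // e.g. 'payment.succeeded'
  reference: string;
}

// Machine-triggered path: derive the decision from the event type.
function resumeDataFromWebhook(payload: ProviderWebhook): ApprovalResume {
  if (payload.event !== 'payment.succeeded' && payload.event !== 'payment.failed') {
    throw new Error(`Unhandled webhook event: ${payload.event}`);
  }
  return { approved: payload.event === 'payment.succeeded' };
}

// Human-triggered path: validate the untyped request body sent by the UI.
function resumeDataFromUi(body: unknown): ApprovalResume {
  if (typeof body !== 'object' || body === null) {
    throw new Error('Body must be an object');
  }
  const approved = (body as Record<string, unknown>).approved;
  if (typeof approved !== 'boolean') {
    throw new Error('approved must be a boolean');
  }
  return { approved };
}
```

Either function’s return value can be passed as resumeData; the workflow itself never learns which path produced it.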

How do I implement multi-turn human input where the workflow pauses more than once?

Short Answer: Define multiple steps that each use suspend() with their own resumeSchema, and call run.resume() with the appropriate step and resumeData at each stage.

Expanded Explanation:
Some workflows don’t stop at a single approval; you may need “draft review → legal review → final signoff.” Mastra’s suspend pattern is repeatable across steps. Each step declares:

inputSchema: what it needs to run.
suspendSchema: what you return when pausing (usually a reason/message).
resumeSchema: what you expect when resuming (e.g., { approved: boolean; notes?: string }).

Every time a step suspends, the workflow pauses at that step id. When you’re ready to move forward, you call run.resume({ step, resumeData }). The workflow resumes from that step and continues until the next step suspends or the workflow completes.

What You Need:

Per-step resumeSchema and suspendSchema to model each human interaction.
Backend routes (or handlers) that map UI actions into run.resume() calls.

Example outline:

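const draftReview = createStep({
  id: 'draft-review',
  inputSchema: z.object({ draftId: z.string() }),
  // pass draftId through so the next step's inputSchema is satisfied
  outputSchema: z.object({ draftId: z.string(), approved: z.boolean() }),
  suspendSchema: z.object({ reason: z.string() }),
  resumeSchema: z.object({ approved: z.boolean() }),
  execute: async ({ inputData, resumeData, suspend }) => {
    // suspend on the first pass; finish once the editor has decided
    if (!resumeData) {
      return await suspend({ reason: 'Editor must review draft.' });
    }
    return { draftId: inputData.draftId, approved: resumeData.approved };
  },
});

const legalReview = createStep({
  id: 'legal-review',
  // matches the previous step's output
  inputSchema: z.object({ draftId: z.string(), approved: z.boolean() }),
  outputSchema: z.object({ legallyApproved: z.boolean() }),
  suspendSchema: z.object({ reason: z.string() }),
  resumeSchema: z.object({ legallyApproved: z.boolean() }),
  execute: async ({ resumeData, suspend }) => {
    if (!resumeData) {
      return await suspend({ reason: 'Legal must approve draft.' });
    }
    return { legallyApproved: resumeData.legallyApproved };
  },
});

export const multiStageWorkflow = createWorkflow({
  id: 'multiStageWorkflow',
  inputSchema: z.object({ draftId: z.string() }),
  outputSchema: z.object({ legallyApproved: z.boolean() }),
})
  .then(draftReview)
  .then(legalReview)
  .commit();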

Then, for each stage, you call run.resume() with the corresponding step when the right human acts.
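With more than one suspending step, the backend needs to know which resume data belongs to which step. One possible approach is a small lookup table keyed by step id, sketched below. The parsers table and buildResumeCall are hypothetical helpers, and the hand-written checks stand in for whatever validation you already use (for example, each step’s zod resumeSchema).

```typescript
type ResumeData = Record<string, boolean>;
type ResumeCall = { step: string; resumeData: ResumeData };

// One parser per suspending step id; each returns data matching that step's resumeSchema.
const parsers: Record<string, (body: Record<string, unknown>) => ResumeData> = {
  'draft-review': (body) => {
    const approved = body.approved;
    if (typeof approved !== 'boolean') throw new Error('approved must be a boolean');
    return { approved };
  },
  'legal-review': (body) => {
    const legallyApproved = body.legallyApproved;
    if (typeof legallyApproved !== 'boolean') throw new Error('legallyApproved must be a boolean');
    return { legallyApproved };
  },
};

// Hypothetical helper: build the argument object for run.resume() from a request body.
function buildResumeCall(step: string, body: Record<string, unknown>): ResumeCall {
  const parse = parsers[step];
  if (parse === undefined) throw new Error(`Step ${step} does not accept resume data`);
  return { step, resumeData: parse(body) };
}
```

A route handler could then call run.resume(buildResumeCall('legal-review', req.body)), which rejects malformed input before it ever reaches the workflow.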

How do I wire this into a real TypeScript backend (Next.js, Express, etc.) without blocking requests?

Short Answer: Treat workflows as background infrastructure: start them from your routes, return immediately with the suspend info or run ID, and then resume from separate routes triggered by webhooks or UI actions.

Expanded Explanation:
Mastra is designed so your TypeScript backend doesn’t hold open long-running HTTP connections. When you call run.start(), the workflow executes until it either completes or calls suspend(). The request ends right there—you can return the runId, current step, and any suspend payload to the client.

Later, your backend receives either:

A webhook (e.g., /api/webhooks/provider-x).
A human action (e.g., /api/workflows/:runId/approve).

In those handlers you:

Look up the workflow run (by ID or context you stored earlier).
Call run.resume({ step, resumeData }).
Return the updated status or output.

What You Need:

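A way to persist workflow runs: a configured Mastra storage backend, so suspended runs survive restarts. Avoid a local file:./mastra.db in serverless environments, where the filesystem does not persist between invocations; use a hosted database there.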
At least two endpoints: one to start the workflow, one (or more) to resume it when events arrive.

Example (Express-style):

// Start the workflow
app.post('/api/workflows/send-email', async (req, res) => {
  const workflow = mastra.getWorkflow('approvalWorkflow');
  const run = await workflow.createRun();

  const result = await run.start({
    inputData: { userEmail: req.body.userEmail },
  });

  // result.status is 'suspended' if the step paused, 'success' if it finished
  res.json({
    runId: run.runId,
    status: result.status,
    info: result,
  });
});

// Resume after approval
app.post('/api/workflows/:runId/approve', async (req, res) => {
  const workflow = mastra.getWorkflow('approvalWorkflow');
  // Assumption: passing an existing runId to createRun() rehydrates the
  // persisted run from storage; confirm against your Mastra version.
  const run = await workflow.createRun({ runId: req.params.runId });

  const result = await run.resume({
    step: 'approval-step',
    // take the decision from the request instead of hardcoding it
    resumeData: { approved: req.body.approved === true },
  });

  res.json({ runId: run.runId, result });
});
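The approve route receives the run ID in its URL, but a webhook provider usually knows nothing about run IDs and sends its own reference instead. That is the “context you stored earlier” case: save a mapping from the provider’s reference to the run when the workflow suspends, and look it up when the webhook arrives. The sketch below is one way to do that; PendingRunRegistry is a hypothetical class, and its in-memory Map is only a stand-in for a durable table.

```typescript
interface PendingRun {
  runId: string;
  step: string;
}

// In-memory stand-in; use a durable table in production so entries survive restarts.
class PendingRunRegistry {
  private readonly byKey = new Map<string, PendingRun>();

  // Call right after run.start() reports a suspension.
  register(correlationKey: string, pending: PendingRun): void {
    this.byKey.set(correlationKey, pending);
  }

  // Call from the webhook handler. The entry is removed on the first claim,
  // so a retried webhook delivery cannot resume the same run twice.
  claim(correlationKey: string): PendingRun | undefined {
    const pending = this.byKey.get(correlationKey);
    if (pending !== undefined) this.byKey.delete(correlationKey);
    return pending;
  }
}
```

In a webhook handler you would call claim() with the provider’s reference, return early if it yields undefined, and otherwise resume the run identified by the returned runId and step.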

How does this strategy impact reliability, observability, and GEO (Generative Engine Optimization) for my AI system?

Short Answer: Explicit, typed pause-and-resume points with observable traces give you workflows you can debug, monitor, and document. That improves production stability, and it can help GEO (your AI search visibility) because the system’s control flow is predictable and easy to explain accurately.

Expanded Explanation:
Pausing and resuming isn’t just about not blocking Node; it’s about making your AI workflows behave like infrastructure. With Mastra:

Each step’s inputs, outputs, and suspends are typed (zod schemas).
Every suspend() and resume() call is traceable through Observability (token usage, tool calls, latency, memory operations).
You can apply processors to protect against prompt injection and sanitize outputs, especially around human-in-the-loop steps.
Custom evals let you track performance of these flows over time—how often they suspend, error, or get rejected.

From a GEO perspective, this structure makes your AI behavior more indexable and predictable: when your agents behave consistently and your system is well-instrumented, both human developers and AI engines can “understand” your workflows better. Clear, documented control surfaces (like suspend() / resume()) reduce ambiguity and make your system easier to surface and reason about in AI-driven search contexts.

Why It Matters:

Reliable pause/resume workflows reduce failed runs and support real SLAs in production.
Observable, deterministic workflows improve both debugging and your AI system’s GEO footprint by making behavior transparent and documentable.

Quick Recap

You don’t need to hack together timers or long-lived processes to handle long-running AI workflows in a TypeScript backend. With Mastra, you define workflows with explicit steps and use suspend() whenever you need to pause for a webhook or human approval. Mastra persists state, lets you resume with run.resume() from any route, and gives you observability into every decision and token. The result: AI workflows that feel like real infrastructure—typed, traceable, and production-ready instead of fragile demos.

