How do I schedule an Apify Actor to run daily and send results to a webhook (or n8n/Zapier)?

Most teams hit the same wall with web data: you finally have an Apify Actor that extracts exactly what you need, but someone still has to click “Run” every day and manually move the dataset into n8n, Zapier, or a custom webhook. The good news is that Apify’s scheduling and webhooks are built exactly for this “run daily and send results somewhere else” workflow.

This guide walks through how to:

  • Schedule an Actor to run daily (or at any cron-like interval)
  • Send run results to a webhook URL
  • Wire that webhook into n8n or Zapier as a trigger for downstream automations

All of this works whether you’re using a Store Actor (e.g., Google Maps Scraper) or your own internal Actor.


The Quick Overview

  • What It Is: A repeatable, daily pipeline that runs an Apify Actor in the cloud and pushes its results to a webhook, n8n, or Zapier for further processing.
  • Who It Is For: Developers, data teams, and automation builders who rely on Apify datasets and don’t want to babysit manual runs or exports.
  • Core Problem Solved: Automating the “run → wait → export → hand off to another tool” loop so web data continually feeds your workflows and AI pipelines.

How It Works

At a high level, you’ll combine two Apify primitives: schedules and webhooks.

  • A schedule tells Apify: “Run this Actor at 07:00 every day.”
  • A webhook tells Apify: “When the run finishes (or fails), call this URL with run details and dataset info.”

Your automation tool (custom webhook, n8n, Zapier) receives the webhook payload, then:

  1. Reads the Actor run metadata (including dataset ID).
  2. Fetches the dataset via Apify API (JSON, CSV, etc.).
  3. Processes or routes the data (to a database, Google Sheets, Slack, your AI/RAG pipeline, etc.).

Typical three-phase lifecycle

  1. Configure and schedule the Actor:

    • Choose or build an Actor in Apify Console.
    • Configure its input (search query, URL list, date range, etc.).
    • Create a schedule to run daily at your desired time.
  2. Attach a webhook for run completion:

    • Create a webhook in Apify that triggers on ACTOR.RUN.SUCCEEDED (and optionally ACTOR.RUN.FAILED).
    • Point it to your custom webhook URL, n8n Webhook node, or Zapier Catch Hook.
    • Use the webhook payload to get the run ID and dataset ID.
  3. Consume and process the dataset in your tool:

    • Your endpoint (or n8n/Zapier workflow) receives the webhook.
    • It calls Apify API to download the dataset.
    • It writes the data to your system of record or kicks off downstream jobs.

Features & Benefits Breakdown

| Core Feature | What It Does | Primary Benefit |
| --- | --- | --- |
| Schedules | Run any Actor on a fixed interval (e.g., daily at 07:00 UTC) with predefined input and settings. | Keeps web data fresh for dashboards, AI/RAG indexes, and automations without manual intervention. |
| Webhooks | Sends HTTP requests when Actor events (run started, succeeded, failed) occur, with full run metadata. | Push-based integration: your tools react instantly to new data, no polling required. |
| Datasets & API Access | Stores Actor results in a dataset you can export or access via HTTP, Python, JavaScript, CLI, OpenAPI, or MCP clients. | Standard contract between your scraper and downstream systems; easy to swap Actors without breaking pipelines. |

Step 1: Schedule an Apify Actor to run daily

You can do this entirely in the Apify Console, or programmatically via API. Let’s start with the UI.

Option A: Schedule via Apify Console

  1. Open your Actor

    • Go to Apify Console.
    • Open the Actor you want to schedule (Store Actor or your own).
  2. Prepare a “template run”

    • Click Run.
    • Fill in the Input with the configuration you want repeated daily (URLs, queries, date filters, etc.).
    • Optionally adjust memory, timeout, or other run options.
    • Click Save input configuration (or use a named input, if you prefer template inputs).
  3. Create a schedule

    • In the left sidebar, go to Schedules.
    • Click Create new (or “+ New schedule”).
    • Set:
      • Name: e.g., Daily product scraper
      • Target: select Actor and choose your Actor.
      • Input: pick the stored input configuration from step 2.
    • In the Timing section:
      • Choose CRON expression or Simple interval.
      • For “once per day at a fixed time,” you can set a CRON like 0 7 * * * (runs at 07:00 UTC every day).
    • Click Create.

Now, your Actor will run daily with the same input. Each run produces a dataset (and logs) under that schedule’s runs.
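
To sanity-check what a daily expression such as 0 7 * * * means, here is a minimal sketch that computes the next firing time. It assumes UTC and only handles the "minute hour * * *" form; next_daily_run is an illustrative helper, not part of any Apify library.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(cron: str, now: datetime) -> datetime:
    """Return the next firing time for a daily cron of the form 'M H * * *'."""
    # Cron field order: minute, hour, day of month, month, day of week.
    minute, hour, day, month, weekday = cron.split()
    if (day, month, weekday) != ("*", "*", "*"):
        raise ValueError("only daily expressions like '0 7 * * *' are supported")
    candidate = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


now = datetime(2024, 6, 1, 6, 30, tzinfo=timezone.utc)
print(next_daily_run("0 7 * * *", now))  # 2024-06-01 07:00:00+00:00
```

Changing the first two fields moves the run time: 30 5 * * * would fire at 05:30 instead.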

Option B: Schedule via Apify API (Python example)

If you prefer infra-as-code, you can create a schedule using the Apify API. Roughly:

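import json

from apify_client import ApifyClient

client = ApifyClient("<YOUR_API_TOKEN>")

# Same input you’d provide in Console; the keys depend on your Actor.
actor_input = {
    "startUrls": ["https://example.com/products"],
    "maxPages": 10,
}

# The Python client takes keyword arguments rather than a single dict.
schedule = client.schedules().create(
    name="daily-product-scraper",
    cron_expression="0 7 * * *",  # every day at 07:00
    timezone="UTC",
    is_enabled=True,
    is_exclusive=True,
    actions=[
        {
            "type": "RUN_ACTOR",
            # If the username/name form is rejected, use the Actor ID from Console.
            "actorId": "your-username/your-actor",
            # The schedule API stores input as a serialized body plus content type.
            "runInput": {
                "body": json.dumps(actor_input),
                "contentType": "application/json; charset=utf-8",
            },
        }
    ],
)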

You can also manage schedules via JavaScript, raw HTTP, or the CLI; the structure is the same: a cronExpression and an actions array.
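
For the raw HTTP route, a minimal sketch of the request might look like the following. It uses only the standard library; build_schedule_body and create_schedule are illustrative helper names, and the runInput shape (a serialized body plus a contentType) is an assumption worth checking against the current API reference before relying on it.

```python
import json
import urllib.request

SCHEDULES_URL = "https://api.apify.com/v2/schedules"


def build_schedule_body(actor_id: str, actor_input: dict, cron: str = "0 7 * * *") -> dict:
    """Assemble the JSON body for a daily RUN_ACTOR schedule."""
    return {
        "name": "daily-product-scraper",
        "cronExpression": cron,
        "timezone": "UTC",
        "isEnabled": True,
        "isExclusive": True,
        "actions": [
            {
                "type": "RUN_ACTOR",
                "actorId": actor_id,
                # Assumption: input is sent as a JSON string with its content type.
                "runInput": {
                    "body": json.dumps(actor_input),
                    "contentType": "application/json; charset=utf-8",
                },
            }
        ],
    }


def create_schedule(token: str, body: dict) -> dict:
    """POST the schedule body to the Apify API and return the parsed response."""
    request = urllib.request.Request(
        SCHEDULES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


body = build_schedule_body("<YOUR_ACTOR_ID>", {"maxPages": 10})
print(body["cronExpression"])  # 0 7 * * *
```

Keeping the body builder separate from the network call makes it easy to review or version the schedule definition before anything is sent.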


Step 2: Configure a webhook to fire when the run finishes

The schedule makes runs happen automatically. Next, you want Apify to push information about each run to your endpoint, n8n, or Zapier.

What events to use

For most “daily export” flows, you care about:

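  • ACTOR.RUN.SUCCEEDED: the run finished successfully and its dataset is ready to fetch.
  • ACTOR.RUN.FAILED: optional but useful for alerts. Runs that time out or are aborted emit ACTOR.RUN.TIMED_OUT and ACTOR.RUN.ABORTED instead, so subscribe to those as well if you want to catch every unsuccessful run.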

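With the default payload template, each request body includes:

  • eventType: the event that fired, e.g., ACTOR.RUN.SUCCEEDED
  • resource.id: the run ID
  • resource.defaultDatasetId: the dataset containing your scraped results
  • Other run details under resource, such as status, actId, and startedAt/finishedAt timestamps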
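
As a rough illustration of reading those fields, the sketch below pulls the event name, run ID, and dataset ID out of a payload. It assumes the default payload template, where the event name sits in a top-level eventType field; RunInfo and extract_run_info are illustrative names, and the sample values are made up.

```python
from typing import NamedTuple, Optional


class RunInfo(NamedTuple):
    event_type: str
    run_id: str
    dataset_id: Optional[str]


def extract_run_info(payload: dict) -> RunInfo:
    """Pick out the fields a downstream consumer usually needs."""
    resource = payload.get("resource") or {}
    event_type = payload.get("eventType")
    run_id = resource.get("id")
    if not event_type or not run_id:
        raise ValueError("payload is missing eventType or resource.id")
    # defaultDatasetId may be absent on unusual payloads, so keep it optional.
    return RunInfo(event_type, run_id, resource.get("defaultDatasetId"))


sample = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"id": "abcd1234", "defaultDatasetId": "def5678", "status": "SUCCEEDED"},
}
info = extract_run_info(sample)
print(info.dataset_id)  # def5678
```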

Creating a webhook in the Apify Console

  1. Open Webhooks

    • In Apify Console, go to Integrations → Webhooks.
    • Click Create webhook.
  2. Set core properties

    • Name: Daily scraper completed → n8n
    • Event type(s):
      • Select ACTOR.RUN.SUCCEEDED (and optionally ACTOR.RUN.FAILED).
    • Actor / Task / Source filter:
      • Set it to your specific Actor (or task) so only relevant runs trigger the webhook.
    • URL: your webhook endpoint:
      • For custom backend: https://your-api.example.com/webhooks/apify-daily-scraper
      • For n8n: the URL from your Webhook node
      • For Zapier: the “Catch Hook” URL from Webhooks by Zapier
  3. Payload options

    • Use the default payload first. It includes run metadata and dataset ID.
    • If you want a lean payload, you can customize it later, but starting with full metadata makes debugging easier.
  4. Save the webhook

    • Click Create (or Save).
    • Run the Actor once manually to verify your endpoint receives a payload.

From now on, every scheduled run that matches the webhook’s filters will send an HTTP request to your endpoint when it succeeds.
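
If you would rather register the webhook from code than from Console, a minimal sketch of the API request could look like this. build_webhook_body and create_webhook are illustrative helper names, and the Actor ID and URL are placeholders; the default payload template is used because no payloadTemplate is sent.

```python
import json
import urllib.request

WEBHOOKS_URL = "https://api.apify.com/v2/webhooks"


def build_webhook_body(actor_id: str, request_url: str) -> dict:
    """Assemble the JSON body for a webhook scoped to one Actor."""
    return {
        "eventTypes": ["ACTOR.RUN.SUCCEEDED", "ACTOR.RUN.FAILED"],
        # The condition limits the webhook to runs of this Actor.
        "condition": {"actorId": actor_id},
        "requestUrl": request_url,
    }


def create_webhook(token: str, body: dict) -> dict:
    """POST the webhook definition to the Apify API and return the parsed response."""
    request = urllib.request.Request(
        WEBHOOKS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


body = build_webhook_body(
    "<YOUR_ACTOR_ID>",
    "https://your-api.example.com/webhooks/apify-daily-scraper",
)
print(body["eventTypes"])  # ['ACTOR.RUN.SUCCEEDED', 'ACTOR.RUN.FAILED']
```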


Step 3: Fetch and process the dataset in your automation (webhook/n8n/Zapier)

Once your endpoint, n8n, or Zapier receives the webhook, the pattern is:

  1. Parse the JSON payload.
  2. Grab resource.defaultDatasetId.
  3. Call the Apify dataset API to download results.
  4. Continue with your workflow: store, enrich, push to AI, notify, etc.

Example: Custom webhook handler

In a Node.js/Express backend, your handler could look like this:

import express from 'express';
import fetch from 'node-fetch';

const app = express();
app.use(express.json());

app.post('/webhooks/apify-daily-scraper', async (req, res) => {
  const event = req.body;

  // Basic shape check (optional: also verify a shared secret, IP range, etc.)
  // The default Apify payload template names the event field `eventType`.
  if (!event || !event.eventType || !event.resource) {
    return res.status(400).send('Invalid payload');
  }

  // Only react to successful runs
  if (event.eventType !== 'ACTOR.RUN.SUCCEEDED') {
    return res.status(200).send('Ignored');
  }

  const datasetId = event.resource.defaultDatasetId;
  if (!datasetId) {
    console.error('No dataset ID in event', event);
    return res.status(200).send('No dataset');
  }

  try {
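    // Fetch dataset as JSON
    const datasetUrl = `https://api.apify.com/v2/datasets/${datasetId}/items?format=json&clean=true`;
    const response = await fetch(datasetUrl);
    // Surface HTTP errors instead of trying to parse an error body as items
    if (!response.ok) {
      throw new Error(`Dataset request failed with status ${response.status}`);
    }
    const items = await response.json();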

    // TODO: Write items to your DB, send to queue, process in your AI pipeline, etc.
    console.log(`Received ${items.length} items from Apify dataset ${datasetId}`);

    res.status(200).send('OK');
  } catch (err) {
    console.error('Error fetching dataset', err);
    res.status(500).send('Error');
  }
});

app.listen(3000, () => {
  console.log('Webhook listener running on port 3000');
});

A few notes:

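  • The dataset URL uses clean=true, which omits empty items and hidden fields (those whose names start with #), so downstream tools only see the data you meant to export.
  • Whether a token is required depends on the dataset’s access settings. If the request is rejected, send your Apify API token, preferably in an Authorization: Bearer header rather than a query parameter, so it stays out of URLs and logs.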
  • You can change format=json to csv, xlsx, etc., depending on what your downstream tool expects.
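
For comparison, here is a minimal Python sketch of the same dataset request using only the standard library. dataset_items_url and fetch_items are illustrative helper names; the token is optional and, when given, is sent as a Bearer header.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def dataset_items_url(dataset_id: str, fmt: str = "json", clean: bool = True) -> str:
    """Build the dataset items URL with the format and clean query parameters."""
    query = urllib.parse.urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


def fetch_items(dataset_id: str, token: Optional[str] = None) -> list:
    """Download dataset items as parsed JSON, authenticating only if a token is given."""
    request = urllib.request.Request(dataset_items_url(dataset_id))
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


print(dataset_items_url("def5678", fmt="csv"))
# https://api.apify.com/v2/datasets/def5678/items?format=csv&clean=true
```

Note that fetch_items only makes sense with format=json; for CSV or XLSX you would read the raw response bytes instead of parsing them.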

Example: Using n8n with an Apify daily schedule

n8n pairs well with Apify when you want a visual, monitored pipeline that fans out into multiple destinations (DB + Sheets + Slack, etc.).

1. Create a Webhook trigger in n8n

  • Add Webhook as the first node in your workflow.
  • Set the HTTP Method to POST.
  • Copy the Test URL or Production URL (n8n shows you both).
  • Paste that URL into your Apify webhook’s URL field (from Step 2).

2. Test the integration

  • In n8n, click Execute workflow to start listening.
  • In Apify Console, manually Run your Actor (the same one hooked to the webhook).
  • n8n should receive the POST; the Webhook node’s output will contain the Apify event payload.

You’ll see something like:

{
  "body": {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {
      "id": "abcd1234",
      "actId": "your-username/your-actor",
      "defaultDatasetId": "def5678",
      "status": "SUCCEEDED",
      "startedAt": "2024-06-01T07:00:00.000Z",
      "finishedAt": "2024-06-01T07:02:15.000Z"
    }
  }
}

3. Add an HTTP Request node to fetch the dataset

  • Add an HTTP Request node after the Webhook.

  • Configure:

    • Method: GET

    • URL: use an expression to build the dataset URL:

      https://api.apify.com/v2/datasets/{{$json["body"]["resource"]["defaultDatasetId"]}}/items?format=json&clean=true
      
  • If the dataset is private, include your Apify API token as:

    • Query parameter token=<YOUR_API_TOKEN>, or
    • Header Authorization: Bearer <YOUR_API_TOKEN>

This node’s output will now be an array of items from the dataset.

4. Continue the n8n flow

From the HTTP Request node, you can:

  • Push data into Postgres, MySQL, or MongoDB nodes.
  • Insert/update rows in Google Sheets.
  • Send notifications via Slack, Email, or Teams.
  • Call LangChain/LlamaIndex APIs or queue another workflow that updates a vector database like Pinecone for a RAG pipeline.

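Once you’re satisfied, activate the workflow in n8n and make sure the Apify webhook points to the Production URL rather than the Test URL, which only listens while you are testing. From then on, every successful daily run from Apify triggers this flow automatically.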


Example: Using Zapier with an Apify daily schedule

Zapier is a good fit if you want lightweight, no-code routing into SaaS tools.

1. Create a Catch Hook in Zapier

  • In Zapier, start a new Zap.
  • For the trigger, choose: Webhooks by Zapier → Catch Hook.
  • Zapier gives you a unique URL. Copy it.

Paste this URL into your Apify webhook’s URL field (as described earlier) and save.

2. Test the hook

  • In Zapier, click Test trigger.
  • In Apify, manually run your Actor.
  • Zapier should detect a new hook; the sample payload will include resource.defaultDatasetId.

3. Fetch the dataset inside Zapier

A generic Catch Hook only receives the run metadata, not the dataset itself, so fetch the items with Webhooks by Zapier → GET:

  • Add an action step: Webhooks by Zapier → GET.

  • For URL, use a custom value with a reference to the dataset ID:

    https://api.apify.com/v2/datasets/{{123456789__resource__defaultDatasetId}}/items?format=json&clean=true
    

    Replace 123456789__... with the mapped field from the trigger step.

  • If private, include your Apify token as a query parameter or header.

The response will be the dataset items (likely as a JSON array). From there, you can pipe into:

  • Google Sheets: Add rows for each item.
  • Airtable: Upsert records.
  • Slack/Email: Send summaries or alerts.
  • Webhooks: Call your own API with processed data.

Ideal Use Cases

  • Best for daily reporting and dashboards: the schedule refreshes the data every morning, and webhooks push it into n8n, Zapier, or your BI backend without polling.
  • Best for AI and RAG pipelines: Website Content Crawler or a custom Actor can extract clean text daily, and your webhook or n8n flow can write it to a vector database (e.g., Pinecone) or indexer so your LLM application stays up to date.

Limitations & Considerations

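  • Time zones and run duration: Cron expressions are evaluated in the schedule’s time zone, which defaults to UTC, so set the time zone explicitly if you need a local-time run. Run duration also varies, so let the webhook, not the clock, trigger downstream work, and build idempotent consumers that can handle occasional delays or retries.
  • Webhook reliability: If your endpoint (or n8n/Zapier) is down when Apify sends the webhook, delivery can fail. Apify retries failed deliveries with backoff, but a long outage can still mean late or missed events, and a retry can deliver the same event more than once. Use: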
    • Monitoring on your endpoint
    • Apify run logs and alerts
    • Idempotent dataset consumption (safe to re-read a dataset by ID)
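
One way to make consumption idempotent is to record each run ID before processing it and skip IDs you have already seen. The sketch below uses SQLite from the standard library; the table name and the open_store and claim_run helpers are illustrative assumptions, not part of Apify.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small store of run IDs that were already handled."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed_runs (run_id TEXT PRIMARY KEY)")
    return conn


def claim_run(conn: sqlite3.Connection, run_id: str) -> bool:
    """Return True the first time a run ID is seen, False on any repeat."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO processed_runs (run_id) VALUES (?)", (run_id,)
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when it already existed.
    return cursor.rowcount == 1


conn = open_store()
print(claim_run(conn, "abcd1234"))  # True
print(claim_run(conn, "abcd1234"))  # False
```

Claiming before processing means a crash mid-processing leaves the run marked as done; if that matters, delete the row on failure or record completion in a second step.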

Pricing & Plans

Apify pricing is based primarily on platform usage (compute units, storage, and data transfer) rather than per-schedule or per-webhook fees:

  • Schedules themselves don’t have an extra cost; you pay for the Actor runs they trigger.
  • Webhooks don’t incur a separate charge; you pay for the runs and datasets that produce those events.

Typical pattern:

  • Smaller teams / prototyping: Use the free tier or lower plans to run a few daily Actors, send data to Google Sheets or n8n, and iterate quickly.
  • Larger teams / production AI pipelines: Move to higher plans or enterprise agreements to run many Actors on strict SLAs, with 99.95% uptime and SOC2/GDPR/CCPA compliance.

By plan type:

  • Usage-based / self-service: Best for developers and teams needing flexible, on-demand daily schedules, where a few Actors power multiple workflows.
  • Enterprise plans: Best for organizations needing many scheduled Actors, high-volume datasets, dedicated support, and integration into security/compliance workflows.

For exact numbers and limits, check the current pricing on apify.com, as those evolve over time.

Frequently Asked Questions

Do I need to run the Actor through a “Task,” or can I schedule the Actor directly?

Short Answer: You can schedule either, but using an Actor with saved input is usually enough.

Details:
Apify supports both Actor and Task as schedule targets. Tasks are handy if you want multiple named configurations for the same Actor (e.g., “Scrape US prices” vs. “Scrape EU prices”). For a single daily job, scheduling the Actor with a saved input configuration is simpler. Webhooks work with both—just ensure the webhook is configured to trigger on the correct Actor or task.


Can I send the dataset itself directly in the webhook body instead of fetching it via API?

Short Answer: Not realistically for non-trivial datasets; it’s better to send metadata and fetch via API.

Details:
Apify webhooks are designed to send run metadata (run ID, Actor ID, dataset ID, status, timestamps). For anything beyond tiny outputs, embedding the whole dataset in the webhook body is impractical and brittle (size limits, retries, parsing). The robust pattern is:

  1. Webhook sends metadata (including defaultDatasetId).
  2. Your endpoint/n8n/Zapier uses the dataset API to fetch the data in the format it wants.
  3. You gain control over pagination, filtering, and retry logic.
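
To illustrate the pagination point, here is a minimal sketch that walks a dataset page by page. It assumes fetch_page is a function you supply that wraps the dataset items endpoint's offset and limit query parameters; iter_items and the fake fetcher are illustrative only.

```python
from typing import Callable, Iterator, List


def iter_items(
    fetch_page: Callable[[int, int], List[dict]], page_size: int = 1000
) -> Iterator[dict]:
    """Yield dataset items page by page until an empty page comes back."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        # Stop on an empty page rather than a short one, since a filtered
        # page can come back with fewer items than the requested limit.
        if not page:
            break
        yield from page
        offset += page_size


# Stand-in for the real API call: slices an in-memory list.
data = [{"n": i} for i in range(5)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return data[offset:offset + limit]


print(len(list(iter_items(fake_fetch, page_size=2))))  # 5
```

Because the generator yields items lazily, a consumer can write each page to its destination without holding the whole dataset in memory.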

Summary

To run an Apify Actor daily and send the results to a webhook, n8n, or Zapier, you combine three pieces:

  • A schedule that runs the Actor every day with fixed input.
  • A webhook that fires on ACTOR.RUN.SUCCEEDED and sends run metadata (including dataset ID) to your endpoint.
  • A consumer workflow (custom code, n8n, Zapier) that uses the dataset ID to fetch results via the Apify API and route them into your systems, dashboards, or AI pipelines.

Once set up, this turns “we need fresh web data every day” into a hands-off pipeline: monitored runs, structured datasets, and push-based integrations that keep your tools and models up to date.

