How do I install ZenML Open Source and run a first pipeline locally (quickstart steps)?

Most teams don’t fail at “AI strategy.” They fail at the first boring hurdle: getting a reproducible pipeline running anywhere other than a Jupyter notebook. ZenML Open Source exists to break that prototype wall, and you can feel that difference in under 15 minutes on your laptop.

Quick Answer: ZenML Open Source is a metadata-first layer that standardizes ML and GenAI pipelines in plain Python while keeping your data and compute where they are. You install it with pip, initialize a local ZenML environment, and run a simple pipeline that you can later move unchanged to Kubernetes, Slurm, or your favorite orchestrator.


The Quick Overview

  • What It Is: A unified AI “metadata layer” that snapshots code, dependencies, and artifacts for every pipeline run. It plugs into your existing tools instead of replacing them.
  • Who It Is For: ML/GenAI engineers, data scientists, and platform teams who are tired of fragile scripts, “it worked on my machine” failures, and stack rewrites when moving from notebooks to production.
  • Core Problem Solved: Notebook experiments don’t translate cleanly to batch evaluation, CI, or production serving. ZenML standardizes workflows as pipelines with full lineage and local-to-cloud portability.

How It Works (High Level)

ZenML doesn’t ask you to abandon your stack. You write Python functions, decorate them with @step, and connect them into a @pipeline. ZenML runs that graph locally at first, but the same definitions can later be scheduled in Airflow, executed on Kubernetes, or scaled out on Slurm—without you rewriting everything.

Under the hood, ZenML:

  • Tracks code, dependency versions, and container state for every run.
  • Manages artifacts between steps instead of leaving you to glue-code file paths.
  • Gives you a control plane to inspect runs, lineage, and execution traces.

Let’s walk through the quickstart: install, initialize, and run your first local pipeline.


Step 1 — Prerequisites

You’ll need:

  • Python: 3.9 or later (recent ZenML releases no longer support 3.8; 3.10 or 3.11 is a safe default).
  • Virtual environment: venv, conda, or poetry—anything that isolates dependencies.
  • Basic CLI access: macOS, Linux, or WSL on Windows.
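
Before creating the project, it's worth a quick sanity check that a compatible interpreter is on your PATH:

```shell
# Verify the Python version available on PATH (3.9+ recommended)
python3 --version
```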

If your team cares about reproducibility (they should), always treat the environment as part of the experiment. ZenML amplifies this by snapshotting environment metadata, but start clean anyway.


Step 2 — Create and Activate a Virtual Environment

From a new project directory:

mkdir zenml-quickstart
cd zenml-quickstart

# Using venv
python -m venv .venv
source .venv/bin/activate    # On Windows: .venv\Scripts\activate

You should see (.venv) or similar in your shell prompt. All ZenML dependencies will now stay scoped to this project.


Step 3 — Install ZenML Open Source

Install the core package from PyPI:

pip install --upgrade pip
pip install "zenml[server]"

Why "zenml[server]"? Because ZenML uses a lightweight local server to store metadata for your pipelines. Even in quickstart mode, you want that metadata—without it, orchestration is theater.

Verify the installation:

zenml version

You should see a version string (e.g., 0.x.y) with no errors.


Step 4 — Initialize ZenML in Your Project

Inside the zenml-quickstart directory:

zenml init

This:

  • Creates a .zen folder containing local configuration.
  • Sets up a default “stack” for local execution (local orchestrator, local artifact store, local metadata).
  • Prepares the project to track pipelines and runs.

Check that the stack exists:

zenml stack describe

You should see something like:

ACTIVE STACK: default
  Orchestrator: default
  Artifact Store: default
  ...

For now, everything is local and file-based. Later, you’ll swap those components for Kubernetes, S3/GCS artifact stores, and external orchestrators without touching pipeline code.
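
When that day comes, swapping components is a matter of registering new stack pieces with the CLI rather than rewriting pipelines. A hedged sketch (the store name, stack name, and bucket are placeholders; the exact flavors available depend on which integrations you install):

```shell
# Illustrative only: register an S3 artifact store and a stack that uses it.
# Requires the S3 integration (zenml integration install s3) and AWS credentials.
zenml artifact-store register my_s3_store --flavor=s3 --path=s3://my-bucket
zenml stack register cloud_stack -o default -a my_s3_store
zenml stack set cloud_stack
```

These are configuration commands against your own infrastructure, so they only make sense once you have a real bucket and credentials in place.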


Step 5 — Understand the ZenML Building Blocks

Before writing the pipeline, keep three core concepts in mind:

  1. Step: A single unit of work (load data, train model, call LLM, run evaluation). Implemented as a Python function wrapped with @step.
  2. Pipeline: A DAG that wires steps together. Implemented as a function wrapped with @pipeline.
  3. Stack: The runtime configuration (orchestrator, artifact store, etc.). For this quickstart it’s all local.

You’ll write pure Python. ZenML handles:

  • How artifacts move between steps.
  • How runs are recorded and versioned.
  • How to reproduce or debug a particular run later.
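
One practical consequence of "pure Python" is that step logic is unit-testable on its own, before any ZenML machinery is involved. A tiny sketch (mirroring the doubling logic the quickstart pipeline uses below):

```python
# Step logic is plain Python: write it as a function first, test it
# directly, then wrap it with @step when wiring the pipeline.
def transform(value: int) -> int:
    """Double the input (same logic as the quickstart's transform step)."""
    return value * 2


assert transform(21) == 42
print(transform(21))  # 42
```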

Step 6 — Write Your First Local Pipeline

Create a file pipeline.py in the zenml-quickstart directory:

from zenml import step, pipeline


@step
def ingest_data() -> int:
    """Dummy ingestion step: returns a constant."""
    value = 42
    print(f"[ingest_data] Produced value: {value}")
    return value


@step
def transform_data(input_value: int) -> int:
    """Simple transformation step."""
    transformed = input_value * 2
    print(f"[transform_data] Transformed {input_value} -> {transformed}")
    return transformed


@step
def report_result(result: int) -> None:
    """Final step: logs the result."""
    print(f"[report_result] Final result: {result}")


@pipeline
def simple_local_pipeline():
    """Connect the steps in a DAG."""
    raw = ingest_data()
    transformed = transform_data(raw)
    report_result(transformed)


if __name__ == "__main__":
    # Trigger a pipeline run locally
    simple_local_pipeline()

What’s happening here:

  • @step wraps plain functions and turns their inputs/outputs into versioned artifacts.
  • @pipeline builds a DAG that ZenML can orchestrate, cache, and track.
  • The if __name__ == "__main__": block lets you run the pipeline as a normal Python script.

This is deliberately minimal, but structurally identical to real pipelines where steps might be:

  • Ingestion via a data warehouse.
  • Model training in PyTorch or scikit-learn.
  • RAG retrieval with LlamaIndex.
  • Agent reasoning with LangChain or LangGraph.
  • Evaluation and batch inference.

Step 7 — Run the Pipeline Locally

With your virtual environment activated:

python pipeline.py

You should see log output like:

[ingest_data] Produced value: 42
[transform_data] Transformed 42 -> 84
[report_result] Final result: 84

Behind that simple printout, ZenML has:

  • Created a pipeline run record.
  • Versioned artifacts for each step output.
  • Stored run metadata in its local database (the .zen folder marks the project root; the metadata itself lives under ZenML’s global config directory).

You can list runs:

zenml pipeline runs list

And inspect details for a specific run (replace <RUN_NAME>):

zenml pipeline runs describe <RUN_NAME>

This is where ZenML starts to diverge from “just a Python script”: every run is now diffable, trackable, and repeatable.


Step 8 — Inspect Metadata and Lineage (Local UI)

ZenML Open Source includes a lightweight server with a UI. Launch it:

zenml up

(In recent ZenML releases, zenml up has been replaced by zenml login --local; both start the same local server and dashboard.)

This will start the local ZenML server and open a web UI (or print a URL) where you can:

  • See your simple_local_pipeline runs.
  • Inspect step-level execution.
  • Explore artifacts and lineage.

This UI is your first glimpse of ZenML as a control plane over your AI workflows. In an enterprise setting, that same plane lives inside your VPC, with RBAC, centralized credentials, and audit-ready histories.

To stop the local server, run zenml down (if the server is running in the foreground, Ctrl+C also works).


Step 9 — Add a Tiny ML Example (Optional but Useful)

Let’s make the pipeline less toy-like by adding a scikit-learn step. Install the dependency:

pip install scikit-learn

Update pipeline.py:

from typing import Tuple

import numpy as np
from sklearn.linear_model import LinearRegression
from zenml import step, pipeline


@step
def generate_training_data() -> Tuple[np.ndarray, np.ndarray]:
    """Generate synthetic training data."""
    X = np.arange(0, 10).reshape(-1, 1).astype(float)
    y = 3 * X.squeeze() + 1  # y = 3x + 1
    return X, y


@step
def train_model(X: np.ndarray, y: np.ndarray) -> LinearRegression:
    """Train a simple linear regression model."""
    model = LinearRegression()
    model.fit(X, y)
    print(f"[train_model] Coeff: {model.coef_}, Intercept: {model.intercept_}")
    return model


@step
def evaluate_model(model: LinearRegression) -> None:
    """Evaluate the model on a sample input."""
    x_test = np.array([[5.0]])
    y_pred = model.predict(x_test)
    print(f"[evaluate_model] Prediction for x=5: {y_pred[0]}")


@pipeline
def training_pipeline():
    X, y = generate_training_data()
    model = train_model(X, y)
    evaluate_model(model)


if __name__ == "__main__":
    training_pipeline()

Run it:

python pipeline.py

Now you have:

  • A real model artifact tracked by ZenML.
  • A pipeline that represents your ML workflow end-to-end.

This is exactly the path from “notebook cell” to “versioned pipeline” that most teams miss.


Step 10 — What Happens When You Outgrow Local?

The point of this quickstart is a local pipeline, but you should know what you’ve unlocked:

  • Same pipeline, different stacks: Use the identical @pipeline and @step definitions when you switch to:
    • Kubernetes/Slurm-backed orchestrators.
    • Cloud artifact stores (S3, GCS, Azure).
    • Airflow or Kubeflow for scheduling and execution.
  • Zero YAML gymnastics: You describe resources in Python; ZenML handles dockerization, GPU provisioning, and scaling behind the scenes.
  • Full lineage and rollback: When a future library update breaks your agent or model, ZenML’s snapshots of code, dependency versions (e.g., specific Pydantic versions), and container state let you diff and roll back quickly.

In other words, what you just ran on your laptop is structurally production-ready. The stack changes; your pipeline code mostly doesn’t.


Features & Benefits Breakdown (In the Context of This Quickstart)

  • Python-first pipelines — What it does: lets you define @step and @pipeline in pure Python, no YAML required. Primary benefit: turn notebooks into reproducible workflows without changing languages or tools.
  • Metadata & artifact tracking — What it does: records runs, artifacts, and environment info in a local server. Primary benefit: every experiment is auditable, reproducible, and debuggable instead of “it worked on my machine.”
  • Stack abstraction — What it does: separates pipeline code from infrastructure (orchestrator, storage, etc.). Primary benefit: run the same pipeline locally today and on Kubernetes/Slurm/cloud tomorrow with minimal changes.

Ideal Use Cases for a Local ZenML Quickstart

  • Best for “notebook-to-pipeline” migrations: Because it lets you lift your existing Python code into steps and pipelines without rewriting everything in Airflow or raw Kubernetes specs.
  • Best for teams prototyping ML and GenAI agents with a path to prod: Because you can bind scikit-learn training, LlamaIndex retrieval, and LangChain/LangGraph reasoning into one DAG locally, then move that DAG to your real infrastructure.

Limitations & Considerations

  • Local stack only: This quickstart uses a local orchestrator and local artifact store. It’s perfect for learning and early experiments, but you’ll want remote stacks for real workloads. ZenML is built for that—don’t stop at local.
  • Single-user by default: The local ZenML server and metadata store are scoped to your machine. For team-wide lineage, RBAC, and centralized credentials, you’ll want ZenML Cloud or a self-hosted ZenML deployment in your VPC.

Pricing & Plans (Where This Fits)

ZenML Open Source is Apache 2.0 licensed and free to run locally or inside your own infra. When you outgrow “one engineer on a laptop,” ZenML Pro and Cloud add collaboration and governance layers.

  • Open Source (OSS): Best for individual ML/GenAI engineers or small teams needing reproducible pipelines and easy local-to-remote migration without licensing friction.
  • ZenML Pro / Cloud: Best for teams and enterprises needing guided onboarding, managed infrastructure setup, RBAC, centralized secret management, SOC2 Type II / ISO 27001 compliance, and multi-user control planes.

You can start on OSS today and later connect to ZenML Cloud without throwing away your pipelines.
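
Connecting an existing OSS setup to a hosted server is a login rather than a rewrite. A hedged sketch (the URL is a placeholder for your own workspace; older releases used zenml connect --url=... for the same purpose):

```shell
# Point your local client at a hosted ZenML server; pipelines stay unchanged
zenml login https://your-workspace.zenml.io
```

This is a configuration step against a live server, so it only applies once you have a ZenML Cloud workspace or self-hosted deployment to connect to.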


Frequently Asked Questions

Do I need Docker or Kubernetes to run this first ZenML pipeline?

Short Answer: No. The quickstart runs everything locally without Docker or Kubernetes.

Details: ZenML’s stack abstraction is what enables you to standardize on Kubernetes or Slurm later, but for this local quickstart you only need Python and a virtual environment. The default local stack uses your filesystem for artifacts and a lightweight local server for metadata. When you’re ready, you can register new stacks with Kubernetes, Slurm, or Airflow/Kubeflow orchestrators—your @step and @pipeline code stays the same.


Does ZenML replace my orchestrator (Airflow, Kubeflow, etc.)?

Short Answer: No. ZenML is a metadata layer and control plane that complements orchestrators rather than replacing them.

Details: Orchestrators like Airflow and Kubeflow are good at scheduling and executing jobs. They’re not designed to snapshot your environment, track ML/GenAI artifacts, or give you run-level lineage from raw data to agent response. ZenML sits on top: you define your workflow in Python as a pipeline, and ZenML can run it with its own local orchestrator or delegate execution to Airflow, Kubeflow, Kubernetes, or Slurm. The result is one unified, diffable pipeline definition that’s independent of any single orchestrator.


Summary

In practical terms, “installing ZenML Open Source and running a first pipeline locally” looks like this:

  1. Create a virtual environment and pip install "zenml[server]".
  2. Run zenml init to set up a local stack.
  3. Write a simple @step + @pipeline script.
  4. Execute python pipeline.py and inspect the run with zenml up.

From there, you’ve already moved beyond notebooks into a world where every run is traceable and every pipeline is portable. The same definitions you just used on your laptop can later orchestrate scikit-learn training, LangGraph loops, and RAG workflows on Kubernetes—without you rewriting the core logic every time infra changes.


Next Step

Get Started: https://cloud.zenml.io/signup