How do I deploy Neo4j on Kubernetes?

Deploying Neo4j on Kubernetes gives you a scalable, resilient way to run graph databases in production, but it also introduces complexity around storage, networking, and configuration. This guide walks through the main approaches, prerequisites, and step‑by‑step instructions so you can choose the right deployment strategy for your cluster.


Deployment options for Neo4j on Kubernetes

Before you start applying YAML manifests, it helps to decide which approach best fits your needs:

  • Neo4j Aura (fully managed, no Kubernetes required)
    If you just need a Neo4j database and don’t specifically care about running it on your own Kubernetes cluster, you can skip Kubernetes entirely and use Neo4j’s hosted service instead.

  • Self‑managed Neo4j on your own Kubernetes cluster
    If you need full control and must run Neo4j inside your own infrastructure, there are two common patterns:

    1. Helm chart or operator‑based deployment (recommended for production)
    2. Manual YAML manifests using StatefulSets, Services, and PersistentVolumes

The rest of this article focuses on self‑managed Neo4j on Kubernetes.


Prerequisites

Before deploying Neo4j on Kubernetes, ensure you have:

  • A running Kubernetes cluster

    • Managed services (GKE, EKS, AKS, etc.) or on‑prem (kubeadm, Rancher, etc.).
    • At least 3 worker nodes recommended for a clustered deployment.
  • kubectl and access credentials

    • kubectl installed and configured with context pointing to your cluster.
    • Sufficient RBAC permissions to create namespaces, StatefulSets, and PersistentVolumes.
  • Storage class configured

    • A default StorageClass that supports ReadWriteOnce (RWO) and persistent volumes.
    • For production, use durable block storage (e.g., EBS, Persistent Disk, Azure Disk, or equivalent).
  • Container registry access

    • Access to official Neo4j Docker images (usually from Docker Hub or a private mirror).
  • Basic familiarity with:

    • StatefulSets, Services, ConfigMaps, Secrets
    • PersistentVolumeClaims
    • Kubernetes networking concepts
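
The cluster-side prerequisites above can be checked quickly before installing anything. The following is a minimal sketch, assuming kubectl already points at the target cluster; the function name preflight and the default namespace neo4j are illustrative choices, not part of any Neo4j tooling.

```shell
# Pre-flight sketch: confirm nodes, storage, and RBAC before deploying.
# "preflight" is a hypothetical helper name; the namespace defaults to "neo4j".
preflight() {
  ns="${1:-neo4j}"
  kubectl get nodes            # a clustered deployment wants 3+ Ready workers
  kubectl get storageclass     # one class should be marked "(default)"
  # Each can-i call prints "yes" or "no" for the current credentials.
  kubectl auth can-i create namespaces
  kubectl auth can-i create statefulsets --namespace "$ns"
  kubectl auth can-i create persistentvolumeclaims --namespace "$ns"
}
```

Run preflight (or preflight my-namespace) and review the output before continuing with the deployment steps.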

Core design choices for Neo4j on Kubernetes

When planning how to deploy Neo4j on Kubernetes, consider these design aspects:

1. Single instance vs cluster

  • Single instance

    • Simplest setup.
    • Suitable for development, testing, or small workloads.
    • No built‑in fault tolerance across nodes.
  • Clustered Neo4j

    • Multiple members: a leader and followers/replicas.
    • Higher availability and better read scalability.
    • Requires more complex configuration (discovery, routing, network).

2. StatefulSets vs Deployments

  • StatefulSet (recommended)

    • Stable pod names and persistent identities.
    • Predictable volume mapping: each pod gets its own PVC.
    • Ideal for databases.
  • Deployment

    • Designed for stateless workloads.
    • Not ideal for persistent database storage.

For Neo4j, a StatefulSet is the standard choice.
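
Stable pod names are what make per-member addressing possible. When a StatefulSet's serviceName refers to a headless Service, each pod gets a DNS name of the form <pod>.<service>.<namespace>.svc.<cluster-domain>. A minimal sketch that prints those names, assuming the default cluster domain cluster.local and the illustrative names used in this guide (pod_fqdns is a hypothetical helper):

```shell
# Print the stable DNS name of each pod in a StatefulSet.
# Assumes the default cluster domain "cluster.local".
pod_fqdns() {
  sts="$1"; svc="$2"; ns="$3"; replicas="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${sts}-${i}.${svc}.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
}

# StatefulSet "neo4j", headless Service "neo4j", namespace "neo4j", 3 replicas
pod_fqdns neo4j neo4j neo4j 3
```

These are the addresses a Neo4j member can advertise to its peers and to clients inside the cluster.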

3. Storage

Neo4j is I/O intensive and durability is critical:

  • Use PersistentVolumeClaims (PVCs) bound to a reliable storage backend.
  • Avoid ephemeral disks for anything beyond short‑lived tests.
  • Consider:
    • Disk type (SSD vs HDD)
    • Backups & snapshots
    • IOPS and throughput constraints

4. Networking and service exposure

  • Internal access: ClusterIP services for application pods that connect to Neo4j.
  • External access:
    • NodePort, LoadBalancer, or Ingress for the Bolt and HTTP ports.
    • Consider access control, TLS, and IP restrictions.

Recommended approach: Deploy Neo4j with Helm

Using a Helm chart or an operator is the easiest way to get a maintainable Neo4j deployment. The exact chart/operator may differ by version, but the workflow is similar.

Step 1: Install Helm (if not already installed)

On your local machine:

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Or use your OS package manager.

Step 2: Add the Neo4j Helm repository

helm repo add neo4j https://helm.neo4j.com/neo4j
helm repo update

(If your environment uses a different repo, adjust accordingly.)

Step 3: Create a namespace for Neo4j

kubectl create namespace neo4j

Step 4: Prepare a values file

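Create a file named neo4j-values.yaml to configure your deployment. This example shows a single Neo4j instance with persistent storage. Key names vary between chart versions, so treat it as a template and compare it against the output of helm show values neo4j/neo4j for the version you install: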

neo4j:
  name: neo4j
  edition: "enterprise"   # or "community" if you are using that
  acceptLicenseAgreement: "yes"

  resources:
    requests:
      cpu: "1"
      memory: "4Gi"
    limits:
      cpu: "2"
      memory: "8Gi"

  # Neo4j authentication
  auth:
    enabled: true
    neo4jPassword: "StrongPasswordHere"

  # Storage
  persistence:
    enabled: true
    storageClassName: "standard"  # change to your cluster's StorageClass
    size: "50Gi"

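  # Network ports (Neo4j 4.x setting names; Neo4j 5 uses the server.* prefix,
  # e.g. server.default_listen_address and server.bolt.listen_address)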
  config:
    dbms.default_listen_address: "0.0.0.0"
    dbms.default_advertised_address: "neo4j-0.neo4j.neo4j.svc.cluster.local"
    dbms.connector.bolt.listen_address: ":7687"
    dbms.connector.http.listen_address: ":7474"

service:
  type: LoadBalancer  # or ClusterIP / NodePort
  bolt:
    enabled: true
    port: 7687
  http:
    enabled: true
    port: 7474

For production clustering, you’ll add discovery and cluster settings (covered below), but this gives you a solid single‑instance starting point.
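
Because value names differ between chart versions, it can help to compare your file against the chart's defaults and render the manifests locally before installing. A sketch using standard Helm commands; preview_release is a hypothetical helper, the release name graphdb matches the install step in this guide, and the output file names are arbitrary:

```shell
# Inspect chart defaults and render manifests without touching the cluster.
preview_release() {
  values="${1:-neo4j-values.yaml}"
  # Dump the defaults of the chart version you are about to install.
  helm show values neo4j/neo4j > chart-defaults.yaml
  # Render the manifests Helm would apply, using your values file.
  helm template graphdb neo4j/neo4j --namespace neo4j -f "$values" > rendered.yaml
}
```

After running preview_release, look through rendered.yaml for the StatefulSet, Services, and PersistentVolumeClaim settings you expect. Keys in your values file that the chart does not recognize will typically have no effect on the rendered output.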

Step 5: Install Neo4j via Helm

helm install graphdb neo4j/neo4j \
  --namespace neo4j \
  -f neo4j-values.yaml

Check the status:

kubectl get pods -n neo4j
kubectl get svc -n neo4j

You should see a neo4j-0 pod running and services exposing Bolt and HTTP.
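
Rather than polling by hand, you can block until the pods report Ready. A minimal sketch, assuming everything in the namespace belongs to this deployment; wait_for_neo4j is a hypothetical helper and the 300-second timeout is an arbitrary choice:

```shell
# Block until all pods in the namespace are Ready, then list them.
wait_for_neo4j() {
  ns="${1:-neo4j}"
  kubectl wait --for=condition=Ready pod --all --namespace "$ns" --timeout=300s &&
    kubectl get pods --namespace "$ns" -o wide
}
```

If the wait times out, kubectl describe pod <pod-name> -n neo4j usually shows why (for example, a PersistentVolumeClaim that is still Pending).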

Step 6: Connect to your Neo4j instance

  • If using a LoadBalancer:

    kubectl get svc -n neo4j
    

    Copy the external IP/hostname for the HTTP or Bolt service. Then:

    • Bolt: bolt://<EXTERNAL-IP>:7687
    • HTTP: http://<EXTERNAL-IP>:7474
  • If using ClusterIP, connect from another pod in the same cluster or port‑forward (substitute the service name reported by kubectl get svc -n neo4j if it differs):

    kubectl port-forward svc/graphdb-neo4j 7474:7474 7687:7687 -n neo4j
    

    Then connect using bolt://localhost:7687 or http://localhost:7474.


Deploying a Neo4j cluster on Kubernetes

For high availability and horizontal read scaling, deploy Neo4j as a cluster instead of a single instance.

A Neo4j cluster typically consists of:

  • 3 or more core members (participate in Raft consensus, elect a leader).
  • Optional read replicas for extra read capacity.
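
The "3 or more" figure follows from Raft's majority rule: a group of N core members needs floor(N/2) + 1 of them available to make progress, so it tolerates the loss of the remainder. The arithmetic, as a small sketch (quorum_info is a hypothetical helper):

```shell
# Majority size and tolerated failures for a Raft group of N members.
quorum_info() {
  n="$1"
  majority=$(( n / 2 + 1 ))
  tolerated=$(( n - majority ))
  echo "cores=$n majority=$majority tolerated_failures=$tolerated"
}

for n in 3 4 5; do quorum_info "$n"; done
```

Three cores tolerate one failure and five tolerate two. Four cores still tolerate only one, which is why odd member counts are the usual choice.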

Cluster‑oriented Helm configuration example

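Extend your neo4j-values.yaml with clustering options. Actual fields differ by chart version, and the dbms.mode and causal_clustering.* settings shown here are Neo4j 4.x names that Neo4j 5 removed or renamed, so check the documentation for the version you run: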

neo4j:
  name: neo4j
  edition: "enterprise"
  acceptLicenseAgreement: "yes"

  mode: "CORE"         # CORE for core cluster nodes, or READ_REPLICA for replicas
  coreServers: 3       # number of core members

  discovery:
    type: "K8S"
    k8s:
      service: "neo4j-discovery"
      namespace: "neo4j"

  auth:
    enabled: true
    neo4jPassword: "StrongPasswordHere"

  persistence:
    enabled: true
    storageClassName: "standard"
    size: "100Gi"

  config:
    dbms.mode: "CORE"
    causal_clustering.minimum_core_cluster_size_at_formation: 3
    causal_clustering.minimum_core_cluster_size_at_runtime: 3
    dbms.default_listen_address: "0.0.0.0"
    dbms.connector.bolt.listen_address: ":7687"
    dbms.connector.http.listen_address: ":7474"
    dbms.routing.enabled: "true"

service:
  type: LoadBalancer
  bolt:
    enabled: true
    port: 7687
  http:
    enabled: true
    port: 7474

Install or upgrade:

helm upgrade --install graphdb neo4j/neo4j \
  --namespace neo4j \
  -f neo4j-values.yaml

Depending on the chart version, the release creates resources along these lines:

  • A StatefulSet with pods neo4j-0, neo4j-1, neo4j-2
  • Headless service for discovery (e.g., neo4j-discovery)
  • LoadBalancer/ClusterIP service for client connectivity

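Use kubectl exec to verify cluster status (kubectl logs neo4j-0 -n neo4j is also useful if a member fails to join). dbms.cluster.overview() is a Neo4j 4.x procedure; on Neo4j 5, run SHOW SERVERS instead: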

kubectl exec -it neo4j-0 -n neo4j -- cypher-shell -u neo4j -p 'StrongPasswordHere' \
  "CALL dbms.cluster.overview();"

Manual deployment with StatefulSets (without Helm)

If you prefer full control or can’t use Helm, you can deploy Neo4j using raw manifests.

1. Create a Kubernetes Secret for Neo4j credentials

apiVersion: v1
kind: Secret
metadata:
  name: neo4j-auth
  namespace: neo4j
type: Opaque
data:
  # base64-encoded "neo4j/<password>" (here: neo4j/StrongPasswordHere)
  NEO4J_AUTH: bmVvNGovU3Ryb25nUGFzc3dvcmRIZXJl

Apply it:

kubectl create namespace neo4j
kubectl apply -f neo4j-auth-secret.yaml
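
Values under data: in a Secret must be base64-encoded, and the Neo4j Docker image expects NEO4J_AUTH in the form user/password. A quick way to produce and check the value, assuming the base64 utility is available locally; the password is the placeholder used throughout this guide:

```shell
# Encode the NEO4J_AUTH value for the Secret's data: field.
# printf avoids the trailing newline that echo would add.
encoded="$(printf '%s' 'neo4j/StrongPasswordHere' | base64)"
echo "$encoded"

# Decode again to double-check what Kubernetes will hand to the container.
printf '%s' "$encoded" | base64 --decode
```

Alternatively, kubectl create secret generic neo4j-auth --from-literal=NEO4J_AUTH='neo4j/StrongPasswordHere' -n neo4j performs the encoding for you and avoids keeping the password in a manifest file.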

2. Define a StatefulSet for a single Neo4j instance

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: neo4j
  namespace: neo4j
spec:
  serviceName: "neo4j"
  replicas: 1
  selector:
    matchLabels:
      app: neo4j
  template:
    metadata:
      labels:
        app: neo4j
    spec:
      containers:
        - name: neo4j
          image: neo4j:5-enterprise
          imagePullPolicy: IfNotPresent
          env:
            - name: NEO4J_AUTH
              valueFrom:
                secretKeyRef:
                  name: neo4j-auth
                  key: NEO4J_AUTH
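            # The enterprise image will not start unless the license is accepted.
            - name: NEO4J_ACCEPT_LICENSE_AGREEMENT
              value: "yes"
            # Neo4j 5 setting name; on 4.x use NEO4J_dbms_default__listen__address.
            - name: NEO4J_server_default__listen__address
              value: "0.0.0.0"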
          ports:
            - containerPort: 7474
              name: http
            - containerPort: 7687
              name: bolt
          volumeMounts:
            - name: neo4j-data
              mountPath: /data
      volumes: []
  volumeClaimTemplates:
    - metadata:
        name: neo4j-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "standard"
        resources:
          requests:
            storage: 50Gi

Apply it:

kubectl apply -f neo4j-statefulset.yaml

3. Expose Neo4j as a Service

apiVersion: v1
kind: Service
metadata:
  name: neo4j
  namespace: neo4j
spec:
  type: LoadBalancer   # or ClusterIP / NodePort
  selector:
    app: neo4j
  ports:
    - name: http
      port: 7474
      targetPort: 7474
    - name: bolt
      port: 7687
      targetPort: 7687

Apply it and inspect:

kubectl apply -f neo4j-service.yaml
kubectl get svc neo4j -n neo4j

For a clustered setup using raw YAML, you’ll need multiple StatefulSets or a single StatefulSet with cluster environment variables, plus discovery configuration and more advanced networking. In most cases, using Helm or an operator is much simpler.


Best practices for running Neo4j on Kubernetes

To keep your Kubernetes‑based Neo4j deployment stable and performant, follow these recommendations:

Use appropriate resource requests and limits

  • Set realistic CPU and memory requests to prevent overcommit.
  • Add limits to protect the cluster, but avoid throttling the database heavily.
  • Monitor:
    • Garbage collection
    • Page cache hit ratio
    • Overall CPU and RAM usage

Tune page cache and heap

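  • Set dbms.memory.heap.initial_size and dbms.memory.heap.max_size through config (server.memory.heap.initial_size and server.memory.heap.max_size in Neo4j 5), ideally to the same value.
  • Set dbms.memory.pagecache.size (server.memory.pagecache.size in Neo4j 5) based on your dataset and available memory.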
  • Keep heap and page cache within the container’s memory limits.
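
As a worked example of keeping heap and page cache inside the container limit, the sketch below splits a limit into heap, page cache, and headroom. The split (1 GiB reserved, then half of the remainder to each) is an illustrative assumption rather than a Neo4j recommendation, and memory_plan is a hypothetical helper; tune the numbers against your own workload.

```shell
# Split a container memory limit (in MiB) into heap, page cache, and headroom.
# The ratios are assumptions for illustration only.
memory_plan() {
  limit_mib="$1"
  headroom_mib=1024                      # OS, JVM overhead, native memory
  usable=$(( limit_mib - headroom_mib ))
  heap=$(( usable / 2 ))
  pagecache=$(( usable - heap ))
  echo "heap=${heap}m pagecache=${pagecache}m headroom=${headroom_mib}m"
}

memory_plan 8192   # an 8Gi container limit
```

The heap figure would go into both the initial and maximum heap settings, and the page cache figure into the page cache setting. What matters is that the three parts never add up to more than the container's memory limit.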

Ensure reliable storage

  • Use production‑grade storage with good IOPS.
  • Avoid hostPath volumes for production.
  • Enable snapshots or scheduled backups at the storage or Neo4j level.

Handle configuration via ConfigMaps and Secrets

  • Use ConfigMaps for Neo4j configuration files (neo4j.conf).
  • Use Secrets for passwords, certificates, and sensitive settings.
  • Mount them as environment variables or files.

Consider TLS and security

  • Enable TLS for Bolt and HTTP in production.
  • Restrict external exposure:
    • Use private LoadBalancers or Ingress with authentication where possible.
    • Limit CIDR ranges or use network policies to control access.
  • Rotate credentials periodically.
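
One way to limit which workloads can reach Neo4j is a NetworkPolicy. The sketch below writes a manifest that admits Bolt and HTTP traffic only from pods labeled role: graph-client in the same namespace. The policy name, the role: graph-client label, and the app: neo4j selector are assumptions (the selector matches the manual StatefulSet in this guide; Helm installs typically use different labels), and enforcement requires a network plugin that supports NetworkPolicy.

```shell
# Write a NetworkPolicy manifest restricting ingress to the Neo4j pods.
# Names and labels are illustrative assumptions; adjust them to your deployment.
cat > neo4j-networkpolicy.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: neo4j-allow-clients
  namespace: neo4j
spec:
  podSelector:
    matchLabels:
      app: neo4j
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: graph-client
      ports:
        - protocol: TCP
          port: 7687
        - protocol: TCP
          port: 7474
EOF
```

Apply it with kubectl apply -f neo4j-networkpolicy.yaml. A clustered deployment would also need rules that let the Neo4j pods reach each other on their cluster ports.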

Observability: logs and metrics

  • Ship logs to a centralized system (e.g., Elasticsearch, Loki, or cloud log services).
  • Expose metrics to Prometheus or your preferred monitoring stack.
  • Set alerts on:
    • Cluster health
    • Disk usage
    • Memory/CPU saturation
    • Slow queries

When to use Neo4j Aura instead of Kubernetes

While deploying Neo4j on Kubernetes gives you flexibility and control, it also means you must manage:

  • Upgrades and patching
  • Backups and disaster recovery
  • Capacity planning
  • Performance tuning and monitoring

If you’d rather offload these operational concerns, consider:

  • Neo4j Aura at https://console.neo4j.io for a fully managed, production‑ready instance.
  • Neo4j Sandbox at https://sandbox.neo4j.com for fast experimental environments.

These hosted options can be ideal for projects where owning the Kubernetes stack is not a business requirement.


Summary

To deploy Neo4j on Kubernetes you will:

  1. Prepare a Kubernetes cluster with persistent storage and kubectl access.
  2. Choose between:
    • A managed Neo4j service (Aura, Sandbox) if you don’t strictly need Kubernetes.
    • A self‑managed deployment using Helm or manual StatefulSets.
  3. For self‑managed:
    • Use a StatefulSet for Neo4j pods and PersistentVolumeClaims for data.
    • Expose Neo4j via Services (ClusterIP, NodePort, or LoadBalancer).
    • For clustering, configure multiple core members and discovery.

By following these patterns and best practices, you can run Neo4j on Kubernetes in a way that is scalable, resilient, and aligned with modern cloud‑native architectures.