Block Proto Fleet: how do I deploy it for my site and set up chip-level monitoring plus bulk actions?

Most teams exploring Proto today are asking the same question: how do I go from a single device on my desk to a managed fleet in production—with chip-level observability and the ability to push changes across thousands of units in one shot? The good news is that Proto was designed for that exact scale problem: turning Bitcoin hardware from a one-off device into an upgradeable, monitorable platform.

Quick Answer: To deploy Proto Fleet for your site, you provision devices into a managed fleet, connect them to your backend, and standardize a “site profile” that defines firmware, policies, and Bitcoin network settings. Chip-level monitoring comes from the device telemetry pipeline Proto exposes (health, temperature, error rates), and bulk actions are orchestrated through fleet-level operations like staged firmware rollouts, policy pushes, and remote diagnostics.

Why This Matters

For most organizations, Bitcoin hardware is no longer one device in a lab—it’s hundreds or thousands of units embedded into products, facilities, or customer environments. Without a fleet approach, every update becomes a support ticket, every bug fix a physical visit, and every hardware anomaly a guessing game.

Proto exists to avoid that trap. By treating hardware as an addressable, software-defined fleet, you can:

- roll out protocol and security updates on your schedule, not on a shipping cycle
- detect chip-level drift or degradation before it becomes downtime
- coordinate actions across devices (e.g., tighten spending policies, rotate keys, or adjust fee strategies) in minutes, not months

For Block, this is directly tied to economic empowerment: Bitcoin infrastructure should be reliable, transparent, and safe at scale. Proto Fleet is how we turn discrete chips into a resilient, upgradeable network.

Key Benefits:

- Centralized control over distributed devices: Manage configuration, firmware, and policies for every Proto chip from a single control plane instead of bespoke scripts and manual SSH sessions.
- Chip-level observability and risk monitoring: Track health, performance, and security-relevant signals per device, so you can respond to anomalies before they impact customer funds.
- Safe, staged bulk operations: Push updates and actions across your fleet with guardrails (canary rollouts, version pinning, and automatic rollback) so you move fast without turning your hardware into a black box.

Core Concepts & Key Points

Concept	Definition	Why it's important
Proto Fleet	A managed set of Proto-enabled devices grouped under a single control plane and API, typically aligned to a site, region, or product line.	Lets you operate Bitcoin hardware as software: consistent configuration, shared policies, and bulk operations across thousands of chips.
Site Profile	A reusable configuration for a given deployment context: Bitcoin network parameters, firmware channel, security policies, monitoring thresholds, and integration endpoints.	Standardizes deployments so every device at a site behaves the same way; new units inherit known-good settings automatically.
Chip-Level Telemetry & Actions	Fine-grained metrics and commands scoped to a single Proto chip: health, temperature, error codes, key usage, plus actions like restart, re-key, or policy update.	Enables targeted diagnostics and remediation without touching the whole fleet, and provides the data you need to make safe bulk decisions.

How It Works (Step-by-Step)

At a high level, deploying Proto Fleet for your site and enabling chip-level monitoring plus bulk actions involves five parts:

1. Prepare your environment and trust model
2. Onboard devices into a fleet and define a site profile
3. Wire telemetry into your observability stack
4. Configure chip-level monitoring and alerts
5. Orchestrate safe bulk actions and rollouts

Below is a simplified step-by-step walkthrough that matches how we see teams adopt Proto in practice.

1. Prepare Your Environment and Trust Model

Before you claim the first chip into a fleet, lock down who can see what and who can change what.

Define your roles and permissions:
- Infrastructure / DevOps: fleet configuration, rollouts, monitoring.
- Security / Custody: key policies, spending limits, approval workflows.
- Operations / Support: read-only telemetry, limited remedial actions.
Choose your integration points:
- Monitoring: e.g., Prometheus + Grafana, Datadog, or another standard stack.
- Backend: e.g., your existing services that manage accounts, balances, or on-chain workflows.
Set your risk boundaries:
- Least-stable firmware channel you’re comfortable running in production (e.g., stable vs. beta).
- Required approvals for policy changes that impact signing or withdrawal behavior.
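
The role split and approval requirement above can be written down as data before any device is enrolled. The following is a minimal sketch, assuming hypothetical role and action names (`Role`, `PERMISSIONS`, `REQUIRED_APPROVERS`, and `is_allowed` are illustrative and are not part of any Proto Fleet API). It shows one way to check both "may this role run this action?" and "has it been approved by every required role?".

```python
from enum import Enum


class Role(Enum):
    INFRA = "infrastructure"
    SECURITY = "security"
    OPERATIONS = "operations"


# Hypothetical action names; substitute whatever your control plane exposes.
PERMISSIONS = {
    Role.INFRA: {"read_telemetry", "restart_device", "push_firmware"},
    Role.SECURITY: {"read_telemetry", "edit_signing_policy"},
    Role.OPERATIONS: {"read_telemetry", "restart_device"},
}

# Actions that need sign-off from every listed role before they may run.
REQUIRED_APPROVERS = {
    "edit_signing_policy": {Role.SECURITY, Role.OPERATIONS},
}


def is_allowed(actor: Role, action: str, approvals: frozenset = frozenset()) -> bool:
    """Return True if `actor` may run `action` given the roles that approved it."""
    if action not in PERMISSIONS[actor]:
        return False
    required = REQUIRED_APPROVERS.get(action, set())
    # Every required approver role must be present in `approvals`.
    return required <= set(approvals)
```

Keeping the mapping in version control gives you an auditable record of who could change what at any point in time.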

In Block’s own Bitcoin ecosystem work (Bitkey and Proto), we treat this as non-negotiable: hardware that touches value must be controlled via explicit, auditable policies, not ad hoc scripts.

2. Onboard Devices into a Fleet and Define a Site Profile

You can think of “fleet” as the logical container, and “site profile” as the configuration blueprint for that container.

Create a fleet for your site
Typically, you’ll group by:
- physical site (e.g., “North America DC-1” or “EU Retail Cluster”)
- product environment (e.g., “Consumer Wallets – Production”)
Define a site profile (template fields you’ll usually include):
- Network settings:
  - Bitcoin mainnet/testnet/regtest
  - Preferred node endpoints, fallback nodes, timeout behavior
- Firmware channel and version policy:
  - Track stable, beta, or pinned versions for Proto firmware and supporting software
  - Allowed downgrade/rollback behavior
- Security & key policies:
  - Allowed spending paths and limits
  - Required co-signers / multi-party thresholds (if applicable)
  - Rate limits for signing, address derivation, and backup operations
- Telemetry policy:
  - Which chip-level metrics are mandatory
  - Sampling frequency and retention expectations
- Access control for actions:
  - Who can push firmware updates vs who can restart a device vs who can alter signing policy
Enroll devices into the fleet
- During device provisioning, each Proto-enabled chip:
  - Presents a hardware-bound identity (attestation)
  - Is bound to your fleet and site profile via a secure registration flow
- Once enrolled, the device:
  - Pulls the assigned firmware, keys, and policies
  - Starts streaming telemetry consistent with your site profile
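
To make the site profile idea concrete, here is a minimal sketch of a profile as a validated data structure, plus a drift check against what a device reports. Every field name, the `SiteProfile` class, and the `find_drift` helper are assumptions made for illustration; the actual profile schema may look quite different.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_NETWORKS = {"mainnet", "testnet", "regtest"}
ALLOWED_CHANNELS = {"stable", "beta", "pinned"}


@dataclass(frozen=True)
class SiteProfile:
    """Hypothetical site profile; field names are illustrative only."""
    name: str
    network: str
    firmware_channel: str
    pinned_firmware: Optional[str] = None
    node_endpoints: tuple = ()
    telemetry_interval_s: int = 60

    def validate(self) -> list:
        """Return a list of human-readable problems (empty if the profile is sane)."""
        problems = []
        if self.network not in ALLOWED_NETWORKS:
            problems.append(f"unknown network: {self.network}")
        if self.firmware_channel not in ALLOWED_CHANNELS:
            problems.append(f"unknown firmware channel: {self.firmware_channel}")
        if self.firmware_channel == "pinned" and not self.pinned_firmware:
            problems.append("pinned channel requires pinned_firmware")
        if not self.node_endpoints:
            problems.append("at least one node endpoint is required")
        if self.telemetry_interval_s <= 0:
            problems.append("telemetry_interval_s must be positive")
        return problems


def find_drift(profile: SiteProfile, reported: dict) -> list:
    """Compare a device's reported state with its assigned profile."""
    drift = []
    if reported.get("network") != profile.network:
        drift.append(f"network: expected {profile.network}, got {reported.get('network')}")
    if profile.firmware_channel == "pinned" and reported.get("firmware") != profile.pinned_firmware:
        drift.append(f"firmware: expected {profile.pinned_firmware}, got {reported.get('firmware')}")
    return drift
```

Validating the profile before any device inherits it keeps a typo from propagating to every unit at the site.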

From this point on, new hardware destined for that site can be zero-touch: as soon as it checks in, it joins the fleet with the right configuration.

3. Wire Telemetry into Your Observability Stack

Chip-level monitoring is valuable only if it shows up where your teams already live.

Select the telemetry you care about
Typical Proto-level metrics teams pull into their monitoring stack:
- Hardware health:
  - Temperature and voltage bands
  - Error codes and fault counters
  - Reset / reboot history
- Cryptographic activity:
  - Signing frequency and latency
  - Key access patterns (e.g., abnormal spikes)
  - Failed or rejected operations
- Connectivity and sync:
  - Connection to Bitcoin nodes (latency, error rates)
  - Sync status and height divergence
- Policy and config drift:
  - Deviations from site profile
  - Outdated firmware versions
  - Disabled or degraded protection mechanisms
Connect Proto Fleet to your observability tools
- Use the telemetry endpoints (gRPC/REST/WebSocket depending on your stack) to export metrics into:
  - Prometheus / OpenTelemetry collectors
  - Datadog, New Relic, or similar SaaS platforms
- Map device IDs to your own identifiers (site, rack, customer account) so on-call teams can find the right hardware fast.
Establish baselines and dashboards
- Build dashboards that answer:
  - “How healthy is my fleet?”
  - “Which site or firmware version is generating the most incidents?”
  - “Are cryptographic operations behaving within normal limits?”
- Use baselines to distinguish “this one chip is hot” from “this entire batch has a systemic issue.”
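
One simple way to separate "this one chip is hot" from "this entire batch has a systemic issue" is to compare each device with its batch median, and the batch median with a fleet-wide baseline. The sketch below assumes temperature readings in degrees Celsius keyed by device ID; the function name, the 10-degree tolerance, and the input shape are illustrative choices, not values taken from Proto.

```python
from statistics import median


def classify_batch(readings: dict, fleet_baseline_c: float, tolerance_c: float = 10.0) -> dict:
    """Classify one batch of temperature readings (device ID -> degrees Celsius).

    A device is "hot" if it sits more than `tolerance_c` above its batch median.
    The batch is "shifted" if its median sits more than `tolerance_c` above the
    fleet baseline, which points at a systemic cause rather than a single chip.
    """
    if not readings:
        raise ValueError("readings must not be empty")
    batch_median = median(readings.values())
    hot_devices = sorted(
        device for device, temp in readings.items()
        if temp - batch_median > tolerance_c
    )
    return {
        "batch_median_c": batch_median,
        "hot_devices": hot_devices,
        "batch_shifted": batch_median - fleet_baseline_c > tolerance_c,
    }
```

The median is used instead of the mean so that a single overheating chip does not drag the batch reference upward and mask itself.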

Block’s philosophy here is the same as for our internal AI agent framework goose: systems that matter should not be opaque. Telemetry needs to be rich enough that an engineer can understand and debug behavior without guessing.
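
Part of making telemetry debuggable is attaching your own identifiers (site, rack, customer account) to every sample before it reaches a dashboard. As a sketch, the function below renders per-device gauge readings in the Prometheus text exposition format, merging in labels from an inventory lookup. The metric name, the inventory structure, and the `to_exposition` helper are assumptions for illustration; how samples actually leave the fleet's telemetry endpoints is not shown here.

```python
def _escape(value: str) -> str:
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_exposition(metric: str, samples: dict, inventory: dict) -> str:
    """Render gauge samples (device ID -> value) with inventory labels attached.

    `inventory` maps device ID -> dict of extra labels such as site or rack.
    Devices missing from the inventory are still exported, with only device_id.
    """
    lines = [f"# TYPE {metric} gauge"]
    for device_id in sorted(samples):
        labels = {"device_id": device_id, **inventory.get(device_id, {})}
        rendered = ",".join(
            f'{key}="{_escape(value)}"' for key, value in sorted(labels.items())
        )
        lines.append(f"{metric}{{{rendered}}} {samples[device_id]}")
    return "\n".join(lines) + "\n"
```

With site and rack present as labels, an on-call engineer can filter a dashboard down to the affected hardware without consulting a separate spreadsheet.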

4. Configure Chip-Level Monitoring and Alerts

Once telemetry flows, you can define concrete, enforceable monitors that trigger human and automated responses.

Device health alerts
- Thresholds for temperature, voltage, and error rates.
- Alert when a device flaps between online/offline, or restarts more than N times in a window.
Security-sensitive signals
- Sudden spikes in signing volume from a single device.
- Attempts to use deprecated key paths or unauthorized addresses.
- Policy downgrades or firmware rollbacks outside of a planned maintenance window.
Firmware and policy drift
- Devices on an unsupported firmware version.
- Site profile mismatches (e.g., device on testnet inside a mainnet-only site).
Automated remediation hooks
- For low-risk issues (e.g., transient connectivity problems), you might authorize automatic:
  - soft restarts
  - node endpoint failover
- For high-risk signals (e.g., suspected key misuse), alerts should:
  - immediately page the right team
  - automatically quarantine the device: freeze signing or restrict certain operations until reviewed
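
The "restarts more than N times in a window" monitor and the low-risk versus high-risk routing described above can be sketched in a few lines. Signal names, response names, and the `RestartMonitor` class are hypothetical; a real deployment would express the same rules in its alerting system of choice.

```python
from collections import deque


class RestartMonitor:
    """Flags devices that restart more than `max_restarts` times in `window_s` seconds."""

    def __init__(self, max_restarts: int, window_s: float):
        self.max_restarts = max_restarts
        self.window_s = window_s
        self._events = {}  # device ID -> deque of restart timestamps

    def record(self, device_id: str, timestamp: float) -> bool:
        """Record a restart (timestamps must be non-decreasing); True means alert."""
        events = self._events.setdefault(device_id, deque())
        events.append(timestamp)
        # Drop restarts that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window_s:
            events.popleft()
        return len(events) > self.max_restarts


# Hypothetical signal names mapped to responses.
LOW_RISK = {"connectivity_flap": "soft_restart", "node_timeout": "failover_node"}
HIGH_RISK = {"suspected_key_misuse", "unplanned_firmware_rollback"}


def respond(signal: str) -> list:
    """Return the ordered responses for a signal; unknown signals go to a human."""
    if signal in HIGH_RISK:
        return ["page_oncall", "quarantine_device"]
    if signal in LOW_RISK:
        return [LOW_RISK[signal]]
    return ["open_ticket"]
```

Routing unknown signals to a ticket rather than to an automatic action keeps new failure modes from being "fixed" by a remediation that was never designed for them.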

Chip-level monitoring is not just about “is it online?” It’s a continuous feedback loop that informs how you manage bulk actions safely.

5. Orchestrate Safe Bulk Actions and Rollouts

With a fleet defined and telemetry in place, you can treat Proto like any other production-grade infrastructure: you roll out changes with blast radius in mind.

Common bulk actions include:

Firmware rollouts
- Select a target: “all devices at Site A on version ≤ X”
- Define a rollout strategy:
  - Phase 1: canary group (e.g., 1–2% of devices)
  - Phase 2: 20–30% of the site
  - Phase 3: remaining devices
- Use chip-level telemetry to decide whether to advance, pause, or roll back:
  - If signing latency or error rates spike for the canary, stop and investigate.
Policy changes
- Bulk-update:
  - daily withdrawal limits
  - required co-signers or thresholds
  - allowed address formats or script types
- Require explicit approvals:
  - For example, dual-control where Security + Ops must both approve a fleet-wide policy change.
Configuration updates
- Change Bitcoin node endpoints across a site when migrating backends.
- Adjust fee policies in response to network conditions.
Bulk diagnostics
- Issue a “status sweep” across the fleet:
  - gather extended health data
  - validate key material and attestations
  - confirm compliance with your security baseline
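
The firmware rollout pattern above (select devices on version ≤ X, then widen in phases) can be sketched as a small planning function. The cumulative percentages, the dotted-integer version format, and the `plan_rollout` helper are assumptions for illustration only.

```python
def parse_version(version: str) -> tuple:
    """Parse a dotted version such as "1.4.0" into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def plan_rollout(devices: dict, max_version: str, cumulative_pct=(2, 25, 100)) -> list:
    """Split eligible devices into rollout phases.

    `devices` maps device ID -> current firmware version. Only devices at or
    below `max_version` are targeted. `cumulative_pct` gives the share of
    targets that should be covered by the end of each phase.
    """
    ceiling = parse_version(max_version)
    targets = sorted(d for d, v in devices.items() if parse_version(v) <= ceiling)
    phases, start = [], 0
    for pct in cumulative_pct:
        # Integer ceiling division so a non-empty target set always gets a canary.
        end = -(-len(targets) * pct // 100)
        end = max(start, min(end, len(targets)))
        phases.append(targets[start:end])
        start = end
    return phases
```

For 100 eligible devices the default plan yields phases of 2, 23, and 75 devices. Sorting by ID keeps the plan deterministic; in practice you would likely pick the canary cohort deliberately, for example by site or rack.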

To keep this safe, we recommend:

- Versioned operations: Every bulk action should be associated with a versioned config or firmware artifact you can roll back to.
- Scoped blast radius: Always start with a limited cohort: by site, by rack, or as a sample across the fleet.
- Automatic rollback conditions: Predefine metrics that trigger an automatic rollback if exceeded (e.g., error rate > X%, signing latency > Y ms).
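
Predefined rollback conditions are easiest to reason about when they are a pure function of the canary's metrics. The sketch below assumes two hypothetical metrics, `error_rate` (a fraction) and `p95_latency_ms`; the threshold values stand in for the X and Y placeholders above and are not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackThresholds:
    max_error_rate: float      # fraction, e.g. 0.01 for 1%
    max_p95_latency_ms: float


def decide(metrics: dict, limits: RollbackThresholds) -> str:
    """Return "advance", "pause", or "rollback" for the current rollout phase."""
    required = ("error_rate", "p95_latency_ms")
    # Missing telemetry is not a pass: hold the rollout until data arrives.
    if any(name not in metrics for name in required):
        return "pause"
    if (metrics["error_rate"] > limits.max_error_rate
            or metrics["p95_latency_ms"] > limits.max_p95_latency_ms):
        return "rollback"
    return "advance"
```

For example, with `RollbackThresholds(0.01, 250.0)`, a canary reporting a 5% error rate returns `"rollback"`, while a canary with no latency data returns `"pause"`.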

When we applied this style of automation to our own internal tools at Block, such as goose for code and infrastructure, we saw substantial improvements: 50–75% less development time on certain projects and a 40% increase in production code shipped per engineer. The same pattern applies to Proto: disciplined automation plus tight observability yields both speed and safety.

Common Mistakes to Avoid

Treating Proto devices as one-off appliances instead of a fleet:
Without a fleet model and site profiles, you end up with “config snowflakes” that are impossible to maintain. Always group devices and standardize their behavior.
Pushing bulk updates without telemetry-driven guardrails:
Rolling out new firmware or policies to all devices at once—without canaries, baselines, and rollback criteria—is the fastest way to turn a routine update into an incident. Design your rollout strategy before you press “go.”
Ignoring governance around high-risk actions:
Allowing any engineer to modify signing policies or firmware channels breaks your risk model. Separate roles, require approvals for sensitive fleet actions, and ensure every change is auditable.

Real-World Example

Imagine you operate a network of Bitcoin-enabled kiosks in multiple countries. Each kiosk embeds a Proto chip responsible for secure key management and transaction signing. You want to:

- migrate some sites from testnet to mainnet,
- deploy a new firmware version that improves signing performance, and
- tighten spending policies in regions with emerging fraud patterns.

Here’s how Proto Fleet helps:

1. Define site profiles for each region (e.g., “US-Kiosks-Mainnet,” “EU-Kiosks-Mainnet,” “APAC-Kiosks-Testnet”), specifying the Bitcoin network, firmware channel, and base security policies.
2. Enroll every kiosk’s Proto chip into the appropriate fleet and profile. The devices automatically align their config and start streaming telemetry.
3. Use telemetry dashboards to confirm that current firmware is stable: signing success rate, latency, health metrics all within expected ranges.
4. Roll out the new firmware:
   - Start with 2% of kiosks in a single region as a canary.
   - If telemetry stays stable over a defined window (e.g., 24–48 hours), expand to 25%, then 100% in that region, then replicate the same pattern in others.
5. Apply tighter policies in higher-risk regions:
   - Bulk-update the site profile for those fleets (lower per-transaction limits, stricter co-signing requirements).
   - Monitor signing activity to ensure customer experience remains acceptable.
6. When metrics show improved performance and stable risk posture, reuse the same profile for new deployments. Every new kiosk you ship can join the fleet and inherit a known-safe configuration on day one.

Pro Tip: Before you run your first fleet-wide update, simulate it in a shadow environment or at a small, geographically isolated site. Use the same telemetry and alerting rules you’d use in production, and predefine rollback triggers. This gives you a dry run that validates your process, not just your code.

Summary

Deploying Proto Fleet for your site is fundamentally about treating Bitcoin hardware as a managed, observable, and upgradeable system—just like the rest of your infrastructure. You define fleets and site profiles, wire chip-level telemetry into your existing observability tools, and then use that visibility to drive safe, staged bulk actions.

Done well, this approach reduces on-site interventions, shortens the time between discovering a vulnerability and patching it across your fleet, and gives your teams confidence that critical signing hardware is behaving as expected. That’s how we think about Proto at Block: not as a single device, but as part of a broader ecosystem that must be open, inspectable, and operable at scale.
