
How do I run Oxen.ai batch inference on a dataset and save the outputs to a new branch/commit?
Running batch inference on a versioned dataset and saving the outputs as a new branch/commit is exactly the kind of loop Oxen.ai is built for: pick a model, run it over your data, persist the predictions as a new artifact, and keep a reproducible history of “which model produced which outputs from which data.”
Quick Answer: Use Oxen.ai to (1) version your input dataset in a repo, (2) run batch inference with a model via the API or your own script, and (3) write the outputs (e.g., JSONL/CSV/Parquet) back into the repo on a new branch and commit. This creates a clean lineage: base data on one commit, model predictions on another, with the ability to diff, review, and roll back like code.
Why This Matters
If you’re serious about production AI, “I think we ran v3 of the model on some S3 snapshot” isn’t good enough. You need to know:
- Which dataset snapshot you used
- Which model (and version) generated the predictions
- What changed between inference runs
Running Oxen.ai batch inference on a dataset and saving the outputs to a new branch/commit gives you that lineage. You can experiment freely—try multiple models, thresholds, or prompts—without losing track of what came from where. And because everything is versioned in one place, you avoid the usual tangle of “misc_predictions_v2_final_really.csv” in random buckets.
Key Benefits:
- Reproducible experiments: Every inference run is tied to a specific dataset commit and model version, so you can recreate results later.
- Safe iteration: Use branches to try new models or parameters without polluting your main dataset until you’re ready.
- Team-wide review: Product, research, and creative stakeholders can diff, comment, and approve inference outputs like any other dataset change.
Core Concepts & Key Points
| Concept | Definition | Why it's important |
|---|---|---|
| Oxen dataset repository | A Git-like repo in Oxen.ai that versions large assets like data tables, images, and model outputs. | Keeps every dataset snapshot and inference result tied to a commit, not a loose file in S3. |
| Batch inference run | Applying a model to every row/sample in a dataset (or subset) and writing predictions to a structured output file. | Lets you propagate model improvements (or prompt changes) across entire datasets in a controlled, repeatable way. |
| Branch + commit for outputs | Creating a new branch from a base commit, adding a predictions file/column, and committing that change. | Encodes “this model run” as a discrete change set you can diff, review, and merge into main when you’re confident. |
How It Works (Step-by-Step)
At a high level, you:
- Version your input dataset in an Oxen repo.
- Run batch inference on that dataset using a model (Oxen endpoint or your own).
- Write predictions back into the repo on a new branch and commit.
Below is what that looks like in practice.
1. Prepare and version your input dataset
You want a clean, versioned starting point before you run inference.
- Create or identify the Oxen repo that holds your dataset (e.g., `ox/your-dataset`).
- Ensure the data you want to run inference on is committed—typically as:
  - A table (CSV/Parquet/JSONL)
  - A folder of images/audio/video files plus a metadata table
- Note the commit or branch you want to base your inference run on (e.g., `main` at commit `abc123`).
Example (conceptual) CLI-style flow:
```bash
# Clone the repo locally
oxen clone ox/your-dataset
cd your-dataset

# Make sure you're on the branch/commit you want to run on
oxen checkout main
oxen log   # find the commit you want as the base
```
The key: lock in which dataset snapshot you’re about to run inference on.
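Before kicking off a long run, it can be worth a quick sanity check that the snapshot you checked out actually contains the columns your inference script expects. A minimal sketch using only the standard library (the file path and column names are illustrative, not part of any Oxen API):

```python
import csv

def check_columns(path, required):
    """Verify that a CSV has every column the inference script will read."""
    with open(path, newline="") as f:
        header = next(csv.reader(f))
    missing = [col for col in required if col not in header]
    if missing:
        raise ValueError(f"{path} is missing columns: {missing}")
    return header

# Example: the inference script below expects a "text" column
# check_columns("data/train.csv", ["text"])
```

Failing fast here is much cheaper than discovering a schema mismatch halfway through a million API calls.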
2. Create a new branch for the inference outputs
Treat each major inference run as a branch. That way you can:
- Run multiple models in parallel (e.g., `gpt4-labeling`, `llama3-labeling`)
- Keep `main` clean until you've validated the outputs
```bash
# Create a new branch for this inference run
oxen checkout -b gpt4-classification-2024-04-01
```
Name it with model + task + date so you can recognize it later.
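A tiny helper keeps that naming convention consistent across runs (the model + task + date format just mirrors the suggestion above; nothing here is an Oxen API):

```python
from datetime import date

def inference_branch_name(model, task, run_date=None):
    """Build a branch name like 'gpt4-classification-2024-04-01'."""
    d = run_date or date.today()
    return f"{model}-{task}-{d.isoformat()}"
```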
3. Run batch inference against the dataset
Now you’ll iterate over the dataset and call a model. Oxen.ai gives you two main patterns:
- Use Oxen.ai endpoints (recommended): call hosted models or your fine-tuned model via API.
- Use your own infra: run inference on your own GPUs/servers, then just push outputs back to the Oxen repo.
A typical workflow with Oxen’s API:
- Identify the model or endpoint you want to use:
- A base model from Oxen’s catalog (e.g., text classifier, vision model, LLM, etc.)
- A custom model you’ve fine-tuned in Oxen (“zero-code fine-tuning to go from dataset to a custom model in a few clicks,” then “deploy to serverless endpoints in one click”).
- Write a small script that:
- Reads rows from your dataset file(s)
- Calls the endpoint for each row (or in batches)
- Stores predictions in memory or streams them to a new file
Pseudocode sketch:

```python
import csv
import os

import requests

INPUT_FILE = "data/train.csv"
OUTPUT_FILE = "predictions/train_with_labels.csv"
OXEN_ENDPOINT_URL = "https://api.oxen.ai/v1/endpoints/your-model-id"
OXEN_API_KEY = "YOUR_API_KEY"

def call_model(text):
    resp = requests.post(
        OXEN_ENDPOINT_URL,
        headers={"Authorization": f"Bearer {OXEN_API_KEY}"},
        json={"input": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["prediction"]  # depends on the specific endpoint schema

# Make sure the output directory exists before writing predictions
os.makedirs(os.path.dirname(OUTPUT_FILE), exist_ok=True)

with open(INPUT_FILE, newline="") as fin, open(OUTPUT_FILE, "w", newline="") as fout:
    reader = csv.DictReader(fin)
    fieldnames = reader.fieldnames + ["model_label"]
    writer = csv.DictWriter(fout, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row["model_label"] = call_model(row["text"])
        writer.writerow(row)
```
For large datasets, you’ll want batching, retries, and concurrency, but structurally it’s the same: read from the versioned dataset, call a model, write structured predictions.
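A hedged sketch of what "batching, retries, and concurrency" can look like, using only the standard library (the retry counts and backoff values are arbitrary, and `fn` stands in for whatever endpoint call you use):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def with_retries(fn, attempts=3, backoff=1.0):
    """Wrap a flaky call so transient failures are retried with exponential backoff."""
    def wrapped(arg):
        for i in range(attempts):
            try:
                return fn(arg)
            except Exception:
                if i == attempts - 1:
                    raise  # give up after the last attempt
                time.sleep(backoff * (2 ** i))
    return wrapped

def run_batch(fn, inputs, max_workers=8):
    """Call fn over all inputs concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(with_retries(fn), inputs))
```

In the script above, you would replace the per-row `call_model(...)` loop with something like `run_batch(call_model, texts)` and zip the predictions back onto the rows.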
4. Save predictions into the repo as a new file or column
Once the script finishes, you’ll have outputs such as:
- A new file (e.g., `predictions/train_predictions.jsonl`)
- An updated table with extra columns (e.g., `train_with_labels.parquet`)
- Per-sample artifacts (e.g., a caption JSON per image)
Add these outputs to your Oxen repo on the inference branch:
```bash
# Still on the gpt4-classification-2024-04-01 branch
oxen status

# Add the new predictions artifact(s)
oxen add predictions/train_with_labels.csv

# Commit with a descriptive message
oxen commit -m "Batch inference: gpt-4.1 classification on train.csv"
```
Now you’ve encoded “this inference run” as a single, reviewable commit.
5. Link the inference run to model metadata (optional but recommended)
To keep “which model did we use?” answerable forever, record model details alongside the output. You can:
- Add a small `metadata/inference_run.json` file in the repo recording:
  - Endpoint ID
  - Model name/version
  - Hyperparameters or prompt template
  - Date/time and script version
Example `metadata/inference_run.json`:

```json
{
  "model_endpoint": "oxen://endpoints/gpt4-classifier-v1",
  "base_model": "gpt-4.1",
  "dataset_commit": "abc123",
  "script_version": "inference_v3.py",
  "run_started_at": "2026-04-01T10:05:00Z",
  "run_completed_at": "2026-04-01T11:23:14Z"
}
```
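Generating that file from the inference script itself keeps it honest. A sketch (field names mirror the example above; how you obtain the dataset commit hash depends on your setup):

```python
import json
from datetime import datetime, timezone

def write_run_metadata(path, model_endpoint, base_model, dataset_commit,
                       script_version, started_at, completed_at=None):
    """Record which model ran over which dataset commit, and when."""
    metadata = {
        "model_endpoint": model_endpoint,
        "base_model": base_model,
        "dataset_commit": dataset_commit,
        "script_version": script_version,
        "run_started_at": started_at,
        "run_completed_at": completed_at or datetime.now(timezone.utc).isoformat(),
    }
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)
    return metadata
```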
Then add + commit:
```bash
oxen add metadata/inference_run.json
oxen commit -m "Record metadata for gpt-4.1 inference run"
```
6. Review, diff, and optionally merge back to main
Before letting these predictions influence downstream behavior, review them.
You can:
- Compare old vs new tables (e.g., `train.csv` vs `train_with_labels.csv`)
- Spot-check rows in the Oxen UI with product/creative stakeholders
- Run evaluation scripts against a labeled validation set
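The evaluation step can be as simple as comparing predicted labels against a gold column on a labeled validation split (the column names here are assumptions about your schema):

```python
import csv

def accuracy(path, gold_col="label", pred_col="model_label"):
    """Fraction of rows where the model's label matches the gold label."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return 0.0
    correct = sum(1 for r in rows if r[gold_col] == r[pred_col])
    return correct / len(rows)
```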
When you’re confident the outputs look good:
```bash
# Switch to main and merge the inference branch
oxen checkout main
oxen merge gpt4-classification-2024-04-01
oxen push
```
Now your main branch reflects the dataset + predictions, and the full history of how you got there is preserved.
Common Mistakes to Avoid
- Running inference on an unpinned dataset snapshot: If you just point your script at “latest.csv” in some bucket, you won’t be able to reproduce results later. Always base your inference branch on a specific Oxen commit and record that commit hash in your metadata.
- Overwriting raw data in place: Don't replace your original `train.csv` with a version that includes predictions without keeping the original in history or on a separate branch. Use new files or new branches so you can always get back to "raw" vs "model-augmented" data.
Real-World Example
Say you’re building a content moderation system. You’ve got a table of 1M user comments in an Oxen repo (ox/moderation-dataset) and you’ve fine-tuned a text classifier on a subset using Oxen’s “Train Models” flow, then deployed it to a serverless endpoint.
You now want to:
- Run the classifier on all 1M comments
- Store the predicted label + confidence per comment
- Keep the raw comments and model predictions clearly separated
You:
- Check out `main` on `ox/moderation-dataset` at commit `abc123` (the curated comments snapshot).
- Create a branch `moderation-model-v2-2026-04-01`.
- Run a batch script that hits your Oxen endpoint and writes `predictions/comments_with_labels.parquet`.
- Add + commit both the predictions file and a `metadata/inference_run.json` that references the endpoint.
- Share the branch with your trust & safety team; they spot-check a few thousand rows and approve.
- Merge the branch back into `main` so your downstream pipelines can rely on the new "label" and "confidence" columns.
Six months later, when legal asks “exactly which model labeled this content?” you don’t have to guess. You point to the branch, commit, and metadata—everything is versioned and auditable.
Pro Tip: For multi-model comparisons, create one branch per model (e.g., `llama3-moderation`, `gpt4-moderation`) with identically structured output tables. You can then diff them directly in Oxen or downstream analytics without mixing predictions in the same file.
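With identically structured tables, comparing two model branches downstream takes only a few lines. A stdlib sketch that assumes both files share an `id` key and a `model_label` column (both names are assumptions about your schema):

```python
import csv

def disagreement_rate(path_a, path_b, key="id", label_col="model_label"):
    """Share of common rows where two models' predicted labels differ."""
    def load(path):
        with open(path, newline="") as f:
            return {r[key]: r[label_col] for r in csv.DictReader(f)}
    a, b = load(path_a), load(path_b)
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    return sum(1 for k in common if a[k] != b[k]) / len(common)
```

A high disagreement rate is a good signal for which rows to send to human review first.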
Summary
Running Oxen.ai batch inference on a dataset and saving the outputs to a new branch/commit gives you a reproducible, reviewable way to evolve your data and models together. You:
- Pin a dataset commit
- Branch for a specific model run
- Run inference and write predictions to a structured file
- Commit outputs (plus metadata) back into the repo
- Review and, if you’re happy, merge to main
That’s the loop: version datasets → fine-tune or select a model → run batch inference → version outputs → iterate. It’s how you move from “it worked once on my laptop” to a disciplined, production-grade AI workflow.