
How do I sign up for Oxen.ai and create my first private repo?
Most teams hit the same wall the first time they try to bring AI work into versioned, reproducible workflows: where do we actually put the data, and how do we keep it private while we iterate? Signing up for Oxen.ai and creating your first private repo is the fastest way to move from ad‑hoc S3 buckets and random folders to a disciplined, Git-like workflow for datasets and models.
Quick Answer: To sign up for Oxen.ai, go to oxen.ai/register, create a free Explorer account, and verify your email. Once you’re in, use the “New Repository” flow, choose Private, and initialize your first repo for datasets or model weights directly in the UI or via the CLI.
Why This Matters
If you can’t answer “which data trained which model?” on demand, you’re flying blind. Private repositories in Oxen.ai give you a controlled space to version datasets, model weights, and other large AI assets without exposing them publicly, while still collaborating with a small team.
Instead of pushing massive binaries through Git or juggling half-zipped archives in S3, you get a single source of truth with history, access control, and a clean path from dataset → fine-tuned model → deployed endpoint.
Key Benefits:
- Own your AI assets: Keep training data, model weights, and evaluation artifacts in private repos while maintaining full version history.
- Collaborate securely: Invite only the teammates who need access (up to plan limits) and keep sensitive data out of public view.
- Ship faster with discipline: Move from dataset upload to fine-tuning and deployment in a few clicks, without losing track of which version is in production.
Core Concepts & Key Points
| Concept | Definition | Why it's important |
|---|---|---|
| Oxen.ai account | Your identity and workspace on Oxen.ai, created via email/password, magic link, or GitHub login. | Required to create, own, and collaborate on repositories, run models, and manage billing. |
| Private repository | A version-controlled repo in Oxen.ai where only invited collaborators can view or modify data and model artifacts. | Lets you keep sensitive datasets and proprietary models locked down while retaining Git-like history and collaboration. |
| Explorer plan limits | The free “Explorer” tier with unlimited public repos, 5 private repositories (max 3 collaborators each), 50 GB storage, and 50 GB transfer. | Sets the guardrails for your first private repo and helps you plan how many datasets/models to onboard without surprises. |
How It Works (Step-by-Step)
At a high level, the workflow is:
- Sign up and verify your Oxen.ai account.
- Create your first private repo.
- Initialize it with data and wire it into your model workflow.
Below is the step-by-step breakdown.
1. Create and verify your Oxen.ai account
- Go to the registration page
  - Open: https://www.oxen.ai/register
- Choose how you want to sign up
  - Email + password:
    - Enter your username or email address.
    - Set a password.
    - Click Sign up (or equivalent onboarding button).
  - GitHub login:
    - Click Continue with GitHub.
    - Authorize Oxen.ai to use your GitHub identity.
  - Magic link option (if available in your region/UI):
    - Enter your email.
    - Click Send a magic link.
    - Open the email and click the link to log in without a password.
- Verify your email (recommended)
  - Check your inbox for a verification or welcome email.
  - Click the verification link to confirm your account.
  - This helps prevent access issues later and is often required for certain features.
- Log in to your account
  - Go to https://www.oxen.ai and click Log in.
  - Use one of:
    - Log in with password,
    - Continue with GitHub, or
    - A magic link, if that’s your normal flow.
  - If you ever forget your password, use the Forgot password? link on the login page to recover it.
At this point, you have an Oxen.ai account on the Explorer (Free Forever) plan, which includes:
- Unlimited public repositories with unlimited collaborators.
- 5 private repositories, maximum 3 collaborators each.
- 50 GB of data storage.
- 50 GB of data transfer.
You can upgrade later to Hacker or Pro if you outgrow these limits.
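Before you start uploading, it can help to check a planned setup against those numbers. The following is a minimal sketch, not an Oxen.ai API: the limits are hard-coded from the figures quoted above, `PlanLimits` and `check_plan` are hypothetical names, and treating 1 GB as 10^9 bytes is an assumption, since the plan description does not say which unit is used.

```python
from dataclasses import dataclass

GB = 1000 ** 3  # assumption: decimal gigabytes


@dataclass(frozen=True)
class PlanLimits:
    """Explorer limits as quoted in this article (not fetched from Oxen.ai)."""
    private_repos: int = 5
    collaborators_per_private_repo: int = 3
    storage_bytes: int = 50 * GB


def check_plan(private_repos, collaborators, storage_bytes, limits=PlanLimits()):
    """Return a list of problems; an empty list means the setup fits the plan."""
    problems = []
    if private_repos > limits.private_repos:
        problems.append(
            f"{private_repos} private repos exceeds the limit of {limits.private_repos}"
        )
    if collaborators > limits.collaborators_per_private_repo:
        problems.append(
            f"{collaborators} collaborators exceeds the limit of "
            f"{limits.collaborators_per_private_repo} per private repo"
        )
    if storage_bytes > limits.storage_bytes:
        problems.append(
            f"{storage_bytes / GB:.1f} GB of data exceeds "
            f"{limits.storage_bytes / GB:.0f} GB of storage"
        )
    return problems


if __name__ == "__main__":
    # One private repo, two collaborators, a 12 GB dataset.
    for problem in check_plan(private_repos=1, collaborators=2, storage_bytes=12 * GB):
        print("Over limit:", problem)
```

An empty result means the planned repo fits within Explorer; any returned message is a signal to trim the upload or consider Hacker or Pro.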
2. Create your first private repository
Once you’re logged in:
- Open the repositories view
  - From the Oxen.ai dashboard, navigate to your Repositories (or similar) section.
  - Look for a New Repository or Create Repo button.
- Start the “New Repository” flow
  - Click New Repository.
  - You’ll typically be asked for:
    - Repository name: use something descriptive and artifact-first, e.g., `customer-support-chat-logs`, `fashion-product-images-v1`, or `podcast-transcripts-en`.
    - Description (optional but recommended): explain what lives here and how it’s used, e.g.:
      - “Labeled support conversations for fine-tuning a response ranking model.”
      - “Product image dataset for multimodal search experiments.”
- Set the repo visibility to Private
  - In the visibility section, select Private.
  - This ensures:
    - Only you and invited collaborators can see or clone the repo.
    - Datasets, model weights, and evaluation logs stay out of public search and discovery.
- Confirm plan limits
  - If this is your first private repo on Explorer, you’re well within limits.
  - If you already have several private repos, keep in mind:
    - Max 5 private repositories on Explorer.
    - Up to 3 collaborators per private repo.
  - If you hit the limit, you’ll be prompted to either archive/delete another private repo or upgrade to Hacker or Pro.
- Create the repo
  - Click Create or Create Repository.
  - Oxen.ai will create your empty private repo with an initial main branch (or equivalent).
You now have a private, versioned space ready for datasets, labels, and model artifacts.
3. Initialize the repo with data and wire it into your workflow
With your private repo created, the next step is to bring it to life:
- Upload your initial dataset or assets
  - In the UI, open the new repo.
  - Use Upload or Add Files to bring in:
    - CSV/Parquet tables (tabular datasets).
    - Text/JSONL files (for LLM training/eval).
    - Images, audio, or video files (for multimodal tasks).
    - Model weights or checkpoints.
  - Oxen.ai will version these files as part of the repo, similar to how Git versions code.
- Organize data into a clear layout
  - Keep it predictable from day one:
    - `data/raw/` for raw dumps.
    - `data/clean/` for processed/filtered data.
    - `labels/` or `annotations/` for human labels.
    - `models/` for weights.
  - It’s much easier to maintain “dataset lineage” when each transformation step has its own directory and commit.
- Add a README describing usage
  - In your repo, create a `README.md` explaining:
    - What the dataset/model contains.
    - Where the data came from.
    - How it’s used in model training/fine-tuning.
    - Any privacy/compliance caveats.
  - This is especially important in private repos, where future collaborators won’t have public docs to lean on.
- Invite collaborators (optional)
  - From the repo settings, add collaborators’ emails or usernames.
  - Remember the Explorer plan constraint:
    - Max 3 collaborators on private repos.
  - For each collaborator, assign the right level of access:
    - Read-only (for reviewers, product, creative).
    - Write access (for ML engineers, data scientists).
- Connect to models, fine-tuning, and deployment
  - Once your dataset lives in a private repo, you can:
    - Use it to fine-tune a model in Oxen.ai’s UI with zero-code flows.
    - Explore the model library and test models against your private data (within your account).
    - Deploy fine-tuned models to serverless endpoints in one click for use in apps and services.
  - The repo becomes the backbone of your loop:
    - Upload → curate → fine-tune → deploy → iterate.
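If you prepare files locally before uploading, the directory layout and README described in this step can be scaffolded with a few lines of Python. This is a sketch under stated assumptions: it only creates local folders and a README skeleton, it does not talk to Oxen.ai, and `scaffold_repo` plus the template headings are illustrative names rather than anything the platform requires.

```python
from pathlib import Path

# Directory layout suggested in this article.
LAYOUT = ["data/raw", "data/clean", "labels", "models"]

# README skeleton covering the four points listed above (headings are illustrative).
README_TEMPLATE = """# {name}

## What this contains

## Where the data came from

## How it is used in training / fine-tuning

## Privacy and compliance caveats
"""


def scaffold_repo(root, name):
    """Create the suggested folders and a README skeleton under `root`.

    Existing folders and an existing README.md are left untouched.
    Returns the sorted list of directories now present, relative to `root`.
    """
    root = Path(root)
    for sub in LAYOUT:
        (root / sub).mkdir(parents=True, exist_ok=True)

    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(README_TEMPLATE.format(name=name), encoding="utf-8")

    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


if __name__ == "__main__":
    # Hypothetical repo name, used only for illustration.
    for directory in scaffold_repo("customer-support-chat-logs", "customer-support-chat-logs"):
        print(directory)
```

Running it twice is safe: folders are created with `exist_ok=True`, and a README you have already edited is not overwritten.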
Common Mistakes to Avoid
- Using public repos for sensitive data: Always choose Private for anything containing user data, proprietary content, or licensed assets. Public repos are great for demos and benchmarks, not production customer data.
- Treating the repo like a dump folder: Avoid random, unstructured uploads with no README. Version control is only as useful as the structure and documentation you impose. Take 5 minutes to define folders and commit messages that future-you will understand.
- Ignoring storage and transfer limits: On the Explorer plan, you have 50 GB storage and 50 GB transfer. If you upload a single 49 GB blob, you leave no room for iterations or additional datasets. Start smaller, and consider upgrading if you’re routinely working with very large assets.
- Not tracking which data trained which model: Keep your training scripts, configs, and a simple log in the repo, so you can answer “which data trained which model?” at a glance. The log should record:
  - Dataset version (commit/branch).
  - Model architecture and base checkpoint.
  - Training run metadata.
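One lightweight way to keep such a log is an append-only JSON Lines file committed alongside your training scripts. The sketch below is an assumption about how you might structure it, not an Oxen.ai feature: the file name, the field names, and the functions `log_training_run` and `runs_for_commit` are all hypothetical, and the commit ID and checkpoint name in the usage section are placeholders.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_training_run(log_path, dataset_commit, dataset_branch,
                     architecture, base_checkpoint, **run_metadata):
    """Append one training-run record to a JSON Lines log and return it."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "dataset": {"commit": dataset_commit, "branch": dataset_branch},
        "model": {"architecture": architecture, "base_checkpoint": base_checkpoint},
        "run": run_metadata,
    }
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode keeps earlier runs intact; one JSON object per line.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def runs_for_commit(log_path, dataset_commit):
    """Return every logged run that was trained on the given dataset commit."""
    with Path(log_path).open(encoding="utf-8") as f:
        return [r for r in map(json.loads, f) if r["dataset"]["commit"] == dataset_commit]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    log_training_run(
        "runs/training_log.jsonl",
        dataset_commit="abc123",
        dataset_branch="main",
        architecture="decoder-only transformer",
        base_checkpoint="example-base-model",
        epochs=3,
        learning_rate=2e-5,
    )
    print(runs_for_commit("runs/training_log.jsonl", "abc123"))
```

Because each run is a single line, the log diffs cleanly between commits, and looking up which models a given dataset version produced is a simple filter.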
Real-World Example
Imagine you’re leading an internal project to fine-tune a support chatbot on your company’s real conversation logs. Those logs are sensitive: they contain user issues, internal policy references, and sometimes personally identifiable information (PII). You can’t toss that into a public GitHub repo or an open S3 bucket and hope nobody notices.
Here’s what you do instead:
- You go to oxen.ai/register and create an Explorer account.
- You create a private repository called `support-chat-logs-finetune`.
- You upload:
  - An anonymized, filtered dataset under `data/clean/2026-04-01/`.
  - A label file indicating sentiment, resolution status, or escalation labels.
  - A README documenting the anonymization process and usage policy.
- You invite two collaborators (one data scientist, one product manager), staying under the 3-collaborator limit.
- In Oxen.ai’s UI, you kick off zero-code fine-tuning of a base LLM using this dataset.
- Once the model meets internal benchmarks, you deploy it to a serverless endpoint directly from the platform, wired into your internal tools.
The dataset never leaves your private repo, every version is tracked, and you can always tell which dataset version trained the model that’s live in production.
Pro Tip: Treat each major dataset change (new labeling guidelines, new data sources, new filters) as a commit with a clear message, e.g., `feat: add Q1 2026 tickets, remove PII fields`. When a model suddenly improves (or regresses), you can correlate that behavior directly to the commit that changed the training data.
Summary
Signing up for Oxen.ai and creating your first private repo is the fastest way to put discipline around your AI assets without building infrastructure from scratch. You get:
- A free Explorer account with 5 private repos (up to 3 collaborators each), 50 GB of storage, and 50 GB of transfer.
- Private, versioned repositories where datasets and model artifacts stay secure but fully traceable.
- A clean path from data to deployed model: version datasets → fine-tune in a few clicks → deploy to serverless endpoints.
Instead of asking “where did this model come from?” on release day, you’ll be able to point to a specific private repo, commit, and dataset version.