How do I sign up for Tonic Fabricate and generate a relational dataset from a schema?

Most teams show up to synthetic data with a familiar tension: you know exactly what your relational schema should look like, but you don’t want to spend days hand-writing seed data or copying production into a risky “temporary” environment. Tonic Fabricate is designed to break that tradeoff—letting you sign up quickly, describe the dataset you want, and generate a fully relational database from your schema in minutes, not sprints.

Quick Answer: Tonic Fabricate is Tonic’s agentic synthetic data product that turns schemas and natural language prompts into fully relational, production-shaped datasets. You sign up for Fabricate, provide or describe your schema, and let the Data Agent generate realistic, cross-table consistent data you can export directly into your dev, staging, or demo environments.


The Quick Overview

  • What It Is: Tonic Fabricate is a synthetic data generation platform that uses an agentic “Data Agent” to build relational databases and unstructured artifacts (like PDFs and DOCX) from scratch—no production data required.
  • Who It Is For: Engineering, QA, data, and AI teams that need realistic, relational test data or demo environments without touching live customer data.
  • Core Problem Solved: You can’t safely copy production into lower environments, and hand-crafted mock data never mirrors real complexity. Fabricate generates relational synthetic data that looks and behaves like production, while keeping you out of the compliance blast radius.

How It Works

At a high level, using Tonic Fabricate for relational data is a three-step workflow: get access, define your schema and constraints, then generate and export synthetic data. Under the hood, the Data Agent uses foundation models with strong domain knowledge to infer distributions, relationships, and realistic values across your tables, so you end up with a database that “feels” like production but contains no real identities.

  1. Sign up and access Fabricate:
    Create your Tonic account, get access to Tonic Fabricate, and choose how you want to work (UI, prompt-based workflow, or API/SDK).

  2. Provide or describe your relational schema:
    Either paste or upload your existing schema (DDL), or ask the Data Agent to design one based on your domain (e.g., “SaaS subscriptions with users, organizations, and invoices”). Define relationships, keys, and constraints so Fabricate can preserve referential integrity.

  3. Generate, iterate, and export your dataset:
    Use natural language prompts to specify volume, distributions, or edge cases (e.g., “generate 100K users with 10% churn and power-law activity”). Fabricate generates fully relational synthetic data you can export as SQL or import directly into databases like PostgreSQL or MySQL, alongside other formats like CSV or unstructured documents when needed.
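To make the workflow concrete, here is a minimal sketch of what a programmatic version of these three steps could look like. The endpoint shape, field names, and `build_generation_request` helper are all hypothetical, invented for illustration; consult the Fabricate API documentation for the real interface.

```python
import json

# Example DDL for a small SaaS schema -- the "provide your schema" step.
SCHEMA_DDL = """
CREATE TABLE organizations (
    id SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    org_id INTEGER NOT NULL REFERENCES organizations(id),
    email TEXT UNIQUE NOT NULL
);
"""

def build_generation_request(ddl: str, prompt: str, row_counts: dict) -> str:
    """Assemble a JSON job description: schema + natural language prompt.
    The payload layout here is illustrative, not Fabricate's actual API."""
    payload = {
        "schema_ddl": ddl,
        "prompt": prompt,
        "row_counts": row_counts,
        "export_format": "sql",
    }
    return json.dumps(payload, indent=2)

request_body = build_generation_request(
    SCHEMA_DDL,
    "Generate realistic SaaS data; roughly 10% of organizations on a premium plan.",
    {"organizations": 5_000, "users": 100_000},
)
print(request_body)
```

The point of the sketch is the separation of concerns: the schema fixes structure and referential integrity, while the prompt and row counts carry the business-level intent.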


Step 1: Signing up for Tonic Fabricate

The goal of sign-up is simple: get you into a workflow where you can ask for data and get it, safely.

  1. Request access from Tonic:

    • Go to tonic.ai and select Book a demo or Start free.
    • Indicate that you’re interested in Tonic Fabricate for synthetic data generation from scratch.
    • A Tonic team member will help you set up the right account tier (cloud or self-hosted, depending on your requirements).
  2. Create your workspace and user profile:

    • Once your account is provisioned, you’ll receive an invite email.
    • Set your password, configure SSO/SAML if you’re on an enterprise tier, and join or create your team workspace.
    • Confirm you have access to Fabricate in the product navigation.
  3. Choose your deployment and integration style:

    • Tonic Cloud if you want the fastest time-to-value and managed infrastructure.
    • Self-hosted if you’re in a locked-down, regulated environment and need everything inside your VPC.
    • Confirm API and SDK access if you plan to integrate dataset generation into CI/CD or automated test pipelines.

From here, you’re ready to go from “blank database” to “production-shaped synthetic data” using the Data Agent.


Step 2: Provide or design your relational schema

You can approach schema definition in Fabricate in two ways: bring your own schema or ask Fabricate to design a schema for you.

Option A: Import your existing schema

If your application already has a database design, reuse it. That’s how you keep application behavior realistic.

  1. Capture your schema (DDL):

    • Export CREATE TABLE statements from your source (e.g., PostgreSQL, MySQL, SQL Server, Oracle).
    • Ensure foreign keys, unique constraints, and indexes are present—Fabricate uses these to model relationships and enforce referential integrity.
  2. Provide schema to Fabricate:

    • In the Fabricate UI, choose Create dataset from schema (wording may vary).
    • Paste or upload your DDL, or use the Tonic API to send your schema programmatically.
    • Verify that tables, columns, primary keys, and foreign keys show up correctly in the schema view.
  3. Annotate with context (recommended):

    • Add hints like “this is PII,” “this is a join table,” or “this field is a categorical label.”
    • While Fabricate doesn’t need production data, these hints guide distributions and domain logic (e.g., realistic country codes, status enums).
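Before uploading, it can pay to pre-flight your DDL: a foreign key that references a table you forgot to export is exactly the kind of gap that degrades the generated data. Here is a rough regex-based sanity check (not a full SQL parser, and not part of Fabricate itself) that flags referenced tables with no matching CREATE TABLE:

```python
import re

def check_foreign_key_targets(ddl: str) -> list[str]:
    """Return tables that are referenced by a foreign key but never created.
    A rough regex sketch -- real DDL with quoted or schema-qualified
    identifiers would need a proper parser."""
    created = set(re.findall(r"CREATE TABLE\s+(\w+)", ddl, re.IGNORECASE))
    referenced = set(re.findall(r"REFERENCES\s+(\w+)", ddl, re.IGNORECASE))
    return sorted(referenced - created)

ddl = """
CREATE TABLE organizations (id SERIAL PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    org_id INTEGER REFERENCES organizations(id)
);
CREATE TABLE invoices (
    id SERIAL PRIMARY KEY,
    subscription_id INTEGER REFERENCES subscriptions(id)
);
"""
missing = check_foreign_key_targets(ddl)
print(missing)  # ['subscriptions'] -- referenced but never created
```

Catching a dangling reference like this before generation means Fabricate sees the complete relationship graph, not a fragment of it.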

Option B: Ask Fabricate to design the schema

If you’re prototyping or you don’t have a finalized schema, let the Data Agent help.

  1. Open the Data Agent chat:

    • In Fabricate, start a new Fabrication via the Data Agent.
    • Describe your domain in natural language, for example:

      “Design a relational database for a B2B SaaS app with organizations, users, subscriptions, and invoices. Include fields for billing address, plan tier, and payment status.”

  2. Iterate on the generated schema:

    • The Data Agent proposes tables, columns, and relationships.
    • Ask it to adjust:
      • “Add a usage_events table keyed by user_id.”
      • “Enforce that each subscription belongs to an organization, not a user.”
    • Lock in the schema once it matches your system design.
  3. Set constraints and edge cases:

    • Define cardinalities (e.g., many users per org, optional billing profiles).
    • Call out high-value test scenarios (e.g., “accounts with unpaid invoices older than 90 days,” “users without MFA enabled”).

Whether you import or design the schema, you’re setting the blueprint Fabricate will use to generate consistent data across tables.


Step 3: Generate a relational dataset from your schema

With the schema in place, you now let Fabricate do the heavy lifting: fill that schema with synthetic data that behaves like production.

Describe the dataset you want

The Data Agent is prompt-driven and schema-aware. Typical prompts look like:

  • “Generate a customer database with 100K users, 5K organizations, and 3M usage events. Roughly 10% of organizations should be on a premium plan.”
  • “Populate all tables with realistic values, including country codes, emails, and product names. Ensure 15% of invoices are past due.”
  • “Create 5% of users as ‘power users’ with significantly higher event volumes.”

Behind the scenes, Fabricate:

  • Respects your keys and foreign key relationships to maintain referential integrity.
  • Uses foundation models with domain knowledge to create hyper-realistic values (emails, names, addresses, SKUs) without pulling from your production data.
  • Generates statistical patterns (e.g., long-tail usage, churn rates) that mirror real-world systems, so your tests hit realistic edge cases.
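To see what a prompt like "power-law activity" is asking for, here is a stdlib sketch of the target shape: per-user event counts drawn from a heavy-tailed Pareto distribution, where a small slice of power users produces a disproportionate share of all events. This is just an illustration of the statistical pattern, not Fabricate's internal sampler.

```python
import random

random.seed(42)  # deterministic for the example

def sample_event_counts(n_users: int, alpha: float = 1.2) -> list[int]:
    """Draw per-user event counts from a heavy-tailed Pareto distribution."""
    return [int(random.paretovariate(alpha)) for _ in range(n_users)]

counts = sorted(sample_event_counts(10_000), reverse=True)
top_1pct = sum(counts[:100])   # events from the 100 most active users
total = sum(counts)
print(f"top 1% of users produce {top_1pct / total:.0%} of all events")
```

Test suites run against data shaped like this exercise very different code paths (pagination, caching, hot-partition handling) than uniform mock data ever would.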

Validate and refine

You don’t have to accept the first generation. Iteration is part of the workflow.

  1. Inspect distributions:

    • Check row counts per table, null rates, and value ranges.
    • Ask Fabricate: “Show me a summary of churned vs active users,” or “Increase the proportion of free-tier users to 60%.”
  2. Tune constraints and scenarios:

    • Enforce cross-table logic like “no invoice without a subscription,” or “users can only belong to one primary organization.”
    • Ask for adversarial data for robustness testing:
      • “Generate a subset where addresses are malformed to test validation.”
      • “Include some records with very large numbers and boundary dates.”
  3. Regenerate selectively:

    • Regenerate only specific tables or ranges when you tweak a scenario.
    • Keep the rest of the dataset stable for regression tests.
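One lightweight way to run these checks yourself is to load an export into SQLite and query it: an anti-join finds orphaned rows, and simple aggregates verify distributions. The table names below match the example schema used throughout this article, not any fixed Fabricate output.

```python
import sqlite3

# Stand-in for a generated export, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE invoices (id INTEGER PRIMARY KEY, subscription_id INTEGER, past_due INTEGER);
INSERT INTO subscriptions VALUES (1, 'active'), (2, 'churned');
INSERT INTO invoices VALUES (10, 1, 0), (11, 2, 1), (12, 99, 0);
""")

# "No invoice without a subscription": the anti-join surfaces orphans.
orphans = conn.execute("""
    SELECT i.id FROM invoices i
    LEFT JOIN subscriptions s ON s.id = i.subscription_id
    WHERE s.id IS NULL
""").fetchall()

# Distribution check: what share of invoices is past due?
past_due_ratio = conn.execute("SELECT AVG(past_due) FROM invoices").fetchone()[0]

print(orphans)          # [(12,)] -- invoice 12 points at a missing subscription
print(past_due_ratio)
```

If a check fails, that is your cue to tighten the constraint in Fabricate and regenerate just the affected table.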

The outcome is a relational dataset that both obeys your schema and stresses your system realistically—without ever touching live customer records.


Exporting your relational dataset

Once you’re satisfied, you need to get the data where it matters: your dev, staging, QA, or demo environments.

Tonic Fabricate supports multiple export targets, including:

  • Relational databases:

    • PostgreSQL
    • MySQL
    • SQL Server
    • Oracle
  • Flat and document formats (for mixed workflows):

    • SQL (INSERT/CREATE scripts)
    • CSV (per table)
    • PDF
    • DOCX
    • EML
    • PPTX

Typical workflows:

  • Export SQL files and run them in your infrastructure-as-code pipeline to hydrate ephemeral test databases.
  • Push directly into a managed database instance (e.g., Postgres on RDS) for a shared staging environment.
  • Generate additional unstructured artifacts (PDF invoices, DOCX reports, email bodies) that match the relational data for end-to-end testing and demos.
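The first workflow, hydrating an ephemeral test database from an exported SQL script, can be sketched in a few lines. Here the "export" is an inline string and the target is SQLite to keep the example self-contained; in a real pipeline you would read the file Fabricate produced and point `psql` or `mysql` at your test instance instead.

```python
import sqlite3

# Stand-in for the SQL script Fabricate exported.
EXPORTED_SQL = """
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
INSERT INTO users (id, email) VALUES (1, 'ada@example.com');
INSERT INTO users (id, email) VALUES (2, 'grace@example.com');
"""

def hydrate(db_path: str, sql_script: str) -> int:
    """Create a throwaway database, run the export, return the row count."""
    conn = sqlite3.connect(db_path)
    conn.executescript(sql_script)
    (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    conn.close()
    return count

rows = hydrate(":memory:", EXPORTED_SQL)
print(rows)  # 2
```

Because the script is idempotent from a fresh database, the same export can hydrate every CI run identically, which keeps regression tests stable.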

Features & Benefits Breakdown

| Core Feature | What It Does | Primary Benefit |
| --- | --- | --- |
| Data Agent for schema-aware prompts | Lets you describe datasets in natural language and maps those prompts onto your schema | Faster from idea to dataset—no manual scripting or hand-crafted mocks |
| Relational integrity preservation | Generates data that honors primary keys, foreign keys, and constraints | Applications behave correctly; joins work; fewer “test-only” bugs |
| Domain-aware synthetic generation | Uses foundation models with deep domain knowledge to create realistic, industry-specific values | More meaningful tests and demos that mirror how users, accounts, and transactions look in production |

Ideal Use Cases

  • Best for hydrating new dev/staging environments: Because it can generate fully relational synthetic databases that drop into your existing schema, so engineers can spin up realistic environments without waiting on sanitized production dumps.
  • Best for product demos and sales sandboxes: Because you can describe bespoke scenarios (e.g., “large enterprise customers with complex billing histories”) and Fabricate will generate data that makes your product look like it’s running on live, complex customer workloads—without exposing real accounts.

Limitations & Considerations

  • Not a direct production anonymizer:
    Fabricate is designed for from-scratch synthetic generation via the Data Agent. If you need to transform and de-identify existing production data while preserving distributions and relationships, you’ll want to pair Fabricate with Tonic Structural.

  • Schema quality matters:
    If your schema is incomplete or missing key relationships, the generated data can’t fully reflect real-world behavior. Invest a little time in defining foreign keys, constraints, and meaningful enums—Fabricate will reward that with higher-fidelity synthetic datasets.


Pricing & Plans

Tonic offers multiple ways to get started with Fabricate, depending on your scale and deployment model. While specifics evolve, the structure typically looks like:

  • Team / Project Plan: Best for smaller engineering teams or specific projects needing realistic synthetic data for a few services or environments. Emphasis on quick cloud access, UI-driven workflows, and basic API integration.
  • Enterprise Plan: Best for larger organizations needing Fabricate alongside Structural and Textual, with self-hosted deployment options, SSO/SAML, governance features, and support for regulated environments (SOC 2 Type II, HIPAA, GDPR-aligned workflows).

For the latest plan details and to match pricing to your volume and deployment needs, it’s best to talk directly with the Tonic team.


Frequently Asked Questions

Do I need real production data to use Tonic Fabricate?

Short Answer: No. Fabricate is specifically built to generate synthetic data from scratch without connecting to production.

Details:
Tonic Fabricate’s Data Agent uses schema information and your natural language prompts to infer what realistic data should look like in your domain. It does not require access to, or sampling from, your production databases. If you want to preserve real-world distributions and relationships from an existing production system, you’d typically use Tonic Structural to transform those datasets safely. Fabricate is ideal when you either can’t use production at all, or you’re designing new systems and want realistic data from day one.


Can Fabricate generate both relational and unstructured data together?

Short Answer: Yes. Fabricate supports relational databases and unstructured formats like PDFs and DOCX in the same workflow.

Details:
You can start from a relational schema, generate synthetic records, and then ask the Data Agent to create corresponding unstructured artifacts. For example, you might generate an invoices table and then prompt Fabricate to produce PDF invoices that match those rows, or DOCX contracts tied to customers. Fabricate supports outputs such as PDF, DOCX, EML, and PPTX, letting you build end-to-end scenarios where structured and unstructured data line up for full workflow testing and demos.


Summary

Signing up for Tonic Fabricate and generating a relational dataset from a schema is a straight path: get access, define or import your schema, then let the Data Agent synthesize a fully relational, production-shaped dataset that respects your keys and constraints. Instead of risky production clones or brittle hand-made mocks, you get hyper-realistic synthetic data that developers can trust—and that compliance teams don’t have to worry about.

You preserve speed and safety at the same time: faster environment hydration, richer test coverage, more compelling demos, and zero exposure of live customer identities.


Next Step

Get Started