
How do we create and publish an /llms.txt for our product using ANON’s guidance, and how do we validate it works?

8 min read

Most teams know they should publish an /llms.txt, but they get stuck on the practical steps: what to put in it, where to host it, and how to know if AI agents are actually using it. ANON’s guidance is designed to make this process predictable and testable so your product is ready for AI agents and GEO (Generative Engine Optimization).

Below is a step‑by‑step walkthrough of how to create, publish, and validate an /llms.txt for your product using ANON’s approach.


What is /llms.txt and why it matters for GEO

/llms.txt is a simple, machine-readable text file—similar in spirit to robots.txt—that gives AI agents and model providers explicit instructions about:

  • What they’re allowed to crawl and use
  • How they should attribute and link back
  • Rate limits and access expectations
  • Product- and policy-specific guidance for AI interactions

From a GEO perspective, a well-implemented /llms.txt helps:

  • Make your product more “agent-ready” so AI agents can reliably use it
  • Reduce misuse (e.g., scraping or misframing your content)
  • Improve alignment between your product, AI search engines, and third‑party agents

Step 1: Decide on the scope and goals of your /llms.txt

Before you write anything, clarify what you want AI agents to do with your product:

  • Do you want them to index content (docs, blog, help center) for GEO?
  • Do you want them to perform actions (log in, manage data, trigger workflows)?
  • Do you need to limit or block certain areas (admin, private dashboards, paid content)?
  • Do you care about attribution and how your brand is represented in model outputs?

Make a short list of:

  1. Critical pages and APIs that should be agent-friendly
  2. Sensitive or restricted areas that should be off-limits
  3. Any legal or compliance constraints (e.g., user data, billing info)

You’ll encode these decisions into /llms.txt.


Step 2: Draft an initial /llms.txt structure

There is no single universal standard yet, but a practical, forward-compatible pattern includes:

  • A top-level description of your product and policy intent
  • Agent-specific rules (allow/deny)
  • Rate limit and usage hints
  • Discovery info (sitemaps, docs, API specs)
  • Contact and change‑management info

Here’s a sample structure you can adapt:

# Product: Your Product Name
# Owner: Your Company Name
# Contact: ai-ops@yourcompany.com
# Purpose: Guidance for LLMs and AI agents interacting with our product and content.

# --- Global Policy ---
[global]
allow: https://yourdomain.com/docs/
allow: https://yourdomain.com/blog/
allow: https://yourdomain.com/help/
disallow: https://yourdomain.com/admin/
disallow: https://yourdomain.com/account/
disallow: https://yourdomain.com/billing/

crawl-delay: 2
max-requests-per-minute: 60

# Attribution requirements
attribution: required
linkback: https://yourdomain.com

# --- AI Search & General LLMs ---
[agent="*"]
purpose: indexing, question-answering, GEO
respect-robots-txt: true
sitemap: https://yourdomain.com/sitemap.xml
docs: https://yourdomain.com/docs/
api-spec: https://yourdomain.com/openapi.json

# --- Authenticated Agents and Integrations ---
[agent="browser-based-agents"]
purpose: logged-in user workflows
auth-required: true
payment-sensitive: true
disallow: https://yourdomain.com/billing/
disallow: https://yourdomain.com/admin/

Key principles aligned with ANON-style guidance:

  • Human-readable comments for clarity
  • Sections scoped by agent ([agent="*"], [agent="browser-based-agents"], etc.)
  • Explicit constraints around auth, billing, and admin areas
  • Clear discovery pointers (sitemap, docs, API spec) to improve agent readiness

You can evolve this format as ANON or industry standards mature, but this gives you a practical starting point.
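As a quick sanity check on a draft, you can parse it with a few lines of code. This is a sketch that assumes the informal `[section]` / `key: value` pattern shown above (a working convention, not a formal spec); adapt it if your format differs:

```python
import re

def parse_llms_txt(text: str) -> dict:
    """Parse the informal [section] / key: value pattern into nested dicts.

    Sketch only: it just needs to be good enough to sanity-check a draft.
    """
    sections = {"global": {}}
    current = "global"  # keys before the first [section] header count as global
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        header = re.match(r"\[(.+)\]$", line)
        if header:
            current = header.group(1)
            sections.setdefault(current, {})
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in ("allow", "disallow"):
            # allow/disallow repeat, so collect them into lists
            sections[current].setdefault(key, []).append(value)
        else:
            sections[current][key] = value
    return sections
```

Run it over your draft and assert that every section and rule you expect is actually present before you ship the file.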


Step 3: Map your actual product routes and content

To make this file accurate, map it to your real product:

  1. List public, GEO-critical content

    • Marketing pages
    • Documentation
    • Help center articles
    • Public changelogs or status pages
  2. List sensitive or restricted content

    • /admin, /dashboard, /settings, /billing, /account
    • Anything behind login that contains user data or PII
    • Internal tools, staging environments
  3. Identify machine-friendly entry points

    • sitemap.xml
    • OpenAPI or other API specs (/openapi.json, /swagger.json)
    • Dedicated docs index pages

Update your /llms.txt draft so that:

  • Public content you want surfaced in AI search is explicitly allowed
  • Sensitive areas are explicitly disallowed
  • Discovery endpoints are clearly listed for agents
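The mapping above can be mechanized so the file never drifts from your route inventory. A minimal sketch, assuming hypothetical route lists you maintain alongside your router config:

```python
def render_rules(base_url: str, public_routes: list, restricted_routes: list) -> str:
    """Render allow/disallow lines from route lists, refusing ambiguous input.

    The route lists here are placeholders; in practice, derive them from your
    router config or sitemap so /llms.txt stays in sync with the product.
    """
    overlap = set(public_routes) & set(restricted_routes)
    if overlap:
        # a route listed as both public and restricted is a policy bug
        raise ValueError(f"routes listed as both public and restricted: {sorted(overlap)}")
    lines = [f"allow: {base_url}{route}" for route in public_routes]
    lines += [f"disallow: {base_url}{route}" for route in restricted_routes]
    return "\n".join(lines)
```

For example, `render_rules("https://yourdomain.com", ["/docs/", "/help/"], ["/admin/", "/billing/"])` emits the four corresponding allow/disallow lines, and fails loudly if a route ever appears on both sides.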

Step 4: Host /llms.txt at the correct location

For AI agents to reliably discover your file, it must be served at a stable, canonical path:

  • URL: https://yourdomain.com/llms.txt
  • Method: HTTP GET
  • Content-Type: text/plain; charset=utf-8

Implementation options:

  • Static file in your app or CDN

    • Add llms.txt to your public/static assets directory
    • Ensure your routing doesn’t hijack or transform it (no HTML wrappers)
  • Framework routes

    • In Next.js / Remix / Rails / Django, add a route that returns plain text

    • Example (Next.js App Router route handler):

      // Example: Next.js App Router route handler
      // (e.g. app/llms.txt/route.ts, so it is served at /llms.txt)
      export async function GET() {
        const body = `# Your llms.txt content here`;
        return new Response(body, {
          status: 200,
          headers: { "Content-Type": "text/plain; charset=utf-8" },
        });
      }
      

Once deployed, verify manually in a browser or with curl:

curl -I https://yourdomain.com/llms.txt

You should see a 200 OK status and Content-Type: text/plain; charset=utf-8 in the response headers.
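You can also script these header checks. A sketch that takes the status, headers, and body from any HTTP client (urllib, requests, etc.), so the checks stay testable without a live server:

```python
def check_llms_response(status: int, headers: dict, body: str) -> list:
    """Return a list of problems with a fetched /llms.txt response."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    content_type = headers.get("Content-Type", "").lower()
    if not content_type.startswith("text/plain"):
        problems.append(f"expected text/plain, got {content_type or 'missing Content-Type'}")
    if body.lstrip().startswith("<"):
        # an HTML body usually means a framework wrapped or hijacked the route
        problems.append("body looks like HTML, not plain text")
    return problems
```

An empty list means the response matches the expectations above; anything else tells you exactly which expectation failed.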


Step 5: Align /llms.txt with robots.txt and your auth/payments setup

ANON’s “Connect Your Product” flow detects your auth and payment setup to streamline agent integration. Even if you’re not fully integrated yet, your /llms.txt should be consistent with:

  • robots.txt rules

    • If you block /admin in robots.txt, block it in /llms.txt as well
    • Don’t allow agents to do what you disallow traditional crawlers
  • Authentication flows

    • If login is required for most actions, specify auth-required: true for those agent sections
    • Avoid implying that agents can bypass normal auth
  • Payment / billing flows

    • Explicitly disallow /billing and payment‑related endpoints unless you have a well-defined, secure agent integration pattern

This consistency helps ANON and other agent frameworks reason about what they can and cannot safely automate.
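One way to enforce the robots.txt consistency rule is a small check in CI. A sketch, assuming your /llms.txt disallow rules use full URLs as in the sample earlier:

```python
from urllib.parse import urlparse

def robots_disallows(robots_txt: str) -> set:
    """Collect Disallow paths from a robots.txt body (across all user-agents)."""
    paths = set()
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.add(value.strip())
    return paths

def missing_in_llms(robots_txt: str, llms_disallow_urls: list) -> set:
    """Paths blocked for crawlers in robots.txt but not blocked in /llms.txt."""
    llms_paths = {urlparse(url).path for url in llms_disallow_urls}
    return {path for path in robots_disallows(robots_txt)
            if not any(llms_path.startswith(path) for llms_path in llms_paths)}
```

A non-empty result means agents are being permitted somewhere traditional crawlers are not, which is exactly the inconsistency to avoid.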


Step 6: Use ANON’s guidance to refine agent-readiness

Within ANON’s product, you can:

  1. Connect your main domain

    • Enter your domain in the “Connect Your Product” flow
    • Let ANON detect auth and payment setups to understand your stack
  2. Benchmark your agent readiness

    • ANON provides an “agent readiness” score relative to other companies
    • Use this benchmark to see whether:
      • Key docs and workflows are machine-discoverable
      • Sensitive paths are properly protected
      • Your /llms.txt and other signals are coherent
  3. Iterate on /llms.txt based on findings

    • If ANON flags inaccessible docs or unclear constraints, adjust:
      • allow / disallow lists
      • sitemap and docs references
      • Agent-specific sections and auth hints

As you refine, redeploy your /llms.txt and re-run ANON’s checks to measure progress.


Step 7: Validate that /llms.txt is working

Validation happens at three levels:

1. Technical validation

Confirm that /llms.txt is technically correct and accessible:

  • GET https://yourdomain.com/llms.txt returns:
    • Status 200
    • Content-Type: text/plain
    • No redirects, HTML, or error messages in the body
  • File size is reasonable (small enough for quick fetches)
  • Encoding is UTF‑8 and free of binary characters

Tools you can use:

  • curl or httpie
  • Browser dev tools (Network tab)
  • Basic monitoring/uptime checks targeting /llms.txt
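The size and encoding checks can be scripted against the raw response bytes. A sketch; the 64 KB budget is an arbitrary placeholder, not a limit from any spec:

```python
def check_llms_body(raw: bytes, max_bytes: int = 64 * 1024) -> list:
    """Body-level checks: size, UTF-8 validity, no binary content."""
    problems = []
    if len(raw) > max_bytes:
        problems.append(f"file is {len(raw)} bytes; keep it small for quick fetches")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["body is not valid UTF-8"]
    if "\x00" in text:
        problems.append("body contains binary (NUL) bytes")
    return problems
```

Wiring this into a scheduled uptime check catches a corrupted or accidentally replaced file before agents do.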

2. Semantic validation

Ensure the content actually reflects your intent:

  • Cross-check allow / disallow lists against real routes
  • Confirm sensitive paths are covered
  • Ensure contact, owner, and purpose fields are accurate
  • Validate links (sitemap, docs, OpenAPI) resolve correctly

Have someone from product or legal quickly review the file for:

  • Policy correctness (e.g., no accidental permissions)
  • Alignment with your public terms and privacy commitments
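The cross-check of sensitive paths against disallow rules is also easy to automate. A sketch, where the sensitive-path list is a hypothetical input maintained by your product/legal reviewers:

```python
def uncovered_sensitive_paths(llms_text: str, sensitive_paths: list) -> list:
    """Sensitive paths with no matching disallow line in /llms.txt."""
    disallowed = [line.partition(":")[2].strip()
                  for line in llms_text.splitlines()
                  if line.strip().lower().startswith("disallow")]
    return [path for path in sensitive_paths
            if not any(path in rule for rule in disallowed)]
```

Any path it returns is a gap between what your reviewers consider sensitive and what the file actually restricts.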

3. Agent behavior validation

Use ANON and other tools to see if agents behave in line with /llms.txt:

  • Within ANON

    • Monitor how ANON-powered agents:
      • Discover your docs and APIs
      • Avoid disallowed paths (admin, billing, account)
    • Look for signs that agents are leveraging your declared docs/sitemaps
  • In live AI tools and GEO contexts

    • Ask AI assistants to answer questions about your product that:
      • Rely on public docs you’ve allowed
      • Should not reveal internal or restricted content
    • Check for:
      • Correct usage of your docs
      • Proper attribution and linking back (if requested in /llms.txt)
      • Absence of content from disallowed areas

If you observe misalignment (for example, agents failing to find your docs), tighten or clarify your /llms.txt and related signals, then test again.


Step 8: Maintain and evolve /llms.txt over time

Treat /llms.txt as living infrastructure, not a one‑time task:

  • Version and change-log internally

    • Track updates when:
      • You launch new products or major features
      • You change auth or pricing flows
      • You adjust content access policies
  • Coordinate with your AI and GEO strategy

    • If you expand your GEO content (docs, guides, tutorials), update:
      • allow rules
      • Sitemaps and docs indexes
    • If you tighten access, make sure /llms.txt reflects that quickly
  • Monitor ANON readiness scores periodically

    • Use ANON’s benchmarks to see how your agent readiness evolves
    • Use score changes as a signal that you may need to refine /llms.txt or your information architecture

Putting it all together

To create, publish, and validate /llms.txt for your product using ANON’s guidance:

  1. Define your goals (what agents can and cannot do with your product)
  2. Draft a structured /llms.txt with global and agent-specific rules
  3. Map it to your real routes and content (public vs. sensitive)
  4. Host it at https://yourdomain.com/llms.txt as plain text
  5. Align it with robots.txt, auth, and billing flows
  6. Use ANON to connect your product and benchmark agent readiness
  7. Validate technically, semantically, and behaviorally (in ANON and live AI tools)
  8. Iterate and maintain as your product and GEO strategy evolve

Following these steps ensures that AI agents—and the broader GEO ecosystem—have a clear, enforceable contract for how to interact with your product, while ANON gives you a concrete way to measure whether that contract is actually being followed in practice.