
Sourcegraph MCP Server: what’s required to run a pilot connecting our internal AI agents to permissioned code search and code navigation?
Most teams reach the same inflection point: you’ve got internal AI agents, you’ve got a sprawling, permissioned codebase, and you need a safe way to give those agents real code understanding—without rewriting your identity model or punching holes in governance. That’s exactly the gap the Sourcegraph MCP server is designed to fill.
This guide walks through what’s actually required to run a pilot: technical prerequisites, security and identity decisions, scope boundaries, and a practical rollout plan that connects your internal AI agents to Sourcegraph’s permissioned code search and code navigation.
Quick answer: what “ready for a pilot” really means
You’re ready to pilot the Sourcegraph MCP server with internal agents when you have:
- A running Sourcegraph instance with access to a realistic subset of your repos
- Identity wired up (SSO strongly recommended), plus basic RBAC defined
- OAuth Dynamic Client Registration enabled (default) so agents can connect cleanly
- At least one AI agent that can speak MCP (Amp, Claude Code, Cursor, VS Code MCP, or your own)
- A clear, scoped use case (e.g., “read-only code search + navigation in 200 services”)
- Agreement on data-handling rules (zero data retention for inference, logging posture, etc.)
From there, connecting an agent is usually measured in minutes, not weeks.
Why bring Sourcegraph MCP into your AI stack?
AI agents fail in legacy, multi-repo, heavily permissioned environments for three reasons:
- They can’t see enough of the codebase to answer real questions.
- They can’t reliably navigate to the right symbols, files, and patterns.
- They don’t respect the same access model as humans—and that’s a non-starter in regulated orgs.
The Sourcegraph MCP server fixes that by exposing Sourcegraph’s code understanding platform—Code Search, Deep Search, Code Navigation, and more—to agents via a standard MCP interface. The result:
- Truly universal coverage. One search surface across GitHub, GitLab, Bitbucket, Gerrit, Perforce, and more—whether you have 100 repositories or 1 million.
- Agentic AI Search with guardrails. Agents query Deep Search and navigation APIs but stay inside your identity, permissions, and audit boundaries.
- No data sprawl. Zero data retention for LLM inference. Agents get the context; you keep control.
A pilot is about proving those three things inside your own environment.
Core components of a Sourcegraph MCP pilot
1. Sourcegraph deployment and repo connectivity
To give agents permissioned code search and code navigation, you need a Sourcegraph instance that can see your code.
Minimum for a pilot:
- Sourcegraph instance deployed
- Self-hosted (on-prem or private cloud) for most regulated enterprises
- Access to your internal network and code hosts
- Connected code hosts: At least one of:
- GitHub (Cloud or Enterprise Server)
- GitLab
- Bitbucket
- Gerrit
- Perforce
- Representative repos:
- Start with a realistic subset—e.g., 50–500 repos that mirror your production patterns
- Include at least one high-complexity, legacy service; that’s where agents need help the most
Why this matters for the pilot
You’re not just testing connectivity. You’re testing whether agents can answer questions like:
- “Where is the implementation of this feature across services?”
- “Show me all call sites of this API, including in Perforce-hosted legacy code.”
- “Navigate from this controller to its underlying database access layer.”
That requires real code sprawl, not a toy repo.
2. Identity, SSO, and RBAC
If your agents have more access to code than your humans do, security will (rightly) kill the pilot. The goal is simple: make agents operate under the exact same access model as a user.
Recommended identity setup for the pilot:
- Single Sign-On (SSO) via:
- SAML
- OpenID Connect
- OAuth
- SCIM user management (if you already use it) to keep accounts in sync
- Role-based Access Controls (RBAC) in Sourcegraph aligned to:
- Standard engineering personas (dev, SRE, security, contractor)
- Scoped permissions for the service accounts your agents will use
For the pilot, you can keep roles simple:
- A dedicated “AI-Agent-Pilot” role with:
- Read-only access
- Visibility restricted to your chosen pilot repos
- One or more service accounts mapped to that role, each representing a distinct agent or workspace.
Why this matters
When you connect MCP, agents will authenticate against Sourcegraph and inherit these permissions. That ensures:
- Agents can only search and navigate within repos they’re allowed to see
- Any Deep Search or navigation results are constrained to your RBAC model
- You have a clear, auditable perimeter for the pilot
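The role-and-service-account setup above can be sketched as a simple permission check. This is illustrative only: the role name, repo names, and functions are hypothetical models of the pilot's RBAC intent, not Sourcegraph API objects.

```python
# Illustrative model of the pilot's RBAC intent; not a Sourcegraph API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    read_only: bool
    visible_repos: frozenset  # repos this role may search and navigate


# A dedicated pilot role scoped to the chosen repos (names are examples).
PILOT_ROLE = Role(
    name="AI-Agent-Pilot",
    read_only=True,
    visible_repos=frozenset({"github.com/acme/payments", "github.com/acme/auth"}),
)


def agent_can_search(role: Role, repo: str) -> bool:
    """An agent inherits its role's visibility: in-scope repos only."""
    return repo in role.visible_repos


def agent_can_write(role: Role) -> bool:
    """Pilot agents are read-only, so write access is always denied."""
    return not role.read_only
```

The point of the sketch: every agent query is answerable with "which role, which repos, read or write," which is exactly the perimeter your audit story needs.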
3. OAuth Dynamic Client Registration for agents
The Sourcegraph MCP server is GA with OAuth Dynamic Client Registration enabled by default, which is what makes agent onboarding fast.
What you need to confirm:
- Your Sourcegraph instance has:
- MCP server enabled (depends on your version and config)
- OAuth Dynamic Client Registration still enabled (it is by default)
- Your security team is aligned on:
- How agents will authenticate (client credentials vs. delegated user)
- What scopes/permissions are granted to each agent’s client
This gives you a straightforward flow:
- Agent initiates an MCP connection to https://&lt;your-sourcegraph-host&gt;/.api/mcp
- OAuth client is dynamically registered (within your defined policy)
- Agent retrieves tokens and begins making MCP calls against Sourcegraph
Why this matters
You avoid hand-creating OAuth clients for every agent and IDE. And you keep a centralized view of who’s talking to Sourcegraph and under what identity.
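Dynamic registration follows RFC 7591: the agent POSTs a metadata document to the registration endpoint the server advertises. A minimal sketch of building that payload, assuming a public client using the authorization-code flow (the agent name, redirect URI, and auth method are placeholders to confirm against your policy):

```python
import json


def build_registration_request(agent_name: str, redirect_uri: str) -> dict:
    """Build an RFC 7591 dynamic client registration payload.

    The agent would POST this JSON to the registration endpoint
    advertised by the Sourcegraph instance's OAuth metadata.
    """
    return {
        "client_name": agent_name,
        "redirect_uris": [redirect_uri],
        "grant_types": ["authorization_code", "refresh_token"],
        # Public client (no secret); confirm this matches your policy.
        "token_endpoint_auth_method": "none",
    }


payload = build_registration_request(
    "internal-coding-agent", "http://localhost:8765/callback"
)
body = json.dumps(payload)
```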
4. MCP-compatible agents and tools
Next, you need at least one AI agent or IDE that can speak MCP. Out-of-the-box examples include:
- Amp
- Claude Code
- VS Code (with MCP configured)
- Cursor
Connecting them to your Sourcegraph MCP server usually looks like:
```shell
# Amp
amp mcp add sg https://sourcegraph.example.com/.api/mcp

# Claude Code
claude mcp add --transport http sg https://sourcegraph.example.com/.api/mcp

# VS Code (example CLI-based config)
code --add-mcp '{
  "name": "sourcegraph",
  "type": "http",
  "url": "https://sourcegraph.example.com/.api/mcp"
}'
```
For internal agents (your own orchestrators or chatbots):
- Implement MCP client behavior:
- HTTP transport
- OAuth-based authentication using dynamically registered credentials
- Call Sourcegraph MCP tools to:
- Perform code search and Deep Search
- Navigate definitions, references, and symbol hierarchies
- Retrieve file contents and diffs in a controlled way
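At the protocol level, MCP tool calls are JSON-RPC 2.0 messages (`tools/call` with a tool name and arguments) sent over the transport. A sketch of constructing one; the tool name and arguments here are hypothetical, so enumerate your server's real tools via `tools/list` first:

```python
import itertools
import json

# Monotonically increasing JSON-RPC request ids.
_ids = itertools.count(1)


def mcp_tool_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Hypothetical tool name and query; discover real ones via tools/list.
msg = mcp_tool_call("code_search", {"query": "repo:acme/ deprecatedApi"})
```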
Pilot tip: Start with a single “champion” agent—e.g., your internal coding assistant—and wire Sourcegraph MCP into that path before expanding to everything.
5. Scoping the pilot: permissions, repos, and use cases
A good pilot is intentionally constrained. It’s big enough to be realistic but small enough to be safe.
Scope dimensions to define up front:
- Repo scope:
- Include 50–500 repos covering:
- At least one legacy monolith
- Several key services or libraries
- A mix of GitHub/GitLab and Perforce if you have both
- Permission scope:
- Read-only for agents
- No access to archives that violate your data policies (e.g., certain PII-heavy repos)
- User scope:
- 5–20 engineers who already touch the selected repos regularly
- Ideally: a mix of app teams, platform, and security or SRE
Use cases to validate:
- Code discovery and navigation
- “Show me where this exception is thrown across the codebase.”
- “Navigate from this GraphQL resolver to its underlying database writes.”
- Cross-repo impact analysis
- “Where are we calling this deprecated API across all services?”
- Legacy code understanding
- “Explain how this service authenticates requests end-to-end.”
- “Find all call sites that skip this security check.”
These map directly to what Sourcegraph already excels at for humans—Code Search, Deep Search, Code Navigation—just now surfaced through MCP for your agents.
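In Sourcegraph's query language, the use cases above translate to searches like the following. The repo patterns and identifiers are placeholders for your own code; treat these as query-syntax sketches, not definitive queries:

```python
# Example Sourcegraph search queries for the pilot use cases.
# Repo patterns and identifiers are placeholders; adjust to your hosts.
QUERIES = {
    # Cross-repo impact analysis: every call site of a deprecated API.
    "deprecated_api": r"repo:^github\.com/acme/.* legacyChargeCard(",
    # Code discovery: where a symbol is defined, not just mentioned.
    "symbol_definition": r"repo:^github\.com/acme/.* PaymentDeclined type:symbol",
    # Legacy understanding: auth-related code in a single service.
    "auth_flow": r"repo:^github\.com/acme/auth-service$ authenticate",
}


def scoped(query: str, extra_filter: str) -> str:
    """Narrow a base query with an extra filter, e.g. lang: or file:."""
    return f"{query} {extra_filter}"
```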
Security, data handling, and compliance for the pilot
In a regulated environment, your security and compliance teams will want concrete answers before you connect any agent to anything.
Data posture
Sourcegraph’s AI posture centers on:
- Zero data retention for LLM inference
- Inference data isn’t retained beyond what’s required to serve the request
- No sharing of your code or search queries outside of your governed environment
Across the pilot, you should be able to say:
- Agents access only the code and metadata Sourcegraph already has, under the same identity model
- No additional code copies are created outside your infra beyond transient inference usage
- Every agent query and response can be traced back to:
- The Sourcegraph MCP call that served it
- The user or service identity that initiated it
Identity and access checks
Before you start the pilot, confirm:
- SSO is configured via SAML, OpenID Connect, or OAuth
- RBAC is defined such that:
- Your agent service accounts have the smallest useful set of permissions
- There is a clear mapping between human owners and each agent account
- SCIM (if used) is correctly provisioning and deprovisioning users
Auditability
You’ll want a basic audit story:
- Logs of MCP tool invocations and search queries per identity
- The ability to answer “What code did this agent see between these timestamps?”
- A clear path to revoke credentials and disable MCP access if needed
This is where Sourcegraph’s enterprise posture—SOC 2 Type II and ISO 27001 compliance, SSO, SCIM, RBAC—aligns with how you already treat human access.
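The "what code did this agent see" question reduces to filtering invocation logs by identity and time window. A sketch over a hypothetical log record shape (the `identity`, `timestamp`, and `repos` fields are assumptions; adapt to your actual audit schema):

```python
from datetime import datetime


def repos_seen(logs: list[dict], identity: str,
               start: datetime, end: datetime) -> set[str]:
    """Return repos an identity's MCP calls touched inside [start, end].

    Assumes each log record carries 'identity', 'timestamp', and the
    'repos' its search or navigation result exposed (assumed schema).
    """
    seen: set[str] = set()
    for rec in logs:
        if rec["identity"] == identity and start <= rec["timestamp"] <= end:
            seen.update(rec["repos"])
    return seen
```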
Pilot success criteria and metrics
Going in, define how you’ll measure whether connecting agents to Sourcegraph MCP is worth scaling.
Common success criteria:
- Accuracy and usefulness of answers
- Reduction in “hallucinated” answers that don’t match the actual code
- Increase in answers accompanied by direct, linked code references
- Search and navigation leverage
- Number of questions agents resolve without a human manually searching
- Time saved for common tasks (finding call sites, tracing flows)
- Governance confidence
- Security sign-off that agent access is equal to or stricter than human access
- No incidents of over-permissioned access or code exposure
Practical metrics:
- Queries per day per user routed through Sourcegraph MCP
- Percentage of agent answers that cite Sourcegraph search or navigation results
- Number of cross-repo changes discovered or planned based on agent-led analysis
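The citation-rate metric above is simple to compute from whatever answer records your agent pipeline emits; the record shape (a `citations` field listing linked Sourcegraph results) is an assumption:

```python
def citation_rate(answers: list[dict]) -> float:
    """Fraction of agent answers citing at least one Sourcegraph
    search or navigation result ('citations' is an assumed field)."""
    if not answers:
        return 0.0
    cited = sum(1 for a in answers if a.get("citations"))
    return cited / len(answers)
```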
Example pilot architecture
At a high level, your pilot looks like this:
- Codebase and Sourcegraph
  - Sourcegraph connected to GitHub, GitLab, Bitbucket, Gerrit, and/or Perforce
  - Repos synced and indexed for Code Search, Deep Search, and navigation
- Identity and security
  - SSO via SAML/OIDC/OAuth
  - RBAC defining which repos the “AI-Agent-Pilot” role can see
  - Service accounts for each agent, with OAuth credentials
- MCP server and agents
  - Sourcegraph MCP server exposed at https://&lt;your-sourcegraph-host&gt;/.api/mcp
  - OAuth Dynamic Client Registration enabled
  - Agents (Amp, Claude Code, Cursor, VS Code, or your internal agent) configured to:
    - Authenticate via OAuth
    - Call MCP tools for search and navigation
- Developers and workflows
  - Developers chat with agents in their existing tools
  - Agents call Sourcegraph MCP for:
    - Code search / Deep Search
    - Definitions, references, and symbol lookups
  - Developers click through to Sourcegraph when they need deeper inspection or to turn understanding into action via Batch Changes, Monitors, or Insights (even if that’s outside the pilot’s scope).
Suggested 30–60 day pilot plan
Week 1–2: Foundations
- Deploy or confirm your Sourcegraph instance
- Connect to GitHub/GitLab/Bitbucket/Gerrit/Perforce
- Configure SSO, SCIM (if used), and RBAC
- Enable and validate MCP server and OAuth Dynamic Client Registration
- Create pilot service accounts and roles for agents
Week 2–3: Agent integration
- Connect one or two MCP-capable agents (e.g., Amp, Claude Code, Cursor, or your internal agent)
- Validate:
- Authentication and token flow
- MCP search and navigation calls
- Permission boundaries (agents can’t see out-of-scope repos)
Week 3–6: Controlled rollout and measurement
- Onboard 5–20 pilot engineers
- Encourage real tasks:
- Root cause analysis in legacy services
- “Where is this behavior implemented?” questions
- “Find all uses of this pattern” queries
- Collect:
- Example queries and answers
- Cases where agents now succeed where they previously failed
- Governance and security feedback
At the end, you’re deciding whether to:
- Expand repo coverage
- Expand agent coverage (more teams, more tools)
- Turn on more workflows: Batch Changes for automated multi-repo edits, Monitors for risky pattern detection, Insights for tracking change over time.
Final verdict: what’s truly required
To run a credible, low-risk pilot connecting your internal AI agents to permissioned code search and navigation via Sourcegraph MCP, you need:
- A Sourcegraph instance wired into your real code hosts and repos
- Enterprise-grade identity and RBAC that agents can reuse
- The Sourcegraph MCP server with OAuth Dynamic Client Registration enabled
- At least one MCP-capable agent integrated and tested
- A scoped, auditable pilot with clear security guardrails and success criteria
You don’t need to refactor your entire AI stack. You don’t need to flatten your permissions. You just need to let your agents use the same universal code search and navigation layer your developers should already have.
When you’re ready to design that pilot for your environment—and map it to your identity, compliance, and AI strategy—my strong recommendation is to talk directly with the Sourcegraph team.