
Sourcegraph MCP Server: what’s required to run a pilot connecting our internal AI agents to permissioned code search and code navigation?
Most teams hit the same wall when they try to plug AI agents into real, permissioned code: the agent can’t see the whole codebase, can’t respect access controls, and can’t reliably point back to the exact files and symbols it used. That’s exactly what the Sourcegraph MCP server is designed to fix—by exposing Sourcegraph’s code search and code navigation to your internal agents over a governed, auditable interface.
This guide walks through what’s actually required to run a pilot: infrastructure, identity, data, and rollout steps. The goal is simple: get your first internal AI agents using permissioned code search and navigation via Sourcegraph MCP, without cutting corners on security or access control.
Quick answer: what you need in place
To run a credible pilot with the Sourcegraph MCP server and your internal AI agents, you’ll need:
- A running Sourcegraph instance (cloud or self-hosted) connected to your real code hosts
- The Sourcegraph MCP server enabled and reachable from your agent environment
- OAuth / SSO wired up (SAML, OpenID Connect, or OAuth), plus RBAC configured for least privilege
- A defined pilot scope: which orgs/teams, which repos, which agents, and which workflows
- Agent-side configuration to register Sourcegraph MCP as a tool (Amp, Claude Code, VS Code, Cursor, or your own agent)
- Governance basics: logging, audit expectations, and a plan for rollout and rollback
From there, the pilot is mainly about giving a handful of developers and agent owners a safe playground to test “Agentic AI Search” against real, permissioned code.
Why you need an MCP server for internal AI agents
When your codebase is spread across GitHub, GitLab, Bitbucket, Gerrit, and Perforce—and growing faster with AI—internal agents break for the same reasons humans struggle:
- They can’t search across all repositories and code hosts from one place
- They guess at file locations and API boundaries instead of using symbols and references
- They ignore or bypass the org’s access controls
- They can’t explain “why” an answer is correct with direct links back to the code that informed it
The Sourcegraph MCP server changes that by exposing:
- Universal Code Search across all your repositories and code hosts
- Deep Search (Agentic AI Search) that can return clear, grounded answers even in complex, legacy codebases
- Code Navigation—definitions, references, and symbol lookups—for both humans and agents
Crucially, it does this under the same identity and RBAC model you already enforce, with zero data retention for LLM inference. That’s the baseline you want before you let an agent touch live, permissioned code.
Core components of a Sourcegraph MCP pilot
1. Sourcegraph instance and connectivity
You can’t pilot the MCP server without a real Sourcegraph deployment behind it.
You’ll need:
- Sourcegraph deployed
  - Self-hosted (Kubernetes, Docker, or on-prem), or
  - Sourcegraph Cloud (for compatible use cases)
- Connected code hosts
  - At least one of: GitHub, GitLab, Bitbucket, Gerrit, Perforce
  - Ideally the same mix you use in production, even if the pilot uses a subset of repos
- Indexed repositories
  - The pilot scope (e.g., “payments org repos” or “core services + shared libraries”) should be cloned and indexed
  - Sourcegraph scales whether you have 100 or 1M repositories, so don’t be afraid to include real complexity
- Network reachability
  - Your agent environment (e.g., internal dev network, VPC, or container cluster) must be able to reach <your-sourcegraph>/.api/mcp over HTTP/HTTPS
  - If you’re self-hosted: ensure firewall rules, proxies, and TLS are in place
For most enterprises, the fastest path is: start with the same Sourcegraph instance already backing human developers (Code Search, Batch Changes, Monitors, Insights) and add MCP on top.
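To make the reachability requirement concrete, here is a minimal Python sketch of a probe you could run from inside the agent environment. It assumes only the `/.api/mcp` path named in this guide; the helper names (`mcp_url`, `is_reachable`) are made up for illustration, and the probe checks only that the network path, DNS, and TLS work, not that authentication succeeds.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit

MCP_PATH = "/.api/mcp"  # endpoint path named in this guide


def mcp_url(base_url: str) -> str:
    """Build the MCP endpoint URL from a Sourcegraph base URL."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) base URL, got {base_url!r}")
    return urlunsplit((parts.scheme, parts.netloc, MCP_PATH, "", ""))


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint sends back any HTTP response at all.

    An error status (for example, an auth challenge) still proves that
    routing, DNS, and TLS work; only connection-level failures count
    as unreachable.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, just not with a 2xx status
    except OSError:  # covers URLError, timeouts, refused connections
        return False
```

Running `is_reachable(mcp_url("https://sourcegraph.example.com"))` from the same network segment as your agents is a quick way to catch firewall or proxy gaps before anyone configures a client.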
2. Identity, SSO, and RBAC for agents
A pilot with internal AI agents is still a production security concern. The agent must not see more code than the human behind it.
Sourcegraph already supports:
- Single Sign-On via SAML, OpenID Connect, and OAuth
- SCIM for automated user provisioning and lifecycle
- Role-based Access Controls (RBAC) for fine-grained permissions
- SOC2 Type II + ISO27001 Compliance and Zero data retention for inference
For a pilot, you’ll want to decide:
- How the agent authenticates
  - As the human user (per-user tokens / OAuth flow)
  - As a dedicated “agent service account” with constrained permissions
- What the agent is allowed to see
  - A subset of repos or projects, controlled via RBAC
  - Only specific teams/orgs during the pilot (e.g., “Developer Productivity org” or “Backend Platform”)
This matters because the MCP server will honor Sourcegraph’s access model. If a user or agent token can’t see a repository in Sourcegraph, it won’t see it via MCP either.
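If you go with a dedicated service account, the rule “the agent must not see more than the human behind it” becomes something you have to check yourself. The sketch below models that check as plain set arithmetic. The `Principal` type and the repo names are hypothetical, and how you obtain each principal’s readable repos (for example, from your own RBAC export) is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """A user or agent service account, with the repos RBAC lets it read."""
    name: str
    readable_repos: frozenset[str]


def excess_access(agent: Principal, human: Principal) -> frozenset[str]:
    """Repos the agent can read that the human behind it cannot."""
    return agent.readable_repos - human.readable_repos


# Hypothetical example data, not real repositories.
alice = Principal("alice", frozenset({"payments/api", "payments/ledger"}))
bot = Principal("pilot-agent", frozenset({"payments/api", "identity/core"}))

print(sorted(excess_access(bot, alice)))  # ['identity/core']
```

With per-user auth the result is empty by construction, because the agent acts with the user’s own permissions. With a service account, a non-empty result is a finding to resolve before the pilot starts.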
3. Sourcegraph MCP server configuration
The Sourcegraph MCP server is now generally available, with OAuth Dynamic Client Registration enabled by default, which makes the agent side much simpler.
At a high level, you:
- Enable and expose the MCP endpoint from your Sourcegraph instance:
  - Base URL: https://sourcegraph.example.com/.api/mcp
- Confirm OAuth settings:
  - Use your existing SSO provider (SAML/OIDC/OAuth) to back the OAuth flow
  - Rely on dynamic client registration to avoid manual client provisioning when possible
- Ensure logs and metrics:
  - Log MCP access and usage, just like you would for any external API
  - Confirm you can see which user or service account is calling MCP and what scopes they use
Once that’s in place, connecting an agent is a single, copy-paste command.
4. Agent-side setup: connecting to Sourcegraph MCP
The Sourcegraph MCP server is designed to be quick to wire into common tools and custom agents.
Typical connection commands look like:
# Amp
amp mcp add sg https://sourcegraph.example.com/.api/mcp
# Claude Code
claude mcp add --transport http sg https://sourcegraph.example.com/.api/mcp
# VS Code
code --add-mcp "{ \"name\": \"sourcegraph\", \"type\": \"http\", \"url\": \"https://sourcegraph.example.com/.api/mcp\" }"
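The hand-escaped JSON in the VS Code command is easy to get wrong when you hand it to pilot users. One option is to generate the command instead. The Python sketch below does that with the same `name`, `type`, and `url` fields shown above; the function name is invented for illustration, and the output uses POSIX shell quoting, so Windows shells would need different handling.

```python
import json
import shlex


def vscode_add_mcp_command(name: str, url: str) -> str:
    """Build a `code --add-mcp` command with a safely quoted JSON payload."""
    payload = json.dumps({"name": name, "type": "http", "url": url})
    # shlex.quote wraps the JSON in single quotes for POSIX shells,
    # so the inner double quotes need no backslash escaping.
    return f"code --add-mcp {shlex.quote(payload)}"


print(vscode_add_mcp_command(
    "sourcegraph", "https://sourcegraph.example.com/.api/mcp"
))
```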
For custom in-house agents, you:
- Implement an MCP client (or reuse an existing MCP library)
- Register the Sourcegraph MCP server endpoint and OAuth configuration
- Define the tools/prompts that instruct your agent when to:
  - Run a global search
  - Ask Deep Search for a structured answer
  - Fetch definitions/references for a symbol
The important part: the agent should treat Sourcegraph as its ground truth for code context—not as a nice-to-have.
5. Pilot scope: code, teams, and workflows
A pilot that’s “everything, everywhere” is hard to measure and easy to derail. Define a crisp slice of reality where code is messy enough to be meaningful, but bounded enough to manage.
Recommended constraints:
- Code scope:
  - 10–200 repositories that matter to one org or business capability (e.g., “billing stack,” “identity platform,” or “mobile + shared services”)
  - Include at least one legacy or monolithic system so you see how Deep Search behaves at real complexity
- Teams / users:
  - 5–25 pilot engineers and at least one agent owner (platform or DevEx engineer)
  - A mix of senior engineers and newer folks who regularly need to spelunk unfamiliar code
- Workflows to test:
  - “Answer questions about legacy code” with Deep Search
  - “Find and understand usages of a function, class, or API” with Code Navigation
  - “Locate patterns or anti-patterns before refactoring” with Code Search
  - Optionally, feed these results into Batch Changes or internal automation for follow-up edits
Write these down as explicit evaluation criteria before the pilot starts.
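One way to write the scope down is as a small, reviewable data structure that flags anything outside the recommended constraints. The Python sketch below encodes the ranges from this section (10–200 repos, 5–25 engineers, at least one agent owner, at least one legacy system). The `PilotScope` type and its field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PilotScope:
    repos: list[str]
    legacy_repos: list[str]   # subset of repos that are legacy/monolithic
    engineers: list[str]
    agent_owners: list[str]
    workflows: list[str]

    def problems(self) -> list[str]:
        """Return human-readable deviations from the recommended scope."""
        issues = []
        if not 10 <= len(self.repos) <= 200:
            issues.append(f"{len(self.repos)} repos; recommended 10-200")
        if not set(self.legacy_repos) & set(self.repos):
            issues.append("no legacy or monolithic system in scope")
        if not 5 <= len(self.engineers) <= 25:
            issues.append(f"{len(self.engineers)} engineers; recommended 5-25")
        if not self.agent_owners:
            issues.append("no agent owner assigned")
        if not self.workflows:
            issues.append("no workflows listed for evaluation")
        return issues
```

An empty list from `problems()` means the scope matches the constraints above. Anything else is a conversation to have before kickoff, not necessarily a blocker.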
Detailed requirements checklist
Infrastructure & security
- Sourcegraph running (cloud or self-hosted)
- Code hosts connected: GitHub, GitLab, Bitbucket, Gerrit, Perforce (as applicable)
- Repositories for the pilot cloned and indexed
- TLS in place for the Sourcegraph endpoint
- Network access from agent environment to /.api/mcp
Identity, auth, and access control
- SSO configured via SAML, OpenID Connect, or OAuth
- SCIM provisioning (optional, but recommended for user lifecycle)
- RBAC roles defined for pilot users and any agent service accounts
- Decision: per-user auth vs. dedicated agent account
- Confirmation of Zero data retention posture aligned with legal/security expectations
MCP configuration
- MCP server enabled and reachable at https://<sourcegraph>/.api/mcp
- OAuth Dynamic Client Registration enabled (default for GA)
- Logging for MCP access and errors
- Monitoring/alerts for availability during the pilot
Agent integration
- Chosen agents identified (Amp, Claude Code, VS Code, Cursor, internal agents)
- MCP endpoints registered with each agent
- Instructions or scripts ready for pilot users (e.g., commands shown above)
- Tooling prompts wired so agents prefer Sourcegraph for code discovery and navigation
Governance and risk management
- Documented scope of repos and teams included in the pilot
- Written policy for what agents may and may not do (e.g., read-only vs. read plus proposed edits)
- Defined audit expectations: who can review agent queries and results, and how
- Rollback plan if you need to disable MCP access quickly (revoking tokens, disabling MCP endpoint, or agent feature flags)
What “success” looks like for a pilot
A good pilot doesn’t just prove that the MCP server works; it proves that agents and humans can safely share a unified code understanding layer.
Concrete signals you should look for:
- Agents answer deeper questions about legacy code
  - Developers report that the agent can explain non-trivial flows, not just surface-level summaries
  - Deep Search can walk through “how does data move from this API to that database table?” with links
- Access is respected, not bypassed
  - No agent can see repos that a user couldn’t open in Sourcegraph directly
  - Security and compliance teams sign off that SSO + RBAC behavior matches human access
- Developers can verify answers in the code
  - Every agent response grounded in Sourcegraph MCP includes pointers back to the relevant files, symbols, or search queries
  - Engineers use those links to quickly confirm or refine what the agent suggests
- Agent workflows align with existing Sourcegraph usage
  - Engineers move naturally between Deep Search answers, Code Search queries, and Code Navigation
  - Teams begin to sketch follow-on workflows (Batch Changes, Monitors, Insights) based on what the agent uncovers
Once you see these, scaling from “pilot” to “standard platform capability for agents” is mostly a question of widening the repo scope and tightening governance.
Practical rollout plan (2–4 weeks)
Here’s a simple timeline many enterprises can follow.
Week 1: Foundations
- Stand up or validate your Sourcegraph instance
- Connect code hosts and ensure pilot repos are indexed
- Confirm SSO, SCIM (if used), and RBAC are in place
- Enable MCP and confirm connectivity from your pilot agent environment
Week 2: Agent wiring
- Configure MCP connections for your target agents (Amp, Claude Code, VS Code, Cursor, internal)
- Run a small set of “golden queries” across the pilot repos to validate:
  - Search coverage
  - Navigation accuracy
  - Access control behavior
- Document pilot scope and rules of engagement
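The “golden queries” step in Week 2 can be sketched as a tiny harness. Each golden query lists repos that must show up (coverage) and repos that must never show up for the identity under test (access control). The search itself is injected as a function, so nothing here assumes a particular Sourcegraph API; in a real pilot that function would wrap whatever MCP client your agent uses. All names are hypothetical, and navigation accuracy would need its own checks.

```python
from dataclasses import dataclass, field
from typing import Callable

# query -> set of repos that returned results for the identity under test
SearchFn = Callable[[str], set[str]]


@dataclass
class GoldenQuery:
    query: str
    must_include: set[str] = field(default_factory=set)  # coverage
    must_exclude: set[str] = field(default_factory=set)  # access control


def evaluate(queries: list[GoldenQuery], search: SearchFn) -> list[str]:
    """Run every golden query and return a list of failure descriptions."""
    failures = []
    for golden in queries:
        found = search(golden.query)
        missing = golden.must_include - found
        leaked = golden.must_exclude & found
        if missing:
            failures.append(f"{golden.query!r}: missing {sorted(missing)}")
        if leaked:
            failures.append(f"{golden.query!r}: leaked {sorted(leaked)}")
    return failures
```

Run the same query set once per identity type in the pilot (for example, a pilot engineer and the agent service account). An empty result for both is the evidence security reviewers will want to see.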
Week 3–4: Pilot usage and feedback
- Roll out to the initial pilot group
- Ask them to use the agent + MCP for:
  - Code comprehension on unfamiliar services
  - Tracing complex call paths
  - Investigating incidents and regressions
- Collect qualitative feedback and a few quantitative metrics:
  - Number of Deep Search queries
  - Frequency of cross-repo navigation
  - Time saved vs. purely manual search
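The quantitative metrics are easier to compare across teams if you decide up front how to compute them. The Python sketch below is one possible shape, assuming pilot users self-report a few numbers per task. The `TaskSample` fields are invented for illustration, and self-reported time estimates should be read as rough signals rather than measurements.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class TaskSample:
    task: str
    manual_minutes: float      # engineer's estimate for manual search
    agent_minutes: float       # time actually spent with agent + MCP
    deep_search_queries: int
    cross_repo_hops: int       # navigations that crossed a repo boundary


def summarize(samples: list[TaskSample]) -> dict[str, float]:
    """Aggregate pilot samples into the three metrics listed above."""
    if not samples:
        raise ValueError("no samples collected yet")
    saved = [s.manual_minutes - s.agent_minutes for s in samples]
    with_hops = sum(1 for s in samples if s.cross_repo_hops > 0)
    return {
        "deep_search_queries": sum(s.deep_search_queries for s in samples),
        "share_with_cross_repo_navigation": with_hops / len(samples),
        "median_minutes_saved": median(saved),
    }
```

The median is used instead of the mean so that one unusually long incident investigation does not dominate the time-saved figure.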
At the end, you should have enough data to decide whether to:
- Expand Sourcegraph MCP to more teams and repos
- Integrate with more agents or add Batch Changes / Monitors / Insights into your agent workflows
- Formalize policies for agent use backed by Sourcegraph’s code understanding platform
Final verdict: what’s really “required” for a credible pilot
You don’t need a perfect, fully standardized environment to start. You do need:
- A real Sourcegraph instance connected to the same GitHub, GitLab, Bitbucket, Gerrit, or Perforce hosts your teams use
- MCP enabled and reachable, with OAuth Dynamic Client Registration and SSO wired up
- RBAC controls that mirror human access, plus a clear choice between per-user and service-account auth
- A well-scoped pilot: a defined set of repos, teams, agents, and workflows
- Governance basics—logging, auditing, and an explicit risk envelope that security and legal can live with
From there, the Sourcegraph MCP server acts as the bridge between your internal AI agents and your permissioned codebase. It gives agents the same fast, exhaustive search and navigation capabilities your developers rely on, while keeping everything under the same identity and compliance posture.