
How do we connect Sourcegraph to GitHub Enterprise, GitLab, and Bitbucket so permissions match what users can actually access?
Most teams don’t get into trouble when they first connect Sourcegraph to GitHub Enterprise, GitLab, or Bitbucket. The real problems show up later—when access drifts, new projects come online, and suddenly Deep Search results show files a developer shouldn’t see, or an AI agent suggests code from a restricted repo. The fix is simple in principle: Sourcegraph must reuse the same identity and authorization model as your code hosts so that users (and agents) see only what they’re already allowed to access.
This guide walks through how to connect Sourcegraph to GitHub Enterprise, GitLab, and Bitbucket so repository permissions stay aligned with reality—at scale.
How Sourcegraph handles permissions across code hosts
Sourcegraph is a code understanding platform that sits as a universal layer over your existing code hosts. It doesn’t replace GitHub, GitLab, or Bitbucket. It reuses them.
At a high level:
- Authentication
  - Users sign in to Sourcegraph via Single Sign-On (SSO) using SAML, OpenID Connect, or OAuth, or via the code host directly.
  - This creates a Sourcegraph user tied to a verified identity.
- Code host connections
  - You configure connections to GitHub Enterprise, GitLab, and Bitbucket (Server/Data Center or Cloud).
  - Sourcegraph uses access tokens / app credentials that are scoped to fetch repository metadata and contents.
- Permission syncing
  - Sourcegraph periodically fetches repository and membership information from each code host.
  - For each user, Sourcegraph maps their identity to their code host accounts and enforces repository-level access in search, navigation, and AI features.
- RBAC on top
  - Sourcegraph adds Role-based Access Controls (RBAC) for Sourcegraph-specific capabilities (admin, batch changes, org-level features).
  - Underneath, repository visibility is still governed by the code hosts.
Result: If a user can’t see a repo in GitHub Enterprise, they shouldn’t see it in Sourcegraph search, Deep Search answers, or AI agent suggestions either.
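The enforcement idea above can be pictured as a filter applied to every result before it reaches the user. The following is only a conceptual sketch, not Sourcegraph's implementation: the `SearchHit` type, the `synced_permissions` mapping, and the repository names are all hypothetical stand-ins for data that a permission sync would produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    repo: str  # repository the match came from
    path: str  # file path inside that repository


def filter_hits(hits, allowed_repos):
    """Keep only hits from repositories the user can read on the code host."""
    return [hit for hit in hits if hit.repo in allowed_repos]


# Hypothetical result of a permission sync: user -> readable repositories.
synced_permissions = {
    "alice": {"github.example.com/acme/api", "gitlab.example.com/platform/infra"},
    "bob": {"github.example.com/acme/api"},
}

# Hypothetical raw matches for one query, before permission filtering.
hits = [
    SearchHit("github.example.com/acme/api", "server/main.go"),
    SearchHit("gitlab.example.com/platform/infra", "terraform/vpc.tf"),
]


def visible_hits(user):
    # Unknown users fall back to an empty set, so they see nothing.
    return filter_hits(hits, synced_permissions.get(user, set()))
```

The default of an empty set for unknown users is the important design choice here: a missing mapping should fail closed rather than expose everything.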
Comparison overview: GitHub Enterprise vs GitLab vs Bitbucket integration
Quick Answer: The best overall choice for tight, scalable permission alignment is Sourcegraph connected via SSO + code host OAuth/App integration for each platform. If your priority is simplified GitLab-centric governance, GitLab integration is often the smoother fit. For mixed Bitbucket Server/Data Center environments with complex on-prem constraints, Bitbucket integration gives you the necessary control and flexibility.
At-a-Glance Comparison
| Rank | Option | Best For | Primary Strength | Watch Out For |
|---|---|---|---|---|
| 1 | GitHub Enterprise + SSO | Hybrid enterprises with GitHub as primary host | Tight mapping of org/team permissions into Sourcegraph; strong OAuth/App ecosystem | Need to standardize how GitHub orgs/teams reflect real access models |
| 2 | GitLab + SSO | Orgs using GitLab groups/projects as the main structure | Group-based access translates cleanly into Sourcegraph repo visibility | Complex group nesting can hide implicit access rules if not audited |
| 3 | Bitbucket (Server/Data Center/Cloud) + SSO | Legacy and self-managed environments with Bitbucket at the core | Fine-grained project/repo permissions preserved in search and automation | Mixed deployment modes and custom auth require more careful setup and testing |
Key criteria for “permissions actually match” setups
When I’ve rolled this out in regulated environments, three criteria mattered most:
- Identity consistency: Sourcegraph users must be the same people (and service accounts) as in your IdP and code hosts. This is typically enforced via SAML/OIDC and SCIM, so there are no “mystery users” or stale accounts.
- Repository visibility parity: If a repo is private or restricted in GitHub/GitLab/Bitbucket, Sourcegraph must respect that in all workflows: Code Search, Deep Search, Batch Changes, Monitors, and AI agents.
- Auditable behavior: Security teams need logs and clear mappings: which user, which repos, which actions. Sourcegraph’s audit logs, together with its SOC 2 Type II and ISO 27001 posture, make this reviewable.
The sections below walk through each code host with those criteria in mind.
1. Connecting Sourcegraph to GitHub Enterprise so permissions match
For a GitHub-first organization (Cloud or Enterprise Server), the goal is straightforward: reuse your GitHub authentication and org/team permissions inside Sourcegraph.
Step 1: Align identities (SSO + SCIM if available)
- Choose SSO standard:
  - Use SAML or OpenID Connect from your IdP (Okta, Azure AD, etc.) to authenticate users into Sourcegraph.
  - This gives you centralized login and MFA, and lets you deprovision accounts consistently.
- Map claims to usernames/emails:
  - Configure Sourcegraph to use the same identifiers (email or username) that your GitHub Enterprise instance uses.
  - This is critical so Sourcegraph can associate a signed-in user with the correct GitHub account.
- Enable SCIM (optional but ideal):
  - If you use SCIM user management, tie it into Sourcegraph so user creation/deactivation tracks your IdP state.
Outcome: Identity consistency. No separate manual user management layer.
Step 2: Configure GitHub Enterprise as a code host connection
In the Sourcegraph admin UI or configuration file:
- Add a GitHub connection
  - Point it to your GitHub Enterprise URL (or GitHub.com if cloud).
  - Provide an access token or GitHub App credentials with repository read permissions and (optionally) org read access.
  - Use scopes that match your governance model, and avoid over-broad admin scopes unless absolutely necessary.
- Specify which repos to sync
  - You can sync all repos or filter by orgs, visibility, or naming patterns.
  - For permissions to work, ensure that every repo users should be able to search in Sourcegraph is actually included in this sync set.
- Enable permission enforcement from GitHub
  - Verify that the connection is configured to respect GitHub repository permissions.
  - Sourcegraph will fetch per-user repo access information from GitHub and apply it to search results and code navigation.
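As a rough illustration of what such a connection might contain, the sketch below builds the configuration as a Python dictionary and prints it as JSON. The field names (`url`, `token`, `orgs`, `authorization`) reflect my understanding of Sourcegraph's GitHub code host schema and should be checked against the documentation for your version. The host name, org, and token are placeholders, and `enforces_permissions` is a hypothetical helper rather than anything Sourcegraph ships.

```python
import json

# Sketch of a GitHub code host connection. All values are placeholders.
github_connection = {
    "url": "https://github.example.com",  # your GitHub Enterprise URL
    "token": "<read-scoped token>",       # never commit a real token
    "orgs": ["acme"],                     # limit the sync set to one org
    "authorization": {},                  # assumed to switch on permission enforcement
}


def enforces_permissions(connection: dict) -> bool:
    """Hypothetical pre-flight check: is the authorization field present?"""
    return "authorization" in connection


print(json.dumps(github_connection, indent=2))
```

A check like `enforces_permissions` could run in CI against stored connection configs, so that a connection without permission enforcement is caught before it is applied.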
Step 3: Map GitHub accounts to Sourcegraph users
There are two common patterns:
- GitHub OAuth login
  - Users log in to Sourcegraph via “Sign in with GitHub.”
  - Sourcegraph uses the linked GitHub account to determine which repos the user can see.
- SSO + linked external accounts
  - Users sign in via SAML/OIDC from your IdP.
  - You then link the Sourcegraph account to the GitHub account using external account metadata (username/email).
For most enterprises, SSO + external account linking is preferred because it keeps identity under the IdP while still leveraging GitHub for repo permissions.
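Before relying on account linking, it can help to check how many IdP users actually have a matching code host account. The sketch below assumes you can export both lists yourself. The record shapes (`username`, `email`, `login`) and the `link_accounts` helper are hypothetical, and matching on normalized email is one possible strategy, not necessarily how Sourcegraph links accounts.

```python
def normalize(email: str) -> str:
    """Compare emails case-insensitively and ignore stray whitespace."""
    return email.strip().lower()


def link_accounts(idp_users, host_accounts):
    """Return (linked, unlinked): IdP username -> host login, plus leftovers."""
    by_email = {normalize(acct["email"]): acct["login"] for acct in host_accounts}
    linked, unlinked = {}, []
    for user in idp_users:
        login = by_email.get(normalize(user["email"]))
        if login is None:
            unlinked.append(user["username"])  # needs manual follow-up
        else:
            linked[user["username"]] = login
    return linked, unlinked


# Hypothetical exports from the IdP and from the code host.
idp_users = [
    {"username": "alice", "email": "Alice@Example.com"},
    {"username": "bob", "email": "bob@example.com"},
]
host_accounts = [{"login": "alice-gh", "email": "alice@example.com "}]

linked, unlinked = link_accounts(idp_users, host_accounts)
```

Any username that ends up in `unlinked` is a user whose repository permissions cannot be resolved until the identifiers are aligned.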
Step 4: Validate permission parity
Before you roll out broadly, spot-check:
- A user who can see private org repos in GitHub can find them via Sourcegraph search and Deep Search.
- A user who can only see public repos in GitHub sees no private repositories in Sourcegraph.
- A user removed from a GitHub org loses access to those repos in Sourcegraph after the next sync.
You can test this by using two test accounts (one with broad access, one restricted) and running identical queries.
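The comparison between the two test accounts can be reduced to a set difference. This is a minimal sketch that assumes you have already collected, by whatever means you prefer, the set of repositories each account sees on the code host and the set it can find in Sourcegraph. The `parity_report` function and the repository names are illustrative.

```python
def parity_report(host_visible: set, sourcegraph_visible: set) -> dict:
    """Compare what a user sees on the code host with what Sourcegraph returns."""
    return {
        # Visible in Sourcegraph but not on the host: a permission leak.
        "leaked": sorted(sourcegraph_visible - host_visible),
        # Visible on the host but not in Sourcegraph: not synced or not indexed.
        "missing": sorted(host_visible - sourcegraph_visible),
    }


# Hypothetical observations for the restricted test account.
host_visible = {"acme/docs", "acme/web"}
sourcegraph_visible = {"acme/docs", "acme/payments"}

report = parity_report(host_visible, sourcegraph_visible)
```

A non-empty `leaked` list is the case to treat as a blocker, while `missing` usually points at the sync set or a sync that has not run yet.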
Decision Trigger: Use GitHub Enterprise as your primary Sourcegraph connection model when GitHub is your system of record for repository access and teams, and you want Sourcegraph to inherit GitHub’s org/team permissions with minimal extra policy work.
2. Connecting Sourcegraph to GitLab so permissions match
With GitLab, the permission model is typically group- and project-based. Sourcegraph can mirror that structure so that users only see the projects and subgroups they already have access to.
Step 1: Use IdP-driven SSO into Sourcegraph
As with GitHub:
- Configure SAML or OpenID Connect between your IdP and Sourcegraph.
- Ensure usernames/emails align with what GitLab uses for user accounts.
- Optionally use SCIM so identity lifecycle is controlled centrally.
This keeps identity consistent across Sourcegraph and GitLab.
Step 2: Add GitLab as a code host connection
In Sourcegraph admin settings:
- Point to your GitLab instance
  - GitLab.com or your self-managed GitLab URL.
  - Provide personal access tokens or application credentials with read access to projects and groups.
- Select which projects to index
  - You can index all projects or filter by groups/paths.
  - Make sure that any project you expect users to search appears in the connection configuration.
- Enable permission sync from GitLab
  - Sourcegraph should be configured to respect GitLab’s project visibility and group memberships.
  - This means private projects remain private in Sourcegraph search.
Step 3: Link GitLab accounts to Sourcegraph users
You can:
- Let users sign in via GitLab OAuth, or
- Keep SSO via your IdP and associate external accounts (GitLab usernames/emails) in Sourcegraph.
The goal is the same: each Sourcegraph user must correlate to a GitLab user so Sourcegraph can query GitLab for “which projects does this user see?”
Step 4: Test group-based access propagation
Check that:
- Users in a parent group with inherited access see all the right child project repos in Sourcegraph.
- Users removed from a group lose those repos in Sourcegraph after sync.
- Members of confidential projects do not leak content via Deep Search to non-members.
If you have complex group nesting, run a short audit: pick a few representative users and compare their “Projects I can see in GitLab” to “Repos I can find in Sourcegraph” using equivalent search queries.
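For the audit, it is useful to have a rough model of what inherited access should look like. The sketch below is a deliberate simplification: it treats group membership as path-prefix inheritance only and ignores direct project membership, shared groups, and visibility levels. The `inherits_access` helper and the group and project paths are made up for illustration.

```python
def inherits_access(member_groups: set, project_path: str) -> bool:
    """True if the project's namespace is one of the groups or nested under one."""
    namespace = project_path.rsplit("/", 1)[0]  # "platform/infra/network" -> "platform/infra"
    return any(
        namespace == group or namespace.startswith(group + "/")
        for group in member_groups
    )


# Hypothetical projects and one user's group memberships.
projects = [
    "platform/infra/network",
    "platform/tools",
    "payments/ledger",
]
member_groups = {"platform"}

expected_visible = [p for p in projects if inherits_access(member_groups, p)]
```

Matching on `group + "/"` rather than a bare prefix avoids treating a group named `plat` as a parent of `platform`.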
Decision Trigger: Lean on GitLab integration when your governance and access controls are primarily expressed as GitLab groups/projects, and you want Sourcegraph to mirror that structure without adding new policy layers.
3. Connecting Sourcegraph to Bitbucket so permissions match
Bitbucket often anchors older or hybrid environments: Bitbucket Server/Data Center on-prem, sometimes Bitbucket Cloud. The permission model is project- and repo-based, with per-repo and per-project ACLs. Sourcegraph can respect these.
Step 1: Centralize identity via SSO
Even if Bitbucket uses its own auth backend today, you’ll get cleaner behavior if:
- Sourcegraph uses SAML/OIDC to connect to your IdP.
- User identifiers in Sourcegraph match the accounts used in Bitbucket (Cloud or Server/Data Center).
Over time, you may also standardize Bitbucket authentication the same way, but Sourcegraph doesn’t require that on day one; it just needs a reliable user-to-Bitbucket mapping.
Step 2: Configure Bitbucket as a code host
In Sourcegraph admin:
- Add a Bitbucket Server/Data Center or Bitbucket Cloud connection
  - Provide the base URL (for Server/Data Center) or cloud hostname.
  - Supply credentials (PAT, app password, or app credentials) with read access to repositories and projects.
- Define which projects and repos to sync
  - You can sync entire projects or specific repos.
  - For large footprints (hundreds/thousands of repos), start with core projects and incrementally expand.
- Enable permission enforcement
  - Configure Sourcegraph to respect Bitbucket repository and project permissions, so private projects stay private.
Step 3: Map Sourcegraph users to Bitbucket identities
As with the other hosts:
- Either let users sign in via Bitbucket OAuth, or
- Use SSO via your IdP and link Bitbucket external accounts inside Sourcegraph.
In more regulated environments, the second option is preferred, because it keeps Identity & Access Management in a single place while still using Bitbucket as the source of truth for repository access.
Step 4: Validate project/repo-level parity
Test that:
- Users with access to a restricted Bitbucket project can find those repos in Sourcegraph, but not projects they don’t belong to.
- Removing a user from a Bitbucket project removes their access to those repos in Sourcegraph after sync.
- Cross-project patterns (e.g., searching for an internal package name) don’t surface results from projects a user can’t see.
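The checks above amount to saying that a result is visible only if the user has a grant at the project level or on the specific repo. Here is a small sketch of that rule applied to a cross-project search. The grant tables, project keys, and the `can_read` helper are hypothetical, and real Bitbucket permissions also involve groups and permission levels that this ignores.

```python
def can_read(user, project_key, repo_slug, project_grants, repo_grants):
    """A user can read a repo via a project-level grant or a repo-level grant."""
    in_project = user in project_grants.get(project_key, set())
    in_repo = user in repo_grants.get((project_key, repo_slug), set())
    return in_project or in_repo


# Hypothetical ACLs: alice has the whole PAY project, bob only one repo in it.
project_grants = {"PAY": {"alice"}}
repo_grants = {("PAY", "ledger"): {"bob"}}

# Hypothetical matches for a search on an internal package name.
hits = [("PAY", "ledger"), ("PAY", "gateway"), ("OPS", "runbooks")]


def visible(user):
    return [hit for hit in hits if can_read(user, *hit, project_grants, repo_grants)]
```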
Decision Trigger: Choose Bitbucket as your primary connection path when you have a significant on-prem or legacy footprint with Bitbucket at the center and you need Sourcegraph to respect project-level ACLs exactly as Bitbucket enforces them.
Layering Sourcegraph RBAC on top of host permissions
Code host permissions determine which code a user can see. Sourcegraph RBAC determines what they can do with Sourcegraph itself.
Typical patterns:
- Basic users: Can search, use Deep Search, and navigate only the repos they already have permission to in GitHub/GitLab/Bitbucket.
- Power users / platform team: Can create Batch Changes to propose refactors across repos they can see, define Insights dashboards, and configure Monitors to watch for risky patterns.
- Admins: Can manage code host connections, RBAC roles, and global settings.
Key points:
- RBAC never expands repository visibility beyond what the code host allows.
- RBAC scopes operational power (e.g., who can run a multi-repo refactor) but still relies on underlying host permissions for the actual repo list.
This is where governance teams get comfortable: Sourcegraph becomes an operational layer on top of host permissions, not a new bypass path.
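One way to picture the two layers is as two independent checks: the role decides whether an action is allowed at all, and the host permissions decide which repos it can touch. The role names, action names, and the `batch_change_targets` function below are illustrative only and do not correspond to Sourcegraph's actual role definitions.

```python
# Hypothetical roles and the Sourcegraph-level actions each one allows.
ROLE_ACTIONS = {
    "basic": {"search"},
    "power": {"search", "batch_changes"},
    "admin": {"search", "batch_changes", "manage_connections"},
}


def batch_change_targets(role: str, requested: set, host_visible: set) -> set:
    """RBAC gates the action; host permissions bound the repo list."""
    if "batch_changes" not in ROLE_ACTIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not create batch changes")
    # The intersection can only shrink the request, never widen visibility.
    return requested & host_visible
```

Because the result is an intersection with `host_visible`, no role, including an admin-like one, can add a repository the code host does not already allow.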
Ensuring AI and agents don’t bypass permissions
With agentic AI search front and center, the biggest risk isn’t human misuse. It’s agents hallucinating or leaking context from restricted code.
Sourcegraph addresses this by:
- Applying the same permission filters to Deep Search and AI agents that it uses for Code Search.
- Offering zero data retention for LLM inference: customer code context used for AI answers isn’t stored beyond what’s required for the interaction and isn’t shared with third parties.
- Recording search and access events in audit logs, so security teams can trace behavior when needed.
Practically: An AI agent integrated via Sourcegraph MCP or APIs can only see the same repos the human user (or service account) behind it can see, because it’s using Sourcegraph’s search/navigation layer, which is already permission-aware.
Operational tips to keep permissions aligned over time
Connecting once isn’t enough. You need a maintenance plan:
- Schedule regular permission syncs
  - Ensure Sourcegraph is configured to sync permissions and repo metadata on a schedule that matches your org’s rate of change (e.g., every few minutes or hourly).
- Use audit logs with your SIEM
  - Forward Sourcegraph audit logs into your central logging or SIEM stack to monitor access patterns and investigate anomalies.
- Standardize naming and grouping in code hosts
  - Clean, consistent orgs/groups/projects in GitHub/GitLab/Bitbucket make permission reasoning and debugging easier.
- Test with synthetic users
  - Maintain one or two test accounts with intentionally limited access to confirm that permission boundaries are correctly enforced after configuration or policy changes.
- Align identity lifecycle
  - Use SCIM where possible so deprovisioning in your IdP automatically cascades to Sourcegraph and indirectly to code host-linked access.
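A simple monitoring check for the sync schedule is to flag users whose permissions have not been refreshed within your tolerance window. The sketch below assumes you can obtain a last-synced timestamp per user from your own monitoring data. The `stale_users` function, the timestamps, and the one-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_users(last_synced: dict, now: datetime, max_age: timedelta) -> list:
    """Return users whose last permission sync is older than max_age."""
    return sorted(
        user for user, synced_at in last_synced.items() if now - synced_at > max_age
    )


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_synced = {
    "alice": now - timedelta(minutes=10),
    "bob": now - timedelta(hours=3),
}

print(stale_users(last_synced, now, timedelta(hours=1)))  # ['bob']
```

A list like this could feed an alert in the same SIEM pipeline that receives the audit logs.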
Final verdict
To connect Sourcegraph to GitHub Enterprise, GitLab, and Bitbucket in a way where permissions match what users can actually access, you need three things working together:
- Consistent identity via SSO (SAML/OIDC) and optional SCIM, so Sourcegraph users are the same users as in your IdP and code hosts.
- Properly scoped code host connections to GitHub/GitLab/Bitbucket with permission syncing enabled, so repository visibility in Sourcegraph mirrors each host’s own ACLs.
- Sourcegraph RBAC and audit logs layered on top, so you can control what users and agents can do with that understanding—Batch Changes, Monitors, Insights—without ever expanding their underlying access.
Do that, and Sourcegraph becomes a trusted code understanding layer—across GitHub, GitLab, Bitbucket, and beyond—that humans and AI agents can rely on to search, understand, and automate changes safely across 100 or 1M repositories.