
After a deploy, users say “it’s broken” but I can’t reproduce it—how do I capture what actually happened in production?
When someone says “it’s broken” right after a deploy and you can’t reproduce it, what you’re really missing isn’t effort; it’s production context. You need a way to see exactly what the user saw, what the code did, and what changed in that release, without guessing or redeploying with extra debug logging turned on.
Quick Answer: Use Sentry’s SDKs to capture production errors, performance data (transactions and spans), and Session Replays, then tie those events to releases and deploys. That gives you a clickable path from “user said it’s broken” to the exact stack trace, slow span, replay, and suspect commit that caused it.
The Quick Overview
- What It Is: A developer-first way to capture what actually happened in production—errors, slowdowns, and user sessions—right after a deploy, then trace it back to the exact code and release.
- Who It Is For: Engineering teams who ship frequently, hear about issues from users or support, and struggle to reproduce “it’s broken” reports in staging.
- Core Problem Solved: Fragmented debugging context. Instead of logs in one place, performance in another, and vague user reports, Sentry connects errors, traces, replays, and deploy metadata into a single, debuggable issue.
How It Works
At a high level, you instrument your application with Sentry SDKs. When users hit errors or performance issues in production, the SDK sends events to Sentry: errors/exceptions, transactions (performance), spans, and optional Session Replays. Sentry enriches these events with environment, release, and commit data, groups them into issues, and ties them to deploys. You can then pivot from an alert or user complaint straight into “what happened in that session and where did the code slow down or fail?”
- Instrument your app with Sentry SDKs:
  Add Sentry SDKs to your frontend and backend (and any services in between). Configure them to capture:
  - Errors/exceptions with full stack traces and local variables (where supported).
  - Performance data via transactions and spans.
  - Session Replays to see user clicks, page transitions, and console logs.
  - Custom tags (tenant ID, account tier, feature flags) to filter and reproduce.
- Connect events to releases and deploys:
  When you deploy, send release and commit metadata to Sentry:
  - Releases identify the exact version users were on when “it’s broken” happened.
  - Deploys and changesets let Sentry surface Suspect Commits, the code most likely at fault.
  - Ownership Rules route issues to the right team based on paths, tags, or URLs.
- Debug from issue → trace → replay → fix:
  When a report comes in:
  - Find the error or performance issue in Sentry (via alert, Discover, or Releases).
  - Open the transaction trace to see which spans are slow or failing across services.
  - Watch the associated Session Replay to see exactly what the user did.
  - Use logs, profiling, and commit context to identify root cause, then push a fix and verify in the same workflows.
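As a rough illustration of the first two steps, here is a minimal sketch that assembles the keyword arguments you might pass to the Python SDK's `sentry_sdk.init()`. The helper name `build_sentry_options` and the variable names `APP_ENV`, `APP_NAME`, `APP_VERSION`, and `SENTRY_TRACES_SAMPLE_RATE` are assumptions made for this example, and the DSN shown is a placeholder. The sketch uses only the standard library so it runs without the SDK installed.

```python
def build_sentry_options(env: dict) -> dict:
    """Assemble keyword arguments for sentry_sdk.init() from a settings mapping.

    All keys read from `env` except SENTRY_DSN are hypothetical names
    chosen for this sketch.
    """
    return {
        "dsn": env.get("SENTRY_DSN", ""),
        "environment": env.get("APP_ENV", "production"),
        # The release ties every event to the version the user was running.
        "release": f"{env.get('APP_NAME', 'app')}@{env.get('APP_VERSION', 'unknown')}",
        # Fraction of transactions sent for tracing (1.0 would send all of them).
        "traces_sample_rate": float(env.get("SENTRY_TRACES_SAMPLE_RATE", "0.2")),
    }


# Example settings; in a real service these would likely come from os.environ.
settings = {
    "SENTRY_DSN": "https://publicKey@example.invalid/1",  # placeholder DSN
    "APP_NAME": "frontend",
    "APP_VERSION": "3.14.0",
}
options = build_sentry_options(settings)
print(options["release"])  # frontend@3.14.0

# With the sentry-sdk package installed, the options would be applied as:
#   import sentry_sdk
#   sentry_sdk.init(**options)
```

Building the release string in one place keeps the value your CI/CD pipeline reports and the value the SDK attaches to events from drifting apart.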
Features & Benefits Breakdown
| Core Feature | What It Does | Primary Benefit |
|---|---|---|
| Error Monitoring & Stack Traces | Captures exceptions with stack traces, local variables (where supported), environment, and release data. Groups similar errors into issues. | You see the exact line of code and conditions that broke in production, instead of trying to guess from a vague “it’s broken.” |
| Tracing (Transactions & Spans) | Records end-to-end transactions across services (e.g., frontend → backend → third-party APIs) and breaks them into spans with timing. | You can see where the slowdown or failure occurred in a real user’s path, not just that “the page is slow.” |
| Session Replay | Reconstructs user sessions as video-like replays built from DOM state, navigation, and console logs, tied to errors and performance events. | You see what the user did and saw (clicks, screens, and errors), so you can understand the issue without having to recreate the user's exact situation. |
You can further enrich this with Logs, Profiling, and Insights so when that one post-deploy bug hits production, you already have a full forensic record ready.
Ideal Use Cases
- Best for post-deploy regressions: Sentry ties new errors and slowdowns to the latest release, so you can see “what changed,” which users were impacted, and which commit is most likely at fault.
- Best for elusive, unreproducible bugs: Session Replay, tags, and environment context make it possible to debug issues that only occur under specific user states, feature flags, or environments.
How Sentry Captures “What Actually Happened” After a Deploy
Let’s walk through the workflow the way I would in a Sentry workshop.
1. Capture the error when it happens (not after someone complains)
With the SDK configured, Sentry automatically captures:
- Exceptions with:
  - Full stack traces (including local variables for supported platforms, like Python).
  - Request data (method, URL, headers, body size).
  - Environment and app version.
- Custom tags you define: customer_id, plan, feature_flag_x, region, etc.
  - This lets you filter to “only hits enterprise customers in EU” or “only when feature_flag_x is on.”
Instead of “it broke on checkout,” you’ll see an issue like:
- Error: TypeError: cannot read property 'foo' of undefined
- Release: frontend@3.14.0
- Environment: production
- Tags: plan=premium, feature_flag_new_checkout=true
So now you can immediately scope both who is impacted and under what conditions.
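To make the scoping idea concrete, here is a small sketch of filtering and counting events by tag. It operates on plain dictionaries rather than Sentry's API; the event shape, the sample data, and the helper names `matching` and `tag_breakdown` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical events, shaped loosely like the issue summary above.
events = [
    {"release": "frontend@3.14.0", "tags": {"plan": "premium", "region": "eu", "feature_flag_new_checkout": "true"}},
    {"release": "frontend@3.14.0", "tags": {"plan": "premium", "region": "us", "feature_flag_new_checkout": "true"}},
    {"release": "frontend@3.14.0", "tags": {"plan": "free", "region": "eu", "feature_flag_new_checkout": "false"}},
]


def matching(events, **required_tags):
    """Return the events whose tags contain every required key/value pair."""
    return [
        event
        for event in events
        if all(event["tags"].get(key) == value for key, value in required_tags.items())
    ]


def tag_breakdown(events, key):
    """Count how many events carry each value of the given tag."""
    return Counter(event["tags"].get(key, "<unset>") for event in events)


premium_eu = matching(events, plan="premium", region="eu")
print(len(premium_eu))  # 1
print(tag_breakdown(events, "feature_flag_new_checkout"))
```

The same question ("who is affected, and under what conditions?") is what tag filters answer in the Sentry UI; the sketch only shows why consistent tag names and values matter.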
2. See the performance impact with tracing
A lot of “it’s broken” reports are really “it’s slow” or “it’s timing out.” For that, you need tracing:
- Configure the SDK to send transactions (e.g., route loads like /checkout, /search, or backend endpoints).
- Within those transactions, instrument spans that represent specific operations:
  - Database queries
  - External API calls
  - Rendering phases
  - Heavy functions
When a user hits a regression:
- Sentry shows you the transaction (e.g., GET /checkout) with:
  - Duration
  - Breakdown of spans
  - Which spans are slow or errored
  - Connected errors (exceptions) within that request
This is how you go from “someone said checkout is slow” to “this PostgreSQL query in the orders service regressed from 50 ms to 500 ms after release X.”
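The comparison behind that kind of finding can be sketched in a few lines. This is an illustrative model, not Sentry's regression detection: the `Span` class, the sample spans, and the 2x threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    op: str            # kind of operation, e.g. "db" or "http"
    description: str   # what the span did
    duration_ms: float


def slowest_span(spans):
    """Return the span that took the longest within one transaction."""
    return max(spans, key=lambda span: span.duration_ms)


def regressed_spans(before, after, factor=2.0):
    """Pair each span that got at least `factor` times slower with its old duration."""
    baseline = {(s.op, s.description): s.duration_ms for s in before}
    regressions = []
    for span in after:
        old = baseline.get((span.op, span.description))
        if old is not None and old > 0 and span.duration_ms >= factor * old:
            regressions.append((span, old))
    return regressions


# Hypothetical spans for GET /checkout before and after a release.
previous = [Span("db", "SELECT orders", 50.0), Span("http", "POST payment API", 120.0)]
current = [Span("db", "SELECT orders", 500.0), Span("http", "POST payment API", 125.0)]

for span, old in regressed_spans(previous, current):
    print(f"{span.op} {span.description}: {old:.0f} ms -> {span.duration_ms:.0f} ms")
# db SELECT orders: 50 ms -> 500 ms
```

Matching spans on operation plus description is a simplification; real traces usually need normalized descriptions (for example, parameterized SQL) before spans from two releases can be compared.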
3. Watch what the user actually did with Session Replay
Sometimes the stack trace and trace are correct, but you still can’t reproduce the bug. That’s where Session Replay closes the loop:
- Replays are tied to errors and performance events.
- From an issue, you can click into “View Replay” and:
  - Watch the user navigate and interact with the app.
  - See console errors and warnings alongside UI behavior.
  - Observe how they got into a bad state (e.g., toggling a feature flag, navigating back/forward, or an edge-case form input).
This is particularly useful when:
- You have stateful behavior that’s hard to mock in tests.
- The bug only happens on certain devices, locales, or screen sizes.
- UX issues aren’t technically “errors” but absolutely feel broken to the user.
4. Tie issues to releases and suspect commits
To understand “what changed” when “it’s broken” starts happening:
- Use Sentry Releases:
  - Send release identifiers from your CI/CD pipeline.
  - Attach changesets (commits) to each release.
  - Mark deployments (environment + time + release).
Sentry then:
- Shows new errors and performance regressions by release.
- Highlights Suspect Commits—the commits most likely responsible for a new issue.
- Lets you see adoption—how many users are on the broken version vs previous ones.
This is how you go from “we deployed 20 commits and something broke” to “this commit touching payment_service.py is likely the culprit.”
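One way to build intuition for this is a toy heuristic: rank the commits in a release by how many files they share with the failing stack trace. This is a deliberately simplified illustration and should not be read as Sentry's actual Suspect Commits logic; the function name, commit shape, SHAs, and file paths are all hypothetical.

```python
def suspect_commits(stack_files, commits):
    """Rank commits by how many stack-trace files they touched (most overlap first)."""
    frames = set(stack_files)
    scored = []
    for commit in commits:
        overlap = frames & set(commit["files"])
        if overlap:
            scored.append((len(overlap), commit["sha"]))
    # sorted() is stable, so ties keep their original commit order.
    scored = sorted(scored, key=lambda item: item[0], reverse=True)
    return [sha for _, sha in scored]


# Hypothetical release contents and stack trace.
commits = [
    {"sha": "a1", "files": ["README.md"]},
    {"sha": "b2", "files": ["payments/payment_service.py", "payments/models.py"]},
    {"sha": "c3", "files": ["payments/models.py"]},
]
stack_files = ["payments/payment_service.py", "payments/models.py", "api/views.py"]

print(suspect_commits(stack_files, commits))  # ['b2', 'c3']
```

The heuristic only works if the release actually has its commits attached, which is why sending changesets from CI/CD matters.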
5. Route to the right team with Ownership Rules
No one wants to be the engineer who gets paged for issues they can’t fix. Ownership matters:
- Define Ownership Rules (by file path, URL, tag, etc.).
- Map paths and signals to teams (e.g., team:payments, team:frontend).
- Sentry uses these to:
  - Auto-assign new issues.
  - Send alerts to the people who can actually fix them.
When “it’s broken” hits right after a deploy, you want the issue to skip the triage queue and go straight to the owning team with all relevant context attached.
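The routing idea can be sketched as a small pattern matcher. The rule tuples below are a simplified stand-in invented for this example, not Sentry's Ownership Rules syntax; the patterns, the `team:enterprise` owner, and the event shape are assumptions.

```python
from fnmatch import fnmatchcase

# (kind, glob pattern, owner); a hypothetical, simplified rule format.
RULES = [
    ("path", "src/payments/*", "team:payments"),
    ("url", "*/checkout*", "team:frontend"),
    ("tag", "plan=premium", "team:enterprise"),
]


def owners_for(event, rules=RULES):
    """Return the owners whose rules match the event, in rule order, without duplicates."""
    candidates = {
        "path": event.get("paths", []),
        "url": [event["url"]] if event.get("url") else [],
        "tag": [f"{key}={value}" for key, value in event.get("tags", {}).items()],
    }
    owners = []
    for kind, pattern, owner in rules:
        values = candidates.get(kind, [])
        if any(fnmatchcase(value, pattern) for value in values) and owner not in owners:
            owners.append(owner)
    return owners


event = {
    "paths": ["src/payments/payment_service.py"],
    "url": "https://shop.example.com/checkout",
    "tags": {"plan": "premium"},
}
print(owners_for(event))  # ['team:payments', 'team:frontend', 'team:enterprise']
```

Note that `fnmatchcase` lets `*` match across `/`, so `src/payments/*` also covers nested directories; a stricter matcher would be needed if that is not what you want.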
Limitations & Considerations
- You still have to instrument intelligently:
  Sentry captures a lot out of the box, but the best results come when you:
  - Add meaningful tags/contexts (e.g., user roles, feature flags).
  - Define transactions across key flows (login, checkout, search).
  - Add spans where you know performance is critical.

  Think of the SDK as part of your application code, not a bolt-on.
- Not a magic “no more bugs” button:
  Sentry won’t prevent all bugs (no one can). What it does is:
  - Shorten time to detection with alerts.
  - Shorten time to root cause with context.
  - Make it easier to prove you fixed the right thing with release and regression tracking.

  You still need tests, CI, and code review. Sentry just makes the inevitable production issues a lot less painful.
Pricing & Plans
Sentry is usage-based: you pay for the events you send (errors, transactions, replays, etc.), with quotas and optional pay-as-you-go overages.
At a high level:
- Developer / Team plans:
  - Good for smaller teams or individual services.
  - Include Error Monitoring, basic Tracing, and limited Session Replay.
  - Quotas for errors, transactions, replays, plus a limited number of dashboards (e.g., 10 on Developer, 20 on Team).
- Business+ / Enterprise:
  - For organizations that need scale, governance, and support.
  - Includes:
    - Larger/custom quotas and lookback windows.
    - SAML-based SSO and SCIM for user provisioning.
    - Organization audit logs.
    - Options for enterprise support and a technical account manager.
    - Data residency choices (US or Germany), with SOC 2 Type II, ISO 27001, and HIPAA attestation.
You can start small, see how much production context you use, and then tune quotas or add pay-as-you-go overages as your event volume grows.
- Developer/Team: Best for product engineering teams needing visibility into post-deploy issues and slowdowns without heavy governance overhead.
- Business+/Enterprise: Best for larger organizations needing centralized governance, SSO/SCIM, and higher-volume error and performance data, plus advanced controls.
Frequently Asked Questions
How does Sentry help when the bug only happens in production and not locally?
Short Answer: It captures production-only conditions (real user inputs, environments, and state transitions) and ties them to stack traces, traces, and replays so you don’t have to recreate production on your laptop.
Details:
Many issues never occur in local or staging environments because:
- Data is different.
- Traffic patterns are different.
- Feature flags, region configs, and user behaviors are different.
Sentry’s SDK captures:
- The exact stack trace with local variables (supported on certain runtimes like Python).
- Request metadata (URL, headers, user agent).
- Release and environment tags.
- Session Replay with clicks, navigation, and console logs.
So when someone says “it’s broken only in prod,” you’re not trying to simulate production from memory. You’re inspecting the real event and watching the real interaction that triggered it.
Can Sentry help prevent broken code from reaching production in the first place?
Short Answer: Yes. Sentry can surface likely issues during PR/code review using your historical error and performance context, and you can use performance and error trends to gate deploys.
Details:
Sentry isn’t just for “after it’s broken”:
- AI Code Review / PR context (beta):
  Uses your error and performance history plus commit context to flag risky changes before merge, so you break production less often.
- Release Health & Performance:
  Track crash-free sessions, performance degradation, and new error rates per release. If a new release looks bad, you can revert or roll forward with a fix quickly.
- Alerts & thresholds:
  Define alerts for:
  - New issues or spikes in errors.
  - Regression in transaction duration for key endpoints.
  - Drops in crash-free users/sessions.
This doesn’t replace tests, but it gives you a safety net tied to how your code actually behaves in real environments.
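For a sense of what a crash-free threshold check involves, here is a minimal sketch. The formula is the usual share of sessions that did not crash; the function names and the threshold values (a 99% floor and a 0.5-point drop) are assumptions picked for the example, not Sentry defaults.

```python
def crash_free_rate(total_sessions: int, crashed_sessions: int) -> float:
    """Share of sessions that ended without a crash (1.0 when there is no traffic)."""
    if total_sessions == 0:
        return 1.0
    return 1.0 - crashed_sessions / total_sessions


def should_alert(current: float, previous: float,
                 min_rate: float = 0.99, max_drop: float = 0.005) -> bool:
    """Alert if the rate falls below a floor or drops sharply versus the last release."""
    return current < min_rate or (previous - current) > max_drop


# Hypothetical session counts for the previous and the new release.
previous_release = crash_free_rate(total_sessions=10_000, crashed_sessions=10)
new_release = crash_free_rate(total_sessions=2_000, crashed_sessions=40)

print(should_alert(new_release, previous_release))  # True
```

Checking both an absolute floor and a relative drop catches two different failure modes: a release that is simply bad, and one that is still above the floor but clearly worse than what shipped before it.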
Summary
When users say “it’s broken” after a deploy and you can’t reproduce it, the real bottleneck is missing production context. Sentry solves that by:
- Capturing errors with full stack traces, local variables, and environment tags.
- Recording performance traces so you can see which spans and services regressed.
- Linking to Session Replays so you can watch what the user actually did.
- Connecting everything to releases, deploys, and suspect commits so you know what changed.
- Routing issues to the right owners via Ownership Rules and keeping the rest of your team informed.
The result: instead of days of guesswork, you’re often down to minutes from user report → root cause → fix.