We just had a spike in 500s right after shipping—what’s the fastest way to figure out what changed?

When 500s spike right after a deploy, you don’t actually have a “500 problem”—you have a “what changed, where, and who owns it” problem. The fastest way to get back to normal is to tie those errors to the specific release, suspect commits, and code paths that just shipped, then route that information to the right person automatically.

Quick Answer: Use Sentry to correlate your 500s with the latest release, see the exact endpoint and stack trace, trace the error across services, and jump straight to the suspect commit and owner—so you go from “we shipped and it broke” to “this line in this deploy did it” in minutes, not hours.


The Quick Overview

  • What It Is: A workflow in Sentry that connects 500 errors to the specific deploy, code changes, and owners responsible, using Error Monitoring, Tracing, Releases, Suspect Commits, and Session Replay.
  • Who It Is For: Engineering teams who ship frequently and need to debug post-deploy spikes in server errors across frontend, backend, and services.
  • Core Problem Solved: When 500s spike, teams waste time guessing which change caused them; Sentry links “what users saw” to “what changed” and “who should fix it.”

How It Works

At a high level, you wire your app with Sentry’s SDKs, configure releases and tracing, and then let Sentry do three things for you when a spike hits: group the 500s into issues, connect them to a specific release and suspect commits, and show you the full path from user request to failing code.

The flow looks like this:

  1. Instrument & Ship with Releases:

    • Add the Sentry SDK to your services (frontend and backend).
    • Configure releases and environments so every event (including 500s) knows which version of your code it came from.
    • Optionally connect your source control (GitHub/GitLab/Bitbucket) so Sentry knows which commits went out in that release.
  2. See the Spike & Identify the Breaking Change:

    • When 500s spike, Sentry groups them into an issue instead of flooding you with single events.
    • It links that issue to the exact release, shows “First seen / Last seen” timestamps, and points to suspect commits and code owners.
    • Traces and spans show you which endpoint, service, or database call is blowing up.
  3. Jump to Root Cause & Fix:

    • Open the issue, inspect the full stack trace (with local variables for supported languages), and use linked commits and ownership rules to assign it to the right engineer.
    • Use Session Replay (if enabled) and logs/profiling context to see what the user did right before the 500.
    • Fix, redeploy, and watch the issue auto-resolve when the new release stops producing the error.
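
As a rough sketch of step 1, here is how a Python service might assemble the options it passes to the SDK. `sentry_sdk.init` and its `dsn`, `release`, `environment`, and `traces_sample_rate` options come from the Python SDK; the environment variable names (`SERVICE_NAME`, `SERVICE_VERSION`, `DEPLOY_ENV`, `SENTRY_TRACES_SAMPLE_RATE`) and the 0.2 sample rate are assumptions made up for this example.

```python
def build_sentry_options(env):
    """Build keyword arguments for sentry_sdk.init from environment-style settings.

    The variable names read here are hypothetical; use whatever your pipeline exports.
    """
    return {
        "dsn": env.get("SENTRY_DSN") or None,
        # Release string in the "service@version" form used elsewhere in this article.
        "release": f"{env['SERVICE_NAME']}@{env['SERVICE_VERSION']}",
        "environment": env.get("DEPLOY_ENV", "production"),
        # Fraction of requests to trace; 0.2 is an arbitrary illustrative default.
        "traces_sample_rate": float(env.get("SENTRY_TRACES_SAMPLE_RATE", "0.2")),
    }


def init_sentry(options):
    """Initialize the SDK. Imported lazily so this file loads without sentry-sdk installed."""
    import sentry_sdk

    sentry_sdk.init(**options)


example_options = build_sentry_options(
    {"SERVICE_NAME": "my-service", "SERVICE_VERSION": "1.2.3"}
)
```

At startup you would call something like `init_sentry(build_sentry_options(os.environ))`. Setting `release` in one place, from the same version string your pipeline uses, is what lets every event be tied back to a deploy later.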

Features & Benefits Breakdown

| Core Feature | What It Does | Primary Benefit |
| Error Monitoring & Grouped Issues | Captures 500s as events, groups similar ones into issues, and shows trends (new, regressed, resolved). | You see one clear “this broke” issue instead of hundreds of noisy alerts. |
| Releases, Suspect Commits & Ownership Rules | Tags each event with release metadata, analyzes commit diffs, and applies Code Owners to route issues. | You know which deploy caused the spike and which engineer/team should fix it. |
| Tracing, Spans & Session Replay | Connects errors to transactions, spans, and optional replays for full request → code → user context. | You see exactly which endpoint, service, and user path led to the 500. |

Ideal Use Cases

  • Best for post-deploy 500 spikes in production: Because Sentry ties errors to releases and suspect commits, you can quickly answer “what changed right before things exploded?” without digging through deploy logs and dashboards.
  • Best for debugging 500s in distributed systems: Because traces span services, you can follow a failing request from frontend to backend to third-party calls and pinpoint the slow or failing component.

The Fastest Workflow When 500s Spike After a Ship

Let’s walk through what I actually recommend teams do in the “oh no” moment.

1. Confirm You’re Seeing a Real Spike, Not a One-Off

In Sentry:

  • Go to Issues and filter:
    • is:unresolved
    • level:error
    • http.status_code:500
    • environment:production
  • Sort by Events in the last hour or by New.
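
If you want this search ready in a runbook instead of typing it mid-incident, a small helper can assemble it. This is only a sketch: the filter keys are the ones listed above, and `build_issue_query` is a hypothetical helper, not part of any Sentry library.

```python
def build_issue_query(filters):
    """Join (key, value) filters into a space-separated search string."""
    return " ".join(f"{key}:{value}" for key, value in filters)


# The four filters from the list above, in the same order.
SPIKE_FILTERS = [
    ("is", "unresolved"),
    ("level", "error"),
    ("http.status_code", "500"),
    ("environment", "production"),
]

query = build_issue_query(SPIKE_FILTERS)
print(query)
```

Paste the resulting string into the Issues search bar, or swap `production` for another environment when you need to compare.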

If you’ve just shipped, you’ll likely see:

  • A new issue with a lot of events in a short window, or
  • An existing issue with a sudden spike.

This tells you: yes, it’s a real incident, not a random user doing something weird.

2. Anchor on the Release: What Changed?

Open the top 500 issue and look at:

  • Release / Version:
    • Sentry shows which release first saw this issue and which release it’s currently happening in.
  • First Seen / Last Seen:
    • If “First seen” is minutes after your deploy, you probably found your regression.
  • Release Health (if mobile / frontend):
    • Quickly see if crash-free sessions dropped for that release.

This quickly answers:

  • “Did this exist before, or is it new to this deploy?”
  • “Is it tied to a single version and environment, or does it show up in several?”

If you wire releases correctly, you can also:

  • Filter issues by release: release:my-service@1.2.3
  • See all new errors introduced by that deploy in one place.
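
The “first seen minutes after your deploy” check is easy to express in code. The sketch below is illustrative only: the 30-minute window is an assumption you should tune, and the timestamps are made-up sample values, not real incident data.

```python
from datetime import datetime, timedelta, timezone


def looks_like_regression(first_seen, deployed_at, window=timedelta(minutes=30)):
    """Return True if the issue first appeared within `window` after the deploy."""
    return deployed_at <= first_seen <= deployed_at + window


# Made-up sample timestamps for illustration.
deployed_at = datetime(2024, 5, 1, 14, 0, tzinfo=timezone.utc)
new_issue_first_seen = datetime(2024, 5, 1, 14, 4, tzinfo=timezone.utc)
old_issue_first_seen = datetime(2024, 4, 20, 9, 30, tzinfo=timezone.utc)

print(looks_like_regression(new_issue_first_seen, deployed_at))
print(looks_like_regression(old_issue_first_seen, deployed_at))
```

An issue that predates the deploy can still have been made worse by it, so treat a negative result as “look at the event graph,” not as an all-clear.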

3. Use Suspect Commits and Ownership to Find Who Should Fix It

In the issue details:

  • Check Suspect Commits:
    • Sentry compares the stack trace frames with your recent commits in that release.
    • It highlights the most likely commits (and authors) that introduced the failing code.
  • Check Ownership Rules / Code Owners:
    • Based on file paths and rules you define, Sentry assigns the issue to the right team or person.

This is where you cut out a ton of Slack back-and-forth:

  • No more “Who owns this endpoint?”
  • No more “Who touched this file last?”

You get “This file changed in this commit by this engineer,” right next to the stack trace.
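
To make the idea concrete, here is a deliberately simplified sketch of both lookups. It is not Sentry's actual algorithm or rule syntax: it just intersects the files in the in-app stack frames with the files each commit touched, then matches a path against glob-style ownership rules. All commit IDs, authors, paths, and team names are invented.

```python
from fnmatch import fnmatch


def suspect_commits(in_app_files, commits):
    """Return commits that touched any file appearing in the in-app stack frames."""
    frame_files = set(in_app_files)
    return [commit for commit in commits if frame_files & set(commit["files"])]


def owners_for(path, rules):
    """Return the owners from the first rule whose glob pattern matches the path."""
    for pattern, owners in rules:
        if fnmatch(path, pattern):
            return owners
    return []


# Invented sample data: commits in the release, ownership rules, and frame files.
commits = [
    {"id": "a1b2c3", "author": "dana", "files": ["billing/charge.py", "README.md"]},
    {"id": "d4e5f6", "author": "lee", "files": ["auth/tokens.py"]},
]
rules = [("billing/*", ["#payments-team"]), ("auth/*", ["#identity-team"])]
frames = ["api/routes.py", "billing/charge.py"]

suspects = suspect_commits(frames, commits)
owners = owners_for("billing/charge.py", rules)
print([commit["id"] for commit in suspects], owners)
```

The point of the sketch is the join itself: stack trace files on one side, release commits and ownership rules on the other. That join is what replaces the Slack thread.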

4. Trace the 500 Through Services

If you’ve enabled Tracing in your Sentry SDK:

  • Go from the error issue to the Related Transaction.
  • Open the transaction to see:
    • The full span timeline (HTTP request, internal service calls, DB queries).
    • Where the error span lives in the broader request.
    • Any slow spans that might be causing timeouts or cascading failures.

This is especially useful when:

  • The frontend logs a 500, but the real issue is in a downstream microservice.
  • You see a chain like: api-gateway → auth-service → billing-service and the error originates in billing-service.

You now know:

  • Which service actually broke.
  • Which endpoint and operation need a fix.
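
Reading a trace this way boils down to finding the deepest failing span. The sketch below shows that logic on a hand-written trace; the span fields (`span_id`, `parent_id`, `service`, `op`, `status`) are a simplified shape assumed for this example, and the data mirrors the api-gateway → auth-service → billing-service chain above.

```python
def find_origin_of_failure(spans):
    """Return a failing span that has no failing children (the deepest error)."""
    failing = [span for span in spans if span["status"] != "ok"]
    failing_parents = {span["parent_id"] for span in failing}
    leaves = [span for span in failing if span["span_id"] not in failing_parents]
    return leaves[0] if leaves else None


# Hand-written sample trace; every upstream span fails because its child failed.
trace = [
    {"span_id": "1", "parent_id": None, "service": "api-gateway",
     "op": "http.server", "status": "internal_error"},
    {"span_id": "2", "parent_id": "1", "service": "auth-service",
     "op": "http.client", "status": "internal_error"},
    {"span_id": "3", "parent_id": "2", "service": "billing-service",
     "op": "http.server", "status": "internal_error"},
    {"span_id": "4", "parent_id": "3", "service": "billing-service",
     "op": "db.query", "status": "internal_error"},
]

origin = find_origin_of_failure(trace)
print(origin["service"], origin["op"])
```

Every span above the origin reports an error too, which is exactly why the frontend's 500 is a poor guide to where the bug lives.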

5. Inspect the Stack Trace and Local Variables

Inside the error event:

  • Look at the stack trace:
    • Identify the top “in-app” frame that points to your code (not framework internals).
  • For supported languages (Python, for example), Sentry can show local variables in the frame:
    • Parameters and variables at the time of the exception.
    • Data you’d otherwise only see by reproducing in a debugger.

Also check:

  • Tags: HTTP method, URL, environment, app version, customer IDs, feature flags, etc.
  • Breadcrumbs: Log lines or preceding steps leading up to the 500.
  • User / Session data: Whether it affects specific accounts or all users.

This lets you answer “what were we actually passing in when it blew up?” without guessing.
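
If you ever script against exported event data, picking out that top in-app frame looks roughly like this. Treat it as a sketch: the frame fields (`filename`, `function`, `lineno`, `in_app`, `vars`) and the oldest-call-first ordering are assumptions about the payload shape, and the frames themselves are invented.

```python
def top_in_app_frame(frames):
    """Return the in-app frame closest to the crash, or None if there is none.

    Assumes frames are ordered oldest call first, like a Python traceback.
    """
    for frame in reversed(frames):
        if frame.get("in_app"):
            return frame
    return None


# Invented sample stack trace: two application frames, then a library frame.
frames = [
    {"filename": "app/views.py", "function": "create_order", "lineno": 42,
     "in_app": True, "vars": {"order_id": None}},
    {"filename": "app/billing.py", "function": "charge", "lineno": 17,
     "in_app": True, "vars": {"amount": -5, "currency": "USD"}},
    {"filename": "site-packages/requests/api.py", "function": "post",
     "lineno": 115, "in_app": False},
]

frame = top_in_app_frame(frames)
print(frame["filename"], frame["function"], frame["vars"])
```

Here the crash surfaces inside a library call, but the frame worth reading is `charge`, where the local variables show a negative amount being passed in.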

6. Watch a User Hit the 500 (with Session Replay)

If you have Session Replay enabled:

  • Open the replay linked from the error event.
  • Watch:
    • The user’s clicks and navigations.
    • Which page or action triggered the 500.
    • Any UI hints (like a missing field or weird state) you wouldn’t see in logs alone.

This is particularly useful when:

  • You suspect a weird edge case in the UI.
  • Product asks “what are users actually experiencing?” and you want to show, not tell.

7. Use Logs and Profiling for Deeper Diagnosis

If you’ve connected Logs and Profiling:

  • From the error or transaction, pivot to:
    • Logs: See log lines around the time of the error, scoped to that trace or user session.
    • Profiling: Identify CPU or memory hotspots that might be causing slowdowns leading to 500s (e.g., timeouts).

You move from “There was an exception” to “This specific code path is slow, then times out, then surfaces as a 500.”
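
Scoping logs to a trace is the part that saves the most scrolling. As a minimal sketch, assuming each log record carries a `trace_id` and a `time` (field names chosen for this example), the filter is just:

```python
from datetime import datetime, timedelta


def logs_around_error(logs, trace_id, error_time, window=timedelta(seconds=30)):
    """Return logs from the same trace within `window` of the error, oldest first."""
    nearby = [
        entry for entry in logs
        if entry["trace_id"] == trace_id
        and abs(entry["time"] - error_time) <= window
    ]
    return sorted(nearby, key=lambda entry: entry["time"])


# Invented sample log records.
error_time = datetime(2024, 5, 1, 14, 4, 10)
logs = [
    {"time": datetime(2024, 5, 1, 14, 4, 9), "trace_id": "abc", "message": "gateway timeout after 4s"},
    {"time": datetime(2024, 5, 1, 14, 4, 8), "trace_id": "xyz", "message": "unrelated request"},
    {"time": datetime(2024, 5, 1, 14, 4, 5), "trace_id": "abc", "message": "charge started"},
    {"time": datetime(2024, 5, 1, 14, 1, 0), "trace_id": "abc", "message": "session created"},
]

relevant = logs_around_error(logs, "abc", error_time)
print([entry["message"] for entry in relevant])
```

The 30-second window is arbitrary; widen it when you suspect a slow path that builds up to the timeout.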

8. Fix, Redeploy, and Verify Resolution

Once you’ve found the breaking change:

  1. Fix the code and ship a new release.
  2. Back in Sentry:
    • Watch the issue’s Events over time graph.
    • Confirm the 500s drop off after the new release.
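    • Mark the issue as resolved in the next release; if the error shows up again in a later release, Sentry flags it as a regression. You can also configure a project to auto-resolve issues that stop receiving events.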
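
The verification step amounts to counting the issue's events per release. Here is a small sketch of that check on invented data; the `release` field name and the version strings are assumptions for illustration.

```python
from collections import Counter


def events_by_release(events):
    """Count an issue's events per release string."""
    return Counter(event["release"] for event in events)


def fix_is_holding(events, fixed_release):
    """Return True if none of the issue's events came from the fixed release."""
    return events_by_release(events)[fixed_release] == 0


# Invented sample: all events for the issue came from the broken release.
events = [
    {"release": "my-service@1.2.3"},
    {"release": "my-service@1.2.3"},
    {"release": "my-service@1.2.3"},
]

counts = events_by_release(events)
print(counts["my-service@1.2.3"], fix_is_holding(events, "my-service@1.2.4"))
```

This only means something once the fixed release has taken real traffic, so give it a few minutes of production load before you call the incident closed.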

You now have a complete loop:

  • Spike detected → Cause identified → Owner notified → Fix shipped → Impact verified.

Limitations & Considerations

  • Initial Setup Required:
    • You need Sentry SDKs configured with releases and tracing before the spike to get the full workflow. Without that, you’ll still see errors, but you’ll lose commit/release context and cross-service traces.
  • Signal vs. Noise Depends on Rules:
    • If you don’t tune alert rules, sample rates, and Ownership Rules, you can end up with too many notifications or misrouted issues. Plan to iterate on these as you see real incident patterns.

Pricing & Plans

Sentry offers a free tier and paid plans that scale by event volume (errors, transactions, replays, etc.) and features.

  • Pricing is based on:
    • Quotas for errors, spans, replays, attachments, and monitors.
    • Optional pay-as-you-go budget for overages.
    • Volume discounts if you reserve usage ahead of time (“pay ahead, save money… when you use more, you pay less”).
    • Seer (AI debugging) is an add-on, priced per active contributor.

In general:

  • Developer / Team plans:
    • Best for small to mid-size teams needing strong error + performance visibility, dashboards (10–20+), and a direct connection from deploy to impact.
  • Business / Enterprise:
    • Best for larger organizations needing SAML + SCIM, detailed audit logs, governance, and support like a technical account manager and dedicated customer support.

For the exact plan matrix, go to Sentry’s pricing page—but the debugging workflow above works on standard paid plans and scales as you add more projects and services.


Frequently Asked Questions

How do I make sure I can always tie 500s to the exact deploy?

Short Answer: Configure Sentry releases in your CI/CD pipeline and include commit data.

Details:
In practice, you want your pipeline to:

  1. Create a Sentry release for each deploy (e.g., my-service@1.2.3).
  2. Associate commits with that release (via your Git provider integration).
  3. Optionally set deploys in Sentry so it knows when that version hit each environment.

Then every 500 captured after that deploy carries its release and environment, and Sentry can use the commits attached to that release to:

  • Show you “First seen in release X.”
  • Highlight suspect commits and authors.
  • Surface new issues introduced by that release without manual digging.
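
A pipeline step for those three actions might look like the sketch below, which wraps `sentry-cli` from Python. The command shapes follow the `sentry-cli releases` documentation as I understand it, so verify them against your installed CLI version; the version string and environment are examples, and the script defaults to a dry run that only prints the commands.

```python
import subprocess


def release_commands(version, environment):
    """Return the sentry-cli invocations for one deploy, in order."""
    return [
        # 1. Create the release.
        ["sentry-cli", "releases", "new", version],
        # 2. Associate commits using the connected repository integration.
        ["sentry-cli", "releases", "set-commits", version, "--auto"],
        # 3. Record that this version was deployed to the environment.
        ["sentry-cli", "releases", "deploys", version, "new", "-e", environment],
    ]


def run_release(version, environment, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    commands = release_commands(version, environment)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
    return commands


planned = run_release("my-service@1.2.3", "production")
```

When run for real, the CLI also needs credentials and a target org and project (typically supplied through environment variables such as `SENTRY_AUTH_TOKEN`, `SENTRY_ORG`, and `SENTRY_PROJECT`). Use the same version string here that your SDK sends as `release`, or the events and the commits will not line up.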

How can I avoid getting spammed by alerts every time 500s go up?

Short Answer: Use issue-based alerts, thresholds, and environments instead of alerting on every event.

Details:
To stay sane:

  • Configure issue alerts that trigger when:
    • A new issue appears in environment:production.
    • An issue’s event count exceeds a threshold in a time window (e.g., >50 events in 5 minutes).
  • Use environment filters so dev/staging noise doesn’t page you.
  • Use Ownership Rules so alerts go to the team that actually owns the broken code.

The result: you get one targeted alert for “Spike in 500s on POST /orders in production,” not 200 individual error pings.
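
To see why a threshold rule produces one alert instead of 200, here is a toy version of the logic. It is a sketch of the general sliding-window technique, not how Sentry implements alerting; the `SpikeDetector` class is hypothetical, and the defaults mirror the “>50 events in 5 minutes” example above.

```python
from collections import deque
from datetime import datetime, timedelta


class SpikeDetector:
    """Fire once when events in a sliding window first exceed a threshold."""

    def __init__(self, threshold=50, window=timedelta(minutes=5)):
        self.threshold = threshold
        self.window = window
        self.timestamps = deque()
        self.alerted = False

    def record(self, timestamp):
        """Record one event; return True only when the threshold is first crossed."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamp - self.timestamps[0] > self.window:
            self.timestamps.popleft()
        over = len(self.timestamps) > self.threshold
        fire = over and not self.alerted
        self.alerted = over  # re-arm once the rate drops back under the threshold
        return fire


# Simulate 200 errors arriving one second apart.
start = datetime(2024, 5, 1, 14, 0)
detector = SpikeDetector()
alerts = [detector.record(start + timedelta(seconds=i)) for i in range(200)]
print(alerts.count(True), alerts.index(True))
```

With 200 errors in a little over three minutes, the detector fires exactly once, on the 51st event, and stays quiet until the rate drops and spikes again.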


Summary

When you see a spike in 500s right after a ship, the fastest way to figure out what changed is to connect the error to the release, commit, and code path—not just stare at logs and dashboards. With Sentry wired into your stack, you:

  • See 500s grouped as issues instead of raw noise.
  • Tie those issues to specific releases, suspect commits, and code owners.
  • Follow traces and spans through services, with optional Session Replay, logs, and profiling to see exactly what the user and code did.
  • Fix, ship, and verify the impact without guessing.

Your code is broken. Let’s fix it with less detective work next time.

