How can I trace a single request across multiple services to find where latency is coming from?

Most teams can tell that something is slow. The hard part is pinpointing where the latency hides once a single request hops across a frontend, a gateway, and a pile of backend services. That’s the gap distributed tracing is built to close, and it’s the workflow Sentry is designed to make practical for developers.

Quick Answer: Use distributed tracing with shared trace IDs so each hop of a request is captured as spans in a single trace. In Sentry, that lets you follow a request from frontend to backend, see which service or query added latency, and then drill down to the exact code, commit, and owner.

The Quick Overview

What It Is: A distributed tracing workflow in Sentry that lets you follow one request across multiple services using a single trace ID, visualized as spans on a timeline so you can see where latency comes from.
Who It Is For: Developers, SREs, and engineering managers responsible for debugging slow requests in microservices, APIs, and multi-tier web apps.
Core Problem Solved: You know requests are slow, but you can’t see which service, query, or external call is responsible—or how that maps back to actual code and ownership.

How It Works

At a high level, you instrument each service with the Sentry SDK, configure it to start transactions at request boundaries, and propagate trace headers on every call to a downstream service. Sentry stitches these events into a single trace: a tree of spans that shows where time is spent.

When you open that trace in Sentry, you’ll see:

The root transaction (for example, GET /checkout in your frontend).
Spans for each downstream operation (backend endpoints, DB queries, third-party APIs).
The critical path: the chain of blocking spans that determines end-to-end response time.
Connected context like errors, logs, and Suspect Commits tied to the same trace.
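
To make the critical path idea concrete, here is a minimal sketch in Python. The `Span` class, its field names, and the sample timings are hypothetical and are not Sentry's data model. The heuristic (at each level, follow the child that finishes last) is a common simplification that assumes a parent cannot finish before its last child returns.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical span record; not Sentry's actual schema."""
    name: str
    start_ms: float
    end_ms: float
    children: list["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def critical_path(root: Span) -> list[Span]:
    """Walk down the tree, always following the child that finishes last."""
    path = [root]
    node = root
    while node.children:
        node = max(node.children, key=lambda s: s.end_ms)
        path.append(node)
    return path


# Hypothetical trace: two backend calls run in parallel under one request.
trace = Span("GET /checkout", 0, 900, [
    Span("GET /cart", 20, 220),
    Span("POST /charge-card", 30, 880, [
        Span("db.query SELECT orders", 40, 140),
        Span("http.client payment-provider", 150, 860),
    ]),
])

path_names = [s.name for s in critical_path(trace)]
for span in critical_path(trace):
    print(f"{span.name}: {span.duration_ms:.0f} ms")
```

In this made-up trace, `GET /cart` runs in parallel and finishes early, so speeding it up would not change the response time. The payment provider call is the span worth investigating.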

Here’s how that breaks down in practice.

Instrument each service with Sentry SDKs
- Install the Sentry SDK in each component: frontend, gateway, and every backend service.
- Enable tracing in the SDK config so incoming requests become transactions (top-level spans).
- Configure the SDK to capture spans for core operations: HTTP handlers, DB calls, message queue handlers, and external API calls.
Propagate trace context across services
- When Service A calls Service B, include trace headers (for example, sentry-trace and baggage) so the downstream service joins the same trace instead of starting a new one.
- The result is a single distributed trace where each service contributes spans to the same timeline.
- This is what lets you answer, “Did the slowness come from the frontend, the API gateway, or that one reporting service everyone forgot about?”
Debug latency in Sentry’s trace view
- Sort transactions by duration or p75 to find the slowest ones, then open a slow example to see its trace.
- Use the waterfall view to see:
  - How long each service took.
  - Which spans are blocking vs. running in parallel.
  - Where downstream calls stacked up.
- Click into a slow span to jump to:
  - The exact line of code.
  - Related errors and logs.
  - Suspect Commits and owners for faster routing.
- If you’ve enabled Session Replay or Profiling, you can also see what the user did right before the slowdown and which functions are burning CPU.
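
The propagation step can be illustrated without the SDK. The `sentry-trace` header value carries a trace ID, the caller's span ID, and an optional sampling flag, joined by hyphens. The sketch below uses only the Python standard library to build that value for an outgoing call and parse it downstream. The function and class names are hypothetical; in practice the Sentry SDK's integrations usually generate and read these headers for you, and the `baggage` header is forwarded alongside (omitted here for brevity).

```python
import re
import uuid
from typing import NamedTuple, Optional


class TraceContext(NamedTuple):
    trace_id: str            # 32 hex characters, shared by every hop
    parent_span_id: str      # 16 hex characters, the caller's span
    sampled: Optional[bool]  # None leaves the decision to the receiver


_SENTRY_TRACE = re.compile(r"^([0-9a-f]{32})-([0-9a-f]{16})(?:-([01]))?$")


def new_span_id() -> str:
    return uuid.uuid4().hex[:16]


def build_sentry_trace(trace_id: str, span_id: str, sampled: Optional[bool]) -> str:
    """Format the header value sent with an outgoing request."""
    value = f"{trace_id}-{span_id}"
    if sampled is not None:
        value += "-1" if sampled else "-0"
    return value


def parse_sentry_trace(value: str) -> Optional[TraceContext]:
    """Parse an incoming header value; return None if it is malformed."""
    match = _SENTRY_TRACE.match(value.strip().lower())
    if match is None:
        return None
    trace_id, span_id, flag = match.groups()
    sampled = None if flag is None else flag == "1"
    return TraceContext(trace_id, span_id, sampled)


# Service A starts a trace and calls Service B.
trace_id = uuid.uuid4().hex
span_a = new_span_id()
outgoing_headers = {"sentry-trace": build_sentry_trace(trace_id, span_a, True)}

# Service B parses the header and continues the same trace with its own span.
incoming = parse_sentry_trace(outgoing_headers["sentry-trace"])
span_b = new_span_id()
print(incoming.trace_id == trace_id, incoming.parent_span_id == span_a)
```

If Service B receives no header, or one that fails to parse, it has nothing to join and would start a fresh trace. That is how traces fragment when one hop drops the headers.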

Features & Benefits Breakdown

Core Feature	What It Does	Primary Benefit
Distributed Tracing (Spans)	Captures each operation in a request (HTTP calls, DB queries, background jobs) as spans within a single trace.	Lets you see exactly where a request slows down across services, not just “the request is slow.”
Trace-Aware Issue & Commit Data	Connects traces to errors, Suspect Commits, and Ownership Rules, so slow spans are tied to real code and real teams.	Turns “this span is slow” into “this team and commit likely caused it,” compressing triage time.
Session Replay & Profiling	Replay user sessions and profile code paths tied to the same trace.	Lets you see both what the user experienced and which functions and modules consumed time.
Flexible Quotas for Spans	Lets you configure span volume (spans represent operations within a trace) so you can trace critical paths without blowing up your budget.	Control cost while still capturing enough detail to debug real-world latency issues.
Dashboards & Insights	Build dashboards around slowest transactions and services; use Discover/Insights queries to track regressions over time.	Move from one-off firefighting to ongoing performance monitoring and trend tracking.

Ideal Use Cases

Best for debugging a specific slow request path: Because it lets you follow a single request from frontend to backend (and back) with one trace ID, exposing the slow hop instead of guessing based on logs.
Best for monitoring critical endpoints over time: Because you can chart p75/p95 latency for specific transactions, set alerts when they regress, and drill into the trace to see what changed in the last deploy.

Limitations & Considerations

Requires consistent instrumentation: Each service must have the Sentry SDK installed with tracing enabled and must propagate trace headers. Without that, traces fragment and you lose the “single request across services” view. Start with your most critical paths (checkout, login, search) and expand coverage from there.
Span volume and sampling strategies: Capturing every span for every request in a large system can get expensive. Use sampling (for example, keep all errors and a subset of healthy traffic) and focus span collection on operations that matter: high-value endpoints, expensive queries, or known hotspots.
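
One way to express such a sampling policy is a sampler function. The Sentry Python SDK accepts a `traces_sampler` callable in `sentry_sdk.init`; the endpoint names and rates below are hypothetical, and the shape of the sampling context is an assumption you should check against the SDK documentation for your version. The sketch keeps the function pure so it runs without the SDK installed.

```python
from typing import Any, Optional

# Hypothetical per-endpoint sample rates; tune these to your own traffic.
CRITICAL_RATES = {
    "POST /checkout": 1.0,
    "POST /login": 0.5,
    "GET /search": 0.25,
}
DEFAULT_RATE = 0.05
DROPPED = {"GET /healthz"}


def traces_sampler(sampling_context: dict[str, Any]) -> float:
    """Return the probability (0.0 to 1.0) of tracing this transaction."""
    # Honor the upstream decision so a distributed trace stays complete.
    parent_sampled: Optional[bool] = sampling_context.get("parent_sampled")
    if parent_sampled is not None:
        return 1.0 if parent_sampled else 0.0

    name = sampling_context.get("transaction_context", {}).get("name", "")
    if name in DROPPED:
        return 0.0
    return CRITICAL_RATES.get(name, DEFAULT_RATE)


# Assumed wiring, which requires the sentry-sdk package:
#   import sentry_sdk
#   sentry_sdk.init(dsn="<your DSN>", traces_sampler=traces_sampler)
```

Following the parent's decision first matters in a multi-service setup: if each service sampled independently, many traces would be missing hops.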

Pricing & Plans

Sentry pricing is built around event volume (errors, transactions/spans, replays, logs) with quotas you can tune and optional pay-as-you-go overages. You control how much tracing data you send and where it’s most valuable (usually on critical user paths).

Typical shape:

Choose a base quota for transactions/spans to support distributed tracing on your most important services.
Add Session Replay and Profiling volume if you want to see user behavior and CPU hotspots tied to the same trace.
Use reserved volume to get discounts when you know your baseline, and the pay-as-you-go pool to handle spikes without turning off visibility.

Examples (not exhaustive; check the pricing page for exact tiers and current numbers):

Developer / Team tiers: Best for smaller teams or individual service owners who need code-level visibility into errors and slow requests with 10–20 dashboards and controlled tracing volume.
Business+ / Enterprise: Best for organizations running many services with governance requirements (SAML + SCIM, audit logs) and higher or more predictable tracing load, plus options like a technical account manager.

Since pricing is usage-based, the practical approach is:

Start tracing a few high-impact endpoints.
Watch how many transactions and spans you generate.
Adjust your quotas and sampling rate to balance coverage and cost.
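
The adjustment in the last step is simple arithmetic: expected spans are roughly requests times sample rate times spans per trace. The sketch below solves that for the sample rate. All numbers are hypothetical and are not Sentry quota tiers or prices.

```python
def max_sample_rate(monthly_requests: int, spans_per_trace: float, span_quota: int) -> float:
    """Largest sample rate whose expected span volume fits the quota."""
    if monthly_requests <= 0 or spans_per_trace <= 0:
        raise ValueError("traffic and spans per trace must be positive")
    rate = span_quota / (monthly_requests * spans_per_trace)
    return min(1.0, rate)


# Hypothetical workload: 20M requests per month, about 25 spans per trace,
# and a quota of 50M spans per month.
rate = max_sample_rate(20_000_000, 25, 50_000_000)
print(f"sample rate: {rate:.2f}")
```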

Frequently Asked Questions

How do I actually trace a single request from the frontend through my backend services?

Short Answer: Instrument every service with the Sentry SDK, enable tracing, and propagate Sentry’s trace headers between services. Sentry will automatically stitch those spans into one trace.

Details:
In a multi-service environment, each incoming request should start a transaction. When that service calls another service, it forwards the trace context (for example, sentry-trace and baggage headers). The downstream service uses that context to join the same trace instead of generating a disconnected one.

In Sentry, you then:

Open the trace from any transaction (frontend or backend).
Use the waterfall to see each service’s spans, ordered chronologically.
Click the slowest spans (for example, POST /charge-card in a payment service) to inspect:
- Latency breakdown.
- DB or external API spans underneath.
- Related errors, logs, and commits.

This gives you a single “story” for that request—from user click to last byte.
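
The stitching described above can be pictured as grouping span records by trace ID and linking each one to its parent. The following sketch does that for a handful of hypothetical records from three services. The field names are illustrative and are not Sentry's event schema.

```python
from collections import defaultdict

# Hypothetical span records as three services might report them.
records = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service": "frontend",
     "op": "GET /checkout", "start_ms": 0, "end_ms": 900},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service": "gateway",
     "op": "POST /api/checkout", "start_ms": 15, "end_ms": 890},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service": "payments",
     "op": "POST /charge-card", "start_ms": 30, "end_ms": 880},
    {"trace_id": "t2", "span_id": "x", "parent_id": None, "service": "frontend",
     "op": "GET /search", "start_ms": 0, "end_ms": 120},
]


def waterfall(records: list[dict], trace_id: str) -> list[str]:
    """Return indented lines for one trace, siblings ordered by start time."""
    children = defaultdict(list)
    for rec in records:
        if rec["trace_id"] == trace_id:
            children[rec["parent_id"]].append(rec)

    lines: list[str] = []

    def walk(parent_id, depth):
        for rec in sorted(children[parent_id], key=lambda r: r["start_ms"]):
            duration = rec["end_ms"] - rec["start_ms"]
            lines.append(f"{'  ' * depth}{rec['service']}: {rec['op']} ({duration} ms)")
            walk(rec["span_id"], depth + 1)

    walk(None, 0)
    return lines


print("\n".join(waterfall(records, "t1")))
```

The grouping only works because every record carries the same trace ID, which is why consistent header propagation between services matters.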

How do I find out which service is causing the latency once traces are set up?

Short Answer: Use Sentry’s transaction views and Discover/Insights to sort by duration, then drill into traces to see which spans and services dominate the critical path.

Details:
Once tracing is enabled, you’ll have transactions for endpoints like GET /search and POST /checkout. To find bottlenecks:

Sort transactions by p75 or p95 duration to surface typical slowness rather than one-off outliers.
Open a representative slow transaction and inspect the span waterfall:
- Look for spans that are long and on the critical path (blocking the response).
- Note which service and operation they belong to (for example, inventory-service: GET /stock-levels).
Check connected data:
- Errors attached to the same spans (the call may be retrying or timing out).
- Profiling data to see whether CPU-heavy functions are involved.
- Suspect Commits to see if a specific change added extra queries or complexity.

From there, you route the issue using Ownership Rules so the right team gets a Sentry issue with the relevant trace and code context attached.
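
To see why the ranking step uses p75 or p95 rather than a single slow sample, here is a small sketch that computes both percentiles from raw durations using Python's standard library. The endpoint names match the examples above, but the durations are made up, and Sentry computes these aggregates for you.

```python
from statistics import quantiles

# Hypothetical transaction durations in milliseconds.
durations = {
    "GET /search": [80, 85, 90, 95, 100, 110, 120, 130, 400, 95],
    "POST /checkout": [300, 320, 340, 900, 950, 310, 330, 980, 350, 920],
}


def percentile(values: list[float], p: int) -> float:
    """Return the p-th percentile (1 to 99) using inclusive interpolation."""
    return quantiles(values, n=100, method="inclusive")[p - 1]


summary = {
    name: {"p75": percentile(v, 75), "p95": percentile(v, 95)}
    for name, v in durations.items()
}

# Rank endpoints by p75 so one slow outlier does not dominate the ordering.
ranked = sorted(summary, key=lambda name: summary[name]["p75"], reverse=True)
for name in ranked:
    print(f"{name}: p75={summary[name]['p75']:.0f} ms, p95={summary[name]['p95']:.0f} ms")
```

In this sample, `GET /search` has one 400 ms outlier that barely moves its p75 but lifts its p95, while `POST /checkout` is slow for a large share of requests. That second pattern is the one to open a trace for first.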

Summary

Tracing a single request across multiple services to find where latency is coming from isn’t magic—it’s a combination of:

Instrumenting each service with Sentry’s SDK and tracing enabled.
Propagating trace context so all hops share a single trace.
Using Sentry’s trace view, spans, and connected context (errors, commits, profiling, Session Replay) to see where time is actually spent and who owns that code.

Once this is in place, slow requests stop being vague “something in the backend is slow” complaints and become concrete, fixable issues: “p95 checkout latency regressed by 200 ms due to this DB query in this service from this commit.”
