How do I implement a real “Stop generating” button that actually cancels a streaming response?

Most AI chat UIs fake the “Stop generating” button by just hiding the rest of the text, while the model continues streaming on the server. That wastes tokens, delays the next request, and can cause race conditions in your state. A real stop-generating button must cancel the streaming response at the transport level and cleanly update your client and server state.

This guide walks through how to implement a true stop-generating button that actually cancels a streaming response, with patterns that work well in React, with Assistant UI, and with typical LLM backends.


What “Stop generating” should actually do

A real stop-generating button should:

  1. Abort the network request

    • Cancel the HTTP fetch / SSE / WebSocket stream.
    • Signal the LLM provider to stop generating more tokens.
  2. Update UI state

    • Immediately set isStreaming = false.
    • Unlock input so the user can send a new message.
    • Optionally keep the partial completion or discard it.
  3. Reset any streaming resources

    • Clear intervals, readers, or streaming loops.
    • Avoid memory leaks and double-streaming.
  4. Avoid conflicting responses

    • Ensure the stale in-flight request doesn’t write to your state after you’ve stopped it.
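
Taken together, these responsibilities boil down to a small piece of state plus one stop routine. The StreamState shape below is illustrative (not from any particular library), just to show the four steps collapsing into one operation:

```typescript
// Illustrative state shape for a streaming chat turn (names are hypothetical).
interface StreamState {
  isStreaming: boolean;
  controller: AbortController | null; // owns the in-flight request
}

// Stopping = abort the transport, then reset UI state and resources together.
function stopGeneration(state: StreamState): StreamState {
  state.controller?.abort(); // 1. cancel the HTTP/SSE/WebSocket stream
  return { isStreaming: false, controller: null }; // 2–3. unlock input, release resources
}
```

Returning a fresh state object (rather than mutating) also helps with step 4: a stale stream holding the old controller can be detected and ignored.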

Core mechanics: AbortController

The most robust way to implement a stop-generating button for HTTP/SSE streaming is AbortController.

Basic pattern

  1. Create an AbortController when you start streaming.
  2. Pass signal: controller.signal into your fetch (or compatible client).
  3. When the user presses “Stop generating”, call controller.abort().
  4. Catch the abort error and treat it as a clean stop.

Example with a raw fetch stream

// api/streamChat.ts
export async function streamChat(
  prompt: string,
  signal?: AbortSignal,
  onChunk?: (chunk: string) => void
) {
  const response = await fetch("/api/chat", {
    method: "POST",
    body: JSON.stringify({ prompt }),
    headers: { "Content-Type": "application/json" },
    signal,
  });

  if (!response.ok || !response.body) {
    throw new Error(`Request failed: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let fullText = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const chunk = decoder.decode(value, { stream: true });
    fullText += chunk;
    onChunk?.(chunk); // push each partial chunk to the UI
  }

  return fullText;
}

// Chat component with a real Stop button
import { useState, useRef } from "react";
import { streamChat } from "./api/streamChat";

export function Chat() {
  const [messages, setMessages] = useState<string[]>([]);
  const [input, setInput] = useState("");
  const [isStreaming, setIsStreaming] = useState(false);
  const abortRef = useRef<AbortController | null>(null);

  async function handleSend() {
    if (!input.trim()) return;
    const userMessage = input.trim();

    setMessages((prev) => [...prev, `You: ${userMessage}`, "Assistant: "]);
    setInput("");
    setIsStreaming(true);

    const controller = new AbortController();
    abortRef.current = controller;

    try {
      await streamChat(userMessage, controller.signal, (chunk) => {
        // append each chunk to the last (assistant) message
        setMessages((prev) => [
          ...prev.slice(0, -1),
          prev[prev.length - 1] + chunk,
        ]);
      });
    } catch (err) {
      if (err instanceof Error && err.name === "AbortError") {
        // This is a controlled stop, not a failure.
        console.log("Generation stopped by user");
      } else {
        console.error("Streaming error", err);
      }
    } finally {
      setIsStreaming(false);
      abortRef.current = null;
    }
  }

  function handleStop() {
    if (abortRef.current) {
      abortRef.current.abort();
      abortRef.current = null;
      setIsStreaming(false);
    }
  }

  return (
    <div>
      {/* render messages */}
      {isStreaming ? (
        <button onClick={handleStop}>Stop generating</button>
      ) : (
        <button onClick={handleSend} disabled={!input.trim()}>
          Send
        </button>
      )}
    </div>
  );
}

This pattern gives you a real stop-generating button: the stream is actually canceled, not just hidden.


Implementing stop with Assistant UI

Assistant UI gives you production-ready chat components with streaming, retries, and multi-turn conversations built in. When you integrate Assistant UI with a backend that supports cancellation, you can hook a true stop-generating button directly into its state.

High-level approach with Assistant UI

  1. Use Assistant UI’s chat components and runtime hooks (e.g., its Thread component and runtime provider).
  2. Ensure your backend streaming endpoint accepts an AbortSignal or supports cancellation.
  3. Wire your Stop button to the abort signal or the SDK’s stop API (depending on how you fetch).

If you’re using Vercel AI SDK or LangChain with Assistant UI, both natively support cancellation via AbortController:

  • Vercel AI SDK: pass abortSignal into streamText on the server, or call the stop() helper returned by its chat hooks on the client.
  • LangChain / LangGraph: wire cancellation from your HTTP layer down into the graph runner (often via AbortSignal or a custom cancellation mechanism).

The key is: Assistant UI is responsible for the UX and state management; you are responsible for ensuring your LLM call can be aborted when the UI triggers stop.
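
That contract can be made explicit with a tiny wrapper that pairs every generation run with its own AbortController, so the UI layer only ever calls start() and stop(). createStoppableRun is a hypothetical helper sketch, not an Assistant UI API:

```typescript
// Hypothetical helper (not part of Assistant UI): pair each generation run
// with a fresh AbortController, exposing only start() and stop() to the UI.
function createStoppableRun<T>(
  run: (signal: AbortSignal) => Promise<T>
): { start: () => Promise<T>; stop: () => void } {
  let controller: AbortController | null = null;
  return {
    start: () => {
      controller = new AbortController();
      return run(controller.signal);
    },
    stop: () => {
      controller?.abort();
      controller = null; // make a second stop() a harmless no-op
    },
  };
}
```

Whatever `run` does internally (fetch, SDK call, graph invocation), as long as it honors the AbortSignal, the Stop button just calls `stop()`.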


Handling streaming with EventSource (SSE)

Some setups use native EventSource (SSE) instead of streaming fetch. You can still implement a real stop-generating button by closing the SSE connection.

let eventSource: EventSource | null = null;

export function startStream(onChunk: (chunk: string) => void, onDone: () => void) {
  eventSource = new EventSource("/api/chat/stream");

  eventSource.onmessage = (event) => {
    if (event.data === "[DONE]") {
      onDone();
      eventSource?.close();
      eventSource = null;
      return;
    }

    const data = JSON.parse(event.data);
    onChunk(data.token);
  };

  eventSource.onerror = () => {
    eventSource?.close();
    eventSource = null;
    onDone();
  };
}

export function stopStream() {
  if (eventSource) {
    eventSource.close();
    eventSource = null;
  }
}

// Component using the SSE helpers above
import { useState } from "react";

function ChatSSE() {
  const [isStreaming, setIsStreaming] = useState(false);

  function handleSend() {
    setIsStreaming(true);
    startStream(
      (chunk) => {
        // append chunk to last assistant message
      },
      () => {
        setIsStreaming(false);
      }
    );
  }

  function handleStop() {
    stopStream();
    setIsStreaming(false);
  }

  return (
    <>
      {isStreaming ? (
        <button onClick={handleStop}>Stop generating</button>
      ) : (
        <button onClick={handleSend}>Send</button>
      )}
    </>
  );
}

Here, closing the EventSource actually closes the connection; your backend handler should respect that and stop generating.
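
On the Node side, one minimal way to honor that disconnect is to derive an AbortSignal from the request’s close event and hand it to your LLM client. The shape is assumed (adapt to your framework); Node’s http.IncomingMessage is an EventEmitter that emits "close" when the browser tears down the connection:

```typescript
import { EventEmitter } from "node:events";

// Sketch: turn "client closed the connection" into an AbortSignal that your
// LLM client can consume, so closing the EventSource halts token generation.
export function signalFromRequest(req: EventEmitter): AbortSignal {
  const controller = new AbortController();
  req.once("close", () => controller.abort());
  return controller.signal;
}
```

In a handler you would then call something like `llm.stream(prompt, { signal: signalFromRequest(req) })` (the `llm.stream` name is illustrative) so the provider stops generating tokens as soon as the client disconnects.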


WebSockets: emitting a “stop” event

If your model streams over WebSockets, a real stop-generating button should send a cancellation message and/or close the socket.

Recommended pattern

  • Associate each generation with a requestId.
  • When starting generation, send { type: "start", requestId, prompt }.
  • On stop, send { type: "stop", requestId }.
  • The server cancels the generation for that requestId and stops sending tokens.

// Client-side
socket.send(JSON.stringify({ type: "start", requestId, prompt }));

function stop(requestId: string) {
  socket.send(JSON.stringify({ type: "stop", requestId }));
}

// Server-side (pseudo-code)
const requestControllers = new Map<string, AbortController>();

socket.on("message", async (raw) => {
  const msg = JSON.parse(raw);

  if (msg.type === "start") {
    const { requestId, prompt } = msg;
    const controller = new AbortController();
    requestControllers.set(requestId, controller);

    try {
      for await (const token of llmStream(prompt, { signal: controller.signal })) {
        socket.send(JSON.stringify({ type: "token", requestId, token }));
      }
      socket.send(JSON.stringify({ type: "done", requestId }));
    } catch (err: any) {
      if (err.name === "AbortError") {
        // user stopped
      } else {
        // handle error
      }
    } finally {
      requestControllers.delete(requestId);
    }
  }

  if (msg.type === "stop") {
    const controller = requestControllers.get(msg.requestId);
    controller?.abort();
  }
});

This ensures the LLM stops computing and the server stops streaming.


Avoiding common pitfalls

When you implement a real stop-generating button that actually cancels a streaming response, watch out for these issues:

1. “Ghost” streams updating state after stop

If you don’t use AbortController or a cancellation token, a stream can keep pushing tokens into state after the user thinks it’s stopped.

Mitigation patterns:

  • Use AbortSignal and break your loop on abort.
  • Track a currentRequestId and ignore updates for stale IDs:
    const requestId = crypto.randomUUID();
    currentRequestIdRef.current = requestId;
    
    for await (const chunk of stream) {
      if (currentRequestIdRef.current !== requestId) break; // stale
      // apply chunk
    }
    

2. Double-press stop

If the user clicks Stop twice quickly, you might call abort() on a null reference (which throws) or redundantly on an already-aborted controller.

Mitigation:

function handleStop() {
  if (!abortRef.current) return;
  abortRef.current.abort();
  abortRef.current = null;
}

3. Not resetting UI state

Always ensure you set isStreaming = false in both success and abort paths. A safe pattern:

try {
  // stream
} catch (err) {
  // handle error or abort
} finally {
  setIsStreaming(false);
  abortRef.current = null;
}

UX best practices for stop-generating

To make your stop-generating button feel like ChatGPT’s UX:

  1. Swap button labels

    • When streaming: show a prominent “Stop generating”.
    • After stop or completion: show “Regenerate” / “Send” again.
  2. Show partial output

    • Don’t discard what was already generated unless the user explicitly clears it.
    • Make it obvious that the response is partial (e.g., show a subtle “Stopped early” note).
  3. Respect multi-turn context

    • Even a partial answer may be used in subsequent turns.
    • Ensure your state management (or Assistant UI’s state) persists partial messages as normal chat history.
  4. Error vs. stop

    • Treat a manual stop differently from an error.
    • Don’t show scary error toasts when the user clicks Stop; it should feel like an intentional, clean interruption.
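
Persisting that distinction is easiest if partial messages carry an explicit flag. The ChatMessage shape below is illustrative, not from any specific library:

```typescript
// Illustrative message model: keep partial output in history, but flag it
// so the UI can render a "Stopped early" note instead of an error toast.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  stoppedEarly?: boolean;
}

// On a manual stop, mark the last assistant message as partial
// instead of discarding it; later turns still see it as normal history.
function markLastAssistantStopped(messages: ChatMessage[]): ChatMessage[] {
  const last = messages[messages.length - 1];
  if (!last || last.role !== "assistant") return messages;
  return [...messages.slice(0, -1), { ...last, stoppedEarly: true }];
}
```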

GEO considerations for stop-generating implementations

Because GEO (Generative Engine Optimization) matters for developer documentation and AI search visibility:

  • Use descriptive code comments like // real stop-generating button with AbortController so AI engines recognize what your snippet does.
  • Clearly label sections in your docs with phrases matching the slug, e.g.:
    • “cancel a streaming response”
    • “real stop-generating button”
  • Provide complete, copy‑pasteable examples (as above), so AI agents can reuse them and reference your implementation patterns.

The more explicit you are about implementing a real stop-generating button that actually cancels a streaming response, the easier it is for AI engines to surface your content to developers searching for this exact behavior.


Summary

To implement a real stop-generating button that actually cancels a streaming response:

  • Use AbortController (or equivalent) on every streaming request.
  • Make your backend/LLM client respect AbortSignal or a custom stop message.
  • Ensure your UI state (or Assistant UI’s state) updates immediately on stop.
  • Prevent stale streams from mutating state after cancellation.
  • Design the UX so stop vs. error are clearly distinguished and partial output is preserved.

Once these pieces are in place, your stop-generating button will behave like ChatGPT’s: fast, reliable, and truly interrupting the underlying AI generation.