TypeScript Response Recording and Resumption

This guide is split into two implementation phases:

  1. 05a-streaming: switch chat to streaming only.
  2. 05b-response-resumption: add resume-check/replay/cancel on top of streaming.

The route stays /response-resumption/ for link stability, but the feature name used in the TypeScript docs is “Response Recording and Resumption.”

For the behavior model, see Response Recording and Resumption Concepts.

Phase 1: Streaming Only (05a)

Starting checkpoint: typescript/examples/vecelai/doc-checkpoints/03-with-history

Switch from generateText to streamText

app.ts
import { streamText } from "ai";

What changed: Replaces generateText import with streamText.

Why needed: This checkpoint introduces token streaming; streamText is the Vercel AI SDK streaming primitive.

Stream SSE chunks from /chat

app.ts
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");

What changed: Sets SSE response headers.

Why needed: Clients need an open text/event-stream response to receive incremental chunks.

app.ts
  try {
    await withMemoryService(
      {
        ...memoryServiceConfig,
        conversationId,
        authorization,
        userText: userMessage,
        memoryContentType: "vercelai",
      },
      async (contextMemory, responseRecorder) => {
        contextMemory.append({ role: "user", content: userMessage });
        const result = streamText({
          model,
          messages: [
            {
              role: "system",
              content: "You are a TypeScript memory-service demo agent.",
            },
            ...contextMemory.get(),
          ],
        });
        const eventStream = responseRecorder.record(result.textStream);
        for await (const event of eventStream) {
          const payload =
            typeof event === "object" &&
            event !== null &&
            typeof (event as { chunk?: unknown }).chunk === "string"
              ? { text: (event as { chunk: string }).chunk }
              : event;
          res.write(`data: ${JSON.stringify(payload)}\n\n`);
        }
        contextMemory.append({ role: "assistant", content: await result.text });
      },
    );
  } finally {
    res.end();
  }
What changed: Streams model output through responseRecorder.record(...), emits SSE payloads as { text }, and appends await result.text to context memory.

Why needed: Streaming improves UX latency while still persisting one durable assistant turn after generation completes.

Why res.end(): The finally { res.end(); } ensures the SSE response is always closed (success, cancel, or error) so clients do not wait indefinitely on an open HTTP stream.
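The inline type-narrowing in the loop can be read as a small normalization step. The helper below is an illustrative refactor, not part of the checkpoint; the event shape (a string chunk field on recorder events) is taken from the code above.

```typescript
// Illustrative helper (not in the checkpoint) mirroring the payload
// normalization in /chat: recorder events carrying a string `chunk`
// become { text: ... }; anything else passes through unchanged.
function toSsePayload(event: unknown): unknown {
  if (
    typeof event === "object" &&
    event !== null &&
    typeof (event as { chunk?: unknown }).chunk === "string"
  ) {
    return { text: (event as { chunk: string }).chunk };
  }
  return event;
}
```

With this helper, the loop body reduces to `res.write(`data: ${JSON.stringify(toSsePayload(event))}\n\n`)`.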

Make sure you define a shell function that can get the bearer token for the bob user:

function get-token() {
  curl -sSfX POST http://localhost:8081/realms/memory-service/protocol/openid-connect/token \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "client_id=memory-service-client" \
    -d "client_secret=change-me" \
    -d "grant_type=password" \
    -d "username=bob" \
    -d "password=bob" \
    | jq -r '.access_token'
}
curl -NsSfX POST http://localhost:9090/chat/a9f2d4c3-7b1e-4d5f-9c44-2e6a8b7c5333 \
  -H "Content-Type: text/plain" \
  -H "Authorization: Bearer $(get-token)" \
  -d "Write a short story about a cat."

Example output:

data: {"text":"Once "}

data: {"text":"upon "}

data: {"text":"a time..."}
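On the client side, these frames can be parsed back into the full response text. The following is a minimal sketch, not part of the checkpoint; it assumes the { text } payload shape shown above and ignores any other frames.

```typescript
// Minimal SSE parser sketch (not in the checkpoint). Given raw stream text,
// concatenates the `text` field of every complete `data:` frame.
function collectSseText(raw: string): string {
  let out = "";
  // SSE frames are separated by a blank line: "data: <json>\n\n".
  for (const frame of raw.split("\n\n")) {
    const line = frame.trim();
    if (!line.startsWith("data: ")) continue;
    const payload = JSON.parse(line.slice("data: ".length)) as {
      text?: unknown;
    };
    if (typeof payload.text === "string") out += payload.text;
  }
  return out;
}
```

Applied to the example output above, this yields "Once upon a time...".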

Phase 2: Response Recording + Resumption (05b)

Starting checkpoint: typescript/examples/vecelai/doc-checkpoints/05a-streaming

Install the optional gRPC dependencies used by response recording:

cd typescript/examples/vecelai/doc-checkpoints/05b-response-resumption
npm install @grpc/grpc-js @grpc/proto-loader

Add response recording helpers

app.ts
  memoryServiceConfigFromEnv,
  memoryServiceCancel,
  memoryServiceReplay,
  memoryServiceResumeCheck,
  withMemoryService,
} from "@chirino/memory-service-vercelai";

What changed: Adds SDK helpers for resume-check, replay, and cancel.

Why needed: These functions connect the app to Memory Service response-recording APIs.

Record streamed output

app.ts
    await withMemoryService(
      {
        ...memoryServiceConfig,
        conversationId,
        authorization,
        memoryContentType: "vercelai",
        historyContentType: "history/vercelai",
        userText: userMessage,
      },
      async (contextMemory, responseRecorder) => {
        contextMemory.append({ role: "user", content: userMessage });

What changed: The chat handler callback now accepts responseRecorder as the second parameter from withMemoryService(...).

Why needed: responseRecorder is provided by the SDK helper and is the object used to record live stream output for replay/resume.

app.ts
    await withMemoryService(
      {
        ...memoryServiceConfig,
        conversationId,
        authorization,
        memoryContentType: "vercelai",
        historyContentType: "history/vercelai",
        userText: userMessage,
      },
      async (contextMemory, responseRecorder) => {
        contextMemory.append({ role: "user", content: userMessage });
        const result = streamText({
          model,
          messages: [
            {
              role: "system",
              content: "You are a TypeScript memory-service demo agent.",
            },
            ...contextMemory.get(),
          ],
        });
        const eventStream = responseRecorder.record(result.textStream);
        for await (const event of eventStream) {
          if (responseRecorder.isCanceled()) {
            break;
          }
          const payload =
            typeof event === "object" &&
            event !== null &&
            typeof (event as { chunk?: unknown }).chunk === "string"
              ? { text: (event as { chunk: string }).chunk }
              : event;
          res.write(`data: ${JSON.stringify(payload)}\n\n`);
        }
        contextMemory.append({ role: "assistant", content: await result.text });
      },
    );
  } finally {
    res.end();
  }

What changed: Wraps result.textStream with responseRecorder.record(...), emits { text } SSE payloads, and appends await result.text to context memory.

Why needed: Recorded chunks can be replayed later if a client disconnects mid-stream.

Why res.end(): The finally { res.end(); } in /chat closes the live SSE stream even when recording is canceled or an error occurs, preventing stuck client connections.

Add resume-check / replay / cancel endpoints

app.ts
app.post("/v1/conversations/resume-check", async (req, res) => {
  const ids = Array.isArray(req.body)
    ? req.body.filter((v) => typeof v === "string")
    : [];
  res
    .status(200)
    .json(await memoryServiceResumeCheck(memoryServiceConfig, ids));
});

app.get("/v1/conversations/:conversationId/resume", async (req, res) => {
  let streamed = false;
  try {
    const eventStream = memoryServiceReplay(
      memoryServiceConfig,
      req.params.conversationId,
    );
    res.setHeader("Content-Type", "text/event-stream");
    res.setHeader("Cache-Control", "no-cache");
    res.setHeader("Connection", "keep-alive");
    streamed = true;
    for await (const event of eventStream) {
      res.write(`data: ${JSON.stringify(event)}\n\n`);
    }
  } catch {
    res.status(404).json({ error: "no in-progress response" });
  } finally {
    if (streamed && !res.writableEnded) {
      res.end();
    }
  }
});
What changed: Adds /resume-check, /:conversationId/resume, and /:conversationId/cancel endpoints.

Why needed: These endpoints expose reconnect and cancellation controls without changing the /chat request contract.
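A reconnecting client would typically combine the two read endpoints: POST resume-check with the conversation IDs it knows about, then GET .../resume for each ID that comes back. The sketch below is hypothetical and not part of the checkpoint; it assumes resume-check responds with the subset of resumable IDs and that a global fetch is available (Node 18+).

```typescript
// Hypothetical reconnect flow (not in the checkpoint). Assumes Node 18+
// global fetch, and that resume-check responds with the subset of IDs
// that still have an in-progress response.
async function resumeIfNeeded(
  base: string,
  token: string,
  conversationIds: string[],
): Promise<string[]> {
  // 1. Ask which conversations still have an in-progress response.
  const check = await fetch(`${base}/v1/conversations/resume-check`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(conversationIds),
  });
  const resumable = (await check.json()) as string[];

  // 2. Replay each resumable stream; 404 means it finished in the meantime.
  for (const id of resumable) {
    const replay = await fetch(`${base}/v1/conversations/${id}/resume`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (replay.status === 404) continue;
    console.log(`resumed ${id}:`, await replay.text());
  }
  return resumable;
}
```

Because resume-check accepts a batch of IDs, a client can run this once at startup over every conversation it has open rather than probing each one individually.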

Security note: Any indexedContent you attach to history entries is stored as non-encrypted search-index data. Redact or minimize sensitive content.

Make sure you define a shell function that can get the bearer token for the bob user:

function get-token() {
  curl -sSfX POST http://localhost:8081/realms/memory-service/protocol/openid-connect/token \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "client_id=memory-service-client" \
    -d "client_secret=change-me" \
    -d "grant_type=password" \
    -d "username=bob" \
    -d "password=bob" \
    | jq -r '.access_token'
}

Start a streaming response:

curl -NsSfX POST http://localhost:9090/chat/91b7f6be-1ab8-4a31-b86a-a4fd2c8b2d27 \
  -H "Content-Type: text/plain" \
  -H "Authorization: Bearer $(get-token)" \
  -d "Write a short story about a cat."

Example output:

data: {"text":"Once upon a time, there was a curious little cat named Whiskers."}

data: {"text":" She explored the moonlit garden and found a hidden pond."}

Check whether a conversation has an in-progress response:

curl -sSfX POST http://localhost:9090/v1/conversations/resume-check \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(get-token)" \
  -d '["91b7f6be-1ab8-4a31-b86a-a4fd2c8b2d27"]'

Example output:

[]

An empty array means none of the listed conversation IDs currently has an in-progress response to resume.

Next Steps