TypeScript Response Recording and Resumption
This guide is split into two implementation phases:
The route stays /response-resumption/ for link stability, but the feature name used in the TypeScript docs is “Response Recording and Resumption.”
- 05a-streaming: switch chat to streaming only.
- 05b-response-resumption: add resume-check/replay/cancel on top of streaming.
For the behavior model, see Response Recording and Resumption Concepts.
Phase 1: Streaming Only (05a)
Starting checkpoint: typescript/examples/vecelai/doc-checkpoints/03-with-history
Switch from generateText to streamText
import { streamText } from "ai";

What changed: Replaces the generateText import with streamText.
Why needed: This checkpoint introduces token streaming; streamText is the Vercel AI SDK streaming primitive.
Stream SSE chunks from /chat
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");

What changed: Sets SSE response headers.
Why needed: Clients need an open text/event-stream response to receive incremental chunks.
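On the wire, each chunk the handler emits uses standard Server-Sent Events framing: a `data:` line followed by a blank line. A minimal sketch of that framing as a pure helper (the `sseFrame` name is ours, not part of the SDK):

```typescript
// Hypothetical helper showing the SSE frame format the /chat handler writes:
// one "data: <json>" line per event, terminated by a blank line.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

console.log(JSON.stringify(sseFrame({ text: "Once " })));
// "data: {\"text\":\"Once \"}\n\n"
```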
try {
await withMemoryService(
{
...memoryServiceConfig,
conversationId,
authorization,
userText: userMessage,
memoryContentType: "vercelai",
},
async (contextMemory, responseRecorder) => {
contextMemory.append({ role: "user", content: userMessage });
const result = streamText({
model,
messages: [
{
role: "system",
content: "You are a TypeScript memory-service demo agent.",
},
...contextMemory.get(),
],
});
const eventStream = responseRecorder.record(result.textStream);
for await (const event of eventStream) {
const payload =
typeof event === "object" &&
event !== null &&
typeof (event as { chunk?: unknown }).chunk === "string"
? { text: (event as { chunk: string }).chunk }
: event;
res.write(`data: ${JSON.stringify(payload)}\n\n`);
}
contextMemory.append({ role: "assistant", content: await result.text });
},
);
} finally {
res.end();
}

What changed: Streams model output through responseRecorder.record(...), emits SSE payloads as { text }, and appends await result.text to context memory.
Why needed: Streaming improves UX latency while still persisting one durable assistant turn after generation completes.
Why res.end(): The finally { res.end(); } ensures the SSE response is always closed (success, cancel, or error) so clients do not wait indefinitely on an open HTTP stream.
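The payload mapping inside the loop above is self-contained enough to factor out and test on its own; `toPayload` is a hypothetical name for the same type-narrowing logic:

```typescript
// Recorder events that carry a string `chunk` are forwarded to SSE clients
// as { text }; any other event shape passes through unchanged.
function toPayload(event: unknown): unknown {
  return typeof event === "object" &&
    event !== null &&
    typeof (event as { chunk?: unknown }).chunk === "string"
    ? { text: (event as { chunk: string }).chunk }
    : event;
}

console.log(JSON.stringify(toPayload({ chunk: "Once " }))); // {"text":"Once "}
console.log(JSON.stringify(toPayload({ done: true })));     // {"done":true}
```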
Make sure you define a shell function that can get the bearer token for the bob user:
function get-token() {
curl -sSfX POST http://localhost:8081/realms/memory-service/protocol/openid-connect/token \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=memory-service-client" \
-d "client_secret=change-me" \
-d "grant_type=password" \
-d "username=bob" \
-d "password=bob" \
| jq -r '.access_token'
}

curl -NsSfX POST http://localhost:9090/chat/a9f2d4c3-7b1e-4d5f-9c44-2e6a8b7c5333 \
-H "Content-Type: text/plain" \
-H "Authorization: Bearer $(get-token)" \
-d "Write a short story about a cat."

Example output:
data: {"text":"Once "}
data: {"text":"upon "}
data: {"text":"a time..."}

Phase 2: Response Recording + Resumption (05b)
Starting checkpoint: typescript/examples/vecelai/doc-checkpoints/05a-streaming
Install the optional gRPC dependencies used by response recording:
cd typescript/examples/vecelai/doc-checkpoints/05b-response-resumption
npm install @grpc/grpc-js @grpc/proto-loader
Add response recording helpers
import {
memoryServiceConfigFromEnv,
memoryServiceCancel,
memoryServiceReplay,
memoryServiceResumeCheck,
withMemoryService,
} from "@chirino/memory-service-vercelai";

What changed: Adds SDK helpers for resume-check, replay, and cancel.
Why needed: These functions connect the app to Memory Service response-recording APIs.
Record streamed output
await withMemoryService(
{
...memoryServiceConfig,
conversationId,
authorization,
memoryContentType: "vercelai",
historyContentType: "history/vercelai",
userText: userMessage,
},
async (contextMemory, responseRecorder) => {
contextMemory.append({ role: "user", content: userMessage });

What changed: The chat handler callback now accepts responseRecorder as the second parameter from withMemoryService(...).
Why needed: responseRecorder is provided by the SDK helper and is the object used to record live stream output for replay/resume.
try {
await withMemoryService(
{
...memoryServiceConfig,
conversationId,
authorization,
memoryContentType: "vercelai",
historyContentType: "history/vercelai",
userText: userMessage,
},
async (contextMemory, responseRecorder) => {
contextMemory.append({ role: "user", content: userMessage });
const result = streamText({
model,
messages: [
{
role: "system",
content: "You are a TypeScript memory-service demo agent.",
},
...contextMemory.get(),
],
});
const eventStream = responseRecorder.record(result.textStream);
for await (const event of eventStream) {
if (responseRecorder.isCanceled()) {
break;
}
const payload =
typeof event === "object" &&
event !== null &&
typeof (event as { chunk?: unknown }).chunk === "string"
? { text: (event as { chunk: string }).chunk }
: event;
res.write(`data: ${JSON.stringify(payload)}\n\n`);
}
contextMemory.append({ role: "assistant", content: await result.text });
},
);
} finally {
res.end();
}

What changed: Wraps result.textStream with responseRecorder.record(...), checks responseRecorder.isCanceled() on each iteration, emits { text } SSE payloads, and appends await result.text to context memory.
Why needed: Recorded chunks can be replayed later if a client disconnects mid-stream.
Why res.end(): The finally { res.end(); } in /chat closes the live SSE stream even when recording is canceled or an error occurs, preventing stuck client connections.
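The cancellation behavior in the loop can be exercised without the SDK. This sketch substitutes a fake recorder (FakeRecorder and its event shape are assumptions, not SDK types) to show that the loop stops forwarding events once isCanceled() flips:

```typescript
// Hypothetical stand-in for responseRecorder: yields recorded chunks and
// exposes isCanceled(), mirroring how the /chat loop breaks mid-stream.
class FakeRecorder {
  private canceled = false;
  cancel() { this.canceled = true; }
  isCanceled() { return this.canceled; }
  async *record(chunks: string[]) {
    for (const chunk of chunks) yield { chunk };
  }
}

async function drain(recorder: FakeRecorder, chunks: string[], cancelAfter: number) {
  const seen: string[] = [];
  for await (const event of recorder.record(chunks)) {
    if (recorder.isCanceled()) break; // same check as the /chat loop
    seen.push(event.chunk);
    if (seen.length === cancelAfter) recorder.cancel();
  }
  return seen;
}

drain(new FakeRecorder(), ["a", "b", "c", "d"], 2).then((seen) =>
  console.log(seen.join(",")), // a,b — nothing flows after cancellation
);
```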
Add resume-check / replay / cancel endpoints
app.post("/v1/conversations/resume-check", async (req, res) => {
const ids = Array.isArray(req.body)
? req.body.filter((v) => typeof v === "string")
: [];
res
.status(200)
.json(await memoryServiceResumeCheck(memoryServiceConfig, ids));
});
app.get("/v1/conversations/:conversationId/resume", async (req, res) => {
let streamed = false;
try {
const eventStream = memoryServiceReplay(
memoryServiceConfig,
req.params.conversationId,
);
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
streamed = true;
for await (const event of eventStream) {
res.write(`data: ${JSON.stringify(event)}\n\n`);
}
} catch {
res.status(404).json({ error: "no in-progress response" });
} finally {
if (streamed && !res.writableEnded) {
res.end();
}
}
});

What changed: Adds /resume-check, /:conversationId/resume, and /:conversationId/cancel endpoints.
Why needed: These endpoints expose reconnect and cancellation controls without changing the /chat request contract.
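On the client side, the reconnect flow pairs these endpoints: POST the conversation ids you care about to /v1/conversations/resume-check, then GET /resume only for the ids that come back. Assuming resume-check returns the subset of ids that still have an in-progress response (consistent with the empty-array example later in this guide), the client-side selection is a simple intersection; idsToResume is a hypothetical helper:

```typescript
// Hypothetical client helper: keep only the conversations that the
// resume-check endpoint reported as having an in-progress response.
function idsToResume(asked: string[], inProgress: string[]): string[] {
  const live = new Set(inProgress);
  return asked.filter((id) => live.has(id));
}

console.log(idsToResume(["a", "b"], ["b"])); // [ 'b' ]
console.log(idsToResume(["a", "b"], []));    // []
```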
Security note: Any indexedContent you attach to history entries is stored as non-encrypted search index data. Redact or minimize sensitive content.
Make sure you define a shell function that can get the bearer token for the bob user:
function get-token() {
curl -sSfX POST http://localhost:8081/realms/memory-service/protocol/openid-connect/token \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=memory-service-client" \
-d "client_secret=change-me" \
-d "grant_type=password" \
-d "username=bob" \
-d "password=bob" \
| jq -r '.access_token'
}

Start a streaming response:
curl -NsSfX POST http://localhost:9090/chat/91b7f6be-1ab8-4a31-b86a-a4fd2c8b2d27 \
-H "Content-Type: text/plain" \
-H "Authorization: Bearer $(get-token)" \
-d "Write a short story about a cat."

Example output:
data: {"text":"Once upon a time, there was a curious little cat named Whiskers."}
data: {"text":" She explored the moonlit garden and found a hidden pond."}

Check whether a conversation has an in-progress response:
curl -sSfX POST http://localhost:9090/v1/conversations/resume-check \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $(get-token)" \
-d '["91b7f6be-1ab8-4a31-b86a-a4fd2c8b2d27"]'

Example output:
[]

An empty array means none of the listed conversations has an in-progress response to resume.

Next Steps
- Sharing - Membership and ownership transfer APIs
- Indexing and Search - Indexed content and search endpoint