Python LangGraph Over Unix Domain Sockets

This guide continues from Python LangGraph Response Recording and Resumption. The checkpoint app stays the same; only the Memory Service connection changes.

Local Memory Service

Note: The memory-service CLI has not yet been released as a standalone binary. You can install it from source with Go:

go install -tags "sqlite_fts5 sqlite_json" github.com/chirino/memory-service@latest
mkdir -p $HOME/.local/share/memory-service $HOME/.local/run/memory-service
memory-service serve \
  --db-kind=sqlite \
  --db-url=file:$HOME/.local/share/memory-service/memory.db \
  --vector-kind=sqlite \
  --cache-kind=local \
  --unix-socket=$HOME/.local/run/memory-service/api.sock

Those options keep the LangGraph example self-contained on one machine: SQLite provides durable storage and vector indexing, cache-kind=local removes the need for Redis or another cache service, and the socket path lives under the user's home directory rather than /tmp, so access to it is governed by normal account permissions. The mkdir -p ensures the database and socket parent directories exist before the service tries to bind.
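The transport mechanics are worth seeing once: HTTP over a unix domain socket is ordinary HTTP with the TCP dial swapped for a filesystem path. Here is a self-contained stdlib sketch with a toy server and client (this is not the Memory Service client itself; the /ready path just mirrors the app's own readiness route):

```python
import http.client
import os
import socket
import tempfile
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that dials a unix socket instead of host:port."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")  # name only feeds the Host header
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


class UnixHTTPServer(HTTPServer):
    """HTTPServer bound to a filesystem path instead of a TCP port."""

    address_family = socket.AF_UNIX

    def server_bind(self):
        # Bind to the path directly; skip the TCP hostname/port bookkeeping.
        self.socket.bind(self.server_address)
        self.server_name, self.server_port = "localhost", 0


class ReadyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"status":"ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


# The socket's parent directory must exist before bind(), as noted above.
sock_path = os.path.join(tempfile.mkdtemp(), "api.sock")
server = UnixHTTPServer(sock_path, ReadyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = UnixHTTPConnection(sock_path)
conn.request("GET", "/ready")
resp = conn.getresponse()
data = resp.read().decode()
print(resp.status, data)  # 200 {"status":"ok"}
server.shutdown()
```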

Agent Configuration

No LangGraph source file needs a transport-specific edit. The checkpoint app already constructs the Memory Service helpers with their default configuration:

app.py
checkpointer = MemoryServiceCheckpointSaver.from_env()
history_middleware = MemoryServiceHistoryMiddleware.from_env()


async def call_model(state: MessagesState) -> dict[str, list[Any]]:
    messages = [{"role": "system", "content": "You are a helpful assistant."}] + list(
        state["messages"]
    )
    response = await model.ainvoke(messages)
    return {"messages": [response]}


builder = StateGraph(MessagesState)
builder.add_node("call_model", call_model)
builder.add_edge(START, "call_model")
graph = builder.compile(checkpointer=checkpointer)

app = FastAPI(title="LangGraph Chatbot with Response Recording and Resumption")


@app.get("/ready")
async def ready() -> dict[str, str]:
    return {"status": "ok"}


install_fastapi_authorization_middleware(app)
proxy = MemoryServiceProxy.from_env()
recording_manager = MemoryServiceResponseRecordingManager.from_env()

Set the socket path in the file that supplies environment variables to the running process. If you keep those in a .env file next to the app, this .env.example snippet shows the exact value to add. Note that many .env loaders do not expand shell variables such as $HOME; if yours does not, write the absolute path instead:

.env.example
MEMORY_SERVICE_UNIX_SOCKET=$HOME/.local/run/memory-service/api.sock

The LangGraph store, history middleware, proxy, and response recorder all pick that up automatically.
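If your environment-file loader leaves $HOME unexpanded, you can expand it at load time instead. A minimal stdlib sketch (illustrative only, not necessarily the loader the app uses):

```python
import os
import tempfile


def load_env_file(path: str) -> None:
    """Minimal .env loader: KEY=VALUE lines, expanding ~ and $VARS in values."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = os.path.expanduser(
                os.path.expandvars(value.strip())
            )


# Demo: expand the value from the .env.example snippet above.
demo = os.path.join(tempfile.mkdtemp(), ".env")
with open(demo, "w") as f:
    f.write("MEMORY_SERVICE_UNIX_SOCKET=$HOME/.local/run/memory-service/api.sock\n")
load_env_file(demo)
print(os.environ["MEMORY_SERVICE_UNIX_SOCKET"])
```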

With the service and the agent running, exercise the recording flow end to end. First, a shell helper that fetches an access token from the local OpenID Connect provider:

function get-token() {
  curl -sSfX POST http://localhost:8081/realms/memory-service/protocol/openid-connect/token \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "client_id=memory-service-client" \
    -d "client_secret=change-me" \
    -d "grant_type=password" \
    -d "username=bob" \
    -d "password=bob" \
    | jq -r '.access_token'
}
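If you prefer to drive the tests from Python, the same password-grant request can be issued with the stdlib. This sketch mirrors the get-token helper above, reusing the same local endpoint and demo credentials:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

TOKEN_URL = "http://localhost:8081/realms/memory-service/protocol/openid-connect/token"


def build_token_request(username: str, password: str) -> Request:
    """Build the form-encoded password-grant request, mirroring the curl call."""
    body = urlencode({
        "client_id": "memory-service-client",
        "client_secret": "change-me",
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode()
    return Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def get_token(username: str = "bob", password: str = "bob") -> str:
    """Fetch an access token; requires the identity provider to be running."""
    with urlopen(build_token_request(username, password)) as resp:
        return json.load(resp)["access_token"]
```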

Start a streaming response, but disconnect before it finishes. The short timeout intentionally cuts the request before completion; resumability is verified in the next step:

curl -NsSfX POST http://localhost:9090/chat/6a4df6af-70d0-4a7c-823d-8cd4f150da84 \
  --max-time 0.2 \
  -H "Content-Type: text/plain" \
  -H "Authorization: Bearer $(get-token)" \
  -d "Write a short story about a cat."
Check which interrupted conversations have a resumable response:

curl -sSfX POST http://localhost:9090/v1/conversations/resume-check \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(get-token)" \
  -d '["6a4df6af-70d0-4a7c-823d-8cd4f150da84"]'

Example output:

[
  "6a4df6af-70d0-4a7c-823d-8cd4f150da84"
]

Resume the interrupted response:

curl -NsSfX GET http://localhost:9090/v1/conversations/6a4df6af-70d0-4a7c-823d-8cd4f150da84/resume \
  --max-time 0.5 \
  -H "Authorization: Bearer $(get-token)"

Example output:

data: {"eventType":"PartialResponse","chunk":"Once "}
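Each resumed event arrives as an SSE data: line whose payload is JSON, so a consumer only needs to filter on eventType and concatenate the chunks. A minimal parser over sample lines (the second sample line is invented for illustration; the event shape matches the output above):

```python
import json


def collect_partial_chunks(lines):
    """Reassemble PartialResponse chunks from SSE 'data:' lines."""
    chunks = []
    for line in lines:
        if line.startswith("data: "):
            event = json.loads(line[len("data: "):])
            if event.get("eventType") == "PartialResponse":
                chunks.append(event["chunk"])
    return "".join(chunks)


stream = [
    'data: {"eventType":"PartialResponse","chunk":"Once "}',
    'data: {"eventType":"PartialResponse","chunk":"upon "}',
]
print(collect_partial_chunks(stream))
```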

Finally, fetch the stored conversation record:

curl -sSfX GET http://localhost:9090/v1/conversations/6a4df6af-70d0-4a7c-823d-8cd4f150da84 \
  -H "Authorization: Bearer $(get-token)"

Example output:

{
  "id": "6a4df6af-70d0-4a7c-823d-8cd4f150da84"
}