Build persistent, stateful chat sessions that handle long-running conversations across multiple interactions and users. A user might start a conversation now, respond hours later, and return again after a few days. Multiple users may have separate conversations in progress, and a single conversation may be open in multiple browser windows. This guide shows how to implement persistent chat sessions with Restate.

Virtual Objects

To implement stateful entities like chat sessions or stateful agents, Restate provides the Virtual Objects service type. Each Virtual Object instance maintains isolated state and is identified by a unique key. Here is an example of a Virtual Object that represents a chat session:
export default restate.object({
  name: "Chat",
  handlers: {
    message: restate.createObjectHandler(
      { input: zodPrompt(examplePrompt) },
      async (ctx: ObjectContext, { message }: { message: string }) => {
        const messages = (await ctx.get<Array<ModelMessage>>("memory")) ?? [];
        messages.push({ role: "user", content: message });

        // Use your preferred LLM SDK here
        const result = await ctx.run("LLM call", async () => llmCall(messages));

        messages.push({ role: "assistant", content: result.text });
        ctx.set("memory", messages);

        return result.text;
      },
    ),
    getHistory: restate.createObjectSharedHandler(
      async (ctx: restate.ObjectSharedContext) =>
        ctx.get<Array<ModelMessage>>("memory"),
    ),
  },
});
View on GitHub: TS / Python

This example stores the chat messages in Restate. You can also store any other K/V state, like user preferences. Virtual Objects provide:
  • Durable state: Conversation history or any other K/V state (e.g. user preferences) that persists across failures and restarts
  • Session isolation: Each chat gets isolated state with automatic concurrency control (see below)
  • SDK flexibility: Works with any LLM SDK (Vercel AI, LangChain, LiteLLM, etc.) and any programming language supported by Restate (TypeScript, Python, Go, etc.)
The UI lets you query the state of each chat session:
Chat session state - UI
This pattern is complementary to AI memory solutions like mem0 or Graphiti. You can use Virtual Objects to enforce session concurrency and queueing while storing the agent’s memory in specialized memory systems.
This pattern is implementable with any of our SDKs and any AI SDK. If you need help with a specific SDK, please reach out to us via Discord or Slack.
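The handler above follows a simple append-and-persist pattern. Purely as an illustration, here is a self-contained sketch of that pattern, in which an in-memory Map stands in for Restate's durable K/V store and the LLM call is stubbed (all names here are illustrative, not part of the Restate SDK):

```typescript
// Illustrative sketch only: a plain Map stands in for Restate's durable
// K/V store (ctx.get / ctx.set), and the LLM call is stubbed.
type ModelMessage = { role: "user" | "assistant"; content: string };

const store = new Map<string, ModelMessage[]>();

async function message(sessionKey: string, userText: string): Promise<string> {
  // Load the conversation history for this session key (ctx.get in Restate)
  const messages = store.get(sessionKey) ?? [];
  messages.push({ role: "user", content: userText });

  // Stubbed LLM call; in the real handler this runs inside ctx.run(...)
  const reply = `echo: ${userText}`;

  // Append the assistant reply and persist (ctx.set in Restate)
  messages.push({ role: "assistant", content: reply });
  store.set(sessionKey, messages);
  return reply;
}
```

Each session key accumulates its own history, mirroring how each Virtual Object key holds isolated state — with the crucial difference that Restate's state survives process restarts, while this Map does not.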

1. Requirements

  • AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, LiteLLM, etc.) to make LLM calls.
  • API key for your model provider.

2. Download the example

git clone https://github.com/restatedev/ai-examples.git &&
cd ai-examples/typescript-patterns &&
npm install

3. Start the Restate Server

restate-server

4. Start the Service

Export the API key of your model provider as an environment variable, then start the service. For example, for OpenAI:
export OPENAI_API_KEY=your_openai_api_key
npm run dev

5. Register the services

Register the service with the Restate Server so it can receive requests, for example via the CLI (assuming the service listens on the default port 9080):
restate deployments register http://localhost:9080
See the Service Registration docs for other options, such as registering the deployment via the UI at http://localhost:9070.

6. Send messages to a chat session

In the UI (http://localhost:9070), click on the message handler of the Chat service to open the playground. Enter a key for the chat session (e.g., session123) and send messages to start a conversation.
Chat playground
The session state (conversation history) is automatically persisted and maintained across calls. Send additional messages with the same session ID to see how the conversation context is preserved. For example, ask to shorten the poem.

7. Check the Restate UI

In the State Tab, you can view what is stored in Restate for each chat session:
Chat session state - UI

Built-in concurrency control

Restate’s Virtual Objects have built-in queuing and consistency guarantees per object key. Handlers have either read-write access (ObjectContext) or read-only access (ObjectSharedContext).
  • Only one handler with write access can run at a time per object key to prevent concurrent/lost writes or race conditions (for example message()).
  • Handlers with read-only access can run concurrently to the write-access handlers (for example getHistory()).
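Restate provides this per-key queueing natively; you do not implement it yourself. Purely to illustrate the guarantee, here is a sketch of per-key serialization via promise chaining (a hypothetical helper, not how Restate works internally):

```typescript
// Illustration only: serializes async work per key by chaining promises,
// while work for different keys runs concurrently — the same observable
// guarantee Restate gives exclusive Virtual Object handlers.
const tails = new Map<string, Promise<unknown>>();

function runExclusive<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const prev = tails.get(key) ?? Promise.resolve();
  // Queue fn behind the previous invocation for this key, even if it failed
  const next = prev.then(fn, fn);
  tails.set(key, next);
  return next;
}
```

Calls with the same key run strictly one after another; calls with different keys are not blocked by each other. Restate additionally makes this queue durable, which the in-memory sketch cannot do.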
Seeing concurrency control in action: In the chat service, the message handler is an exclusive handler, while the getHistory handler is a shared handler. Let’s send some messages to several chat sessions:
curl localhost:8080/Chat/session123/message/send --json '{"message": "make a poem about durable execution"}' &
curl localhost:8080/Chat/session456/message/send --json '{"message": "what are the benefits of durable execution?"}' &
curl localhost:8080/Chat/session789/message/send --json '{"message": "how does workflow orchestration work?"}' &
curl localhost:8080/Chat/session123/message/send --json '{"message": "can you make it rhyme better?"}' &
curl localhost:8080/Chat/session456/message/send --json '{"message": "what about fault tolerance in distributed systems?"}' &
curl localhost:8080/Chat/session789/message/send --json '{"message": "give me a practical example"}' &
curl localhost:8080/Chat/session101/message/send --json '{"message": "explain event sourcing in simple terms"}' &
curl localhost:8080/Chat/session202/message/send --json '{"message": "what is the difference between async and sync processing?"}'
The UI shows how Restate queues the requests per session to ensure consistency. Only one chat agent runs per session ID.

Conversation State Management

Retrieving state

The state you store in Virtual Objects lives forever. To resume a session, simply send a new message to the same Virtual Object. To retrieve the state, add a handler that reads it. Have a look at the getHistory/get_history handler in the example above, and call it to get the history:
curl localhost:8080/Chat/session123/getHistory
This is a shared handler, meaning it can only read state (not write). This allows it to run concurrently with the message handler.
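The invocation URLs used throughout this guide all follow the same convention: service name, then object key, then handler name, with an optional /send suffix for one-way invocations. As a small sketch (the objectUrl helper is hypothetical, not part of the Restate SDK):

```typescript
// Hypothetical helper, not part of the Restate SDK: builds the invocation URL
// for a Virtual Object handler, following the <service>/<key>/<handler> path
// convention used in the curl examples above. Appending "/send" makes the
// invocation one-way (enqueue and return immediately).
function objectUrl(
  base: string,
  service: string,
  key: string,
  handler: string,
  oneWay = false,
): string {
  const path = `${base}/${service}/${encodeURIComponent(key)}/${handler}`;
  return oneWay ? `${path}/send` : path;
}
```

For example, objectUrl("http://localhost:8080", "Chat", "session123", "getHistory") produces the URL used in the curl command above.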