AI agents are long-running processes that combine LLMs with tools and external APIs to complete complex tasks. With Restate, you can build agents that are resilient to failures, stateful across conversations, and observable without managing complex retry logic or external state stores. In this guide, you’ll learn how to:
  • Build durable AI agents that recover automatically from crashes and API failures
  • Integrate Restate with the
  • Observe and debug agent executions with detailed traces
  • Implement resilient human-in-the-loop workflows with approvals and timeouts
  • Manage conversation history and state across multi-turn interactions
  • Orchestrate multiple agents working together on complex tasks

Getting Started

A Restate AI application has two main components:
  • Restate Server: The core engine that takes care of the orchestration and resiliency of your agents
  • Agent Services: Your agent or AI workflow logic using the Restate SDK for durability
Restate works with how you already deploy your agents, whether that’s in Docker, on Kubernetes, or via serverless platforms (Modal, AWS Lambda…). You don’t need to run your agents in any special way. Let’s run an example locally to get a better feel for how it works.

Run the agent

Install Restate and launch it:
restate-server
Get the example:
git clone git@github.com:restatedev/ai-examples.git
cd ai-examples/openai-agents/tour-of-agents
Export your OpenAI API key and run the agent:
export OPENAI_API_KEY=sk-...
uv run .
Then, tell Restate where your agent is running via the UI (http://localhost:9070) or CLI:
restate deployments register http://localhost:9080
This registers a set of agents that we will be covering in this tutorial. To test your setup, invoke the weather agent, either via the UI playground by clicking on the run handler of the WeatherAgent in the overview.
Or via curl:
curl localhost:8080/WeatherAgent/run \
  --json '{"message": "What is the weather like in San Francisco?"}'
You should see the weather information printed in the terminal. Let’s have a look at what happened under the hood to make your agents resilient.

Durable Execution

AI agents make multiple LLM calls and tool executions that can fail due to rate limits, network issues, or service outages. Restate uses Durable Execution to make your agents withstand failures without losing progress. The Restate SDK records the steps the agent executes in a log and replays them if the process crashes or is restarted. Durable Execution is the basis of how Restate makes your agents resilient to failures. Restate offers durable execution primitives via its SDK.

Creating a Durable Agent

To implement a durable agent, you can use the Restate SDK in combination with the OpenAI Agent SDK. Here’s the implementation of the durable weather agent you just invoked:
durable_agent.py
@durable_function_tool
async def get_weather(city: WeatherRequest) -> WeatherResponse:
    """Get the current weather for a given city."""
    return await restate_context().run_typed("get weather", fetch_weather, req=city)


agent = Agent(
    name="WeatherAgent",
    instructions="You are a helpful agent that provides weather updates.",
    tools=[get_weather],
)

agent_service = restate.Service("WeatherAgent")


@agent_service.handler()
async def run(_ctx: restate.Context, req: WeatherPrompt) -> str:
    result = await DurableRunner.run(agent, req.message)
    return result.final_output
First, you implement your agent and its tools, similar to how you would do it with the OpenAI Agent SDK. The main difference compared to a standard OpenAI agent is the use of the Restate Context at key points throughout the agent logic. Any action with the Context is automatically recorded by the Restate Server and survives failures. We use this for:
  1. Persisting LLM responses: We use the DurableRunner, so that every LLM response is saved in Restate Server and can be replayed during recovery. The DurableRunner is provided to us via the OpenAI extensions in the Restate SDK.
  2. Resilient tool execution: Tools can make steps durable by annotating them with @durable_function_tool and using Restate Context actions (available via restate_context() or restate_object_context()). Whenever you do an action with the Restate Context, the result gets persisted in Restate and can be recovered after a failure. The example uses restate_context().run_typed, which runs the function you provide to it and persists its result (e.g. database interaction, API calls, non-deterministic actions).
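The record-and-replay idea behind these two points can be sketched in plain Python. This is a simplified model, not Restate's implementation: ToyJournal, fetch_weather_once, and handler are hypothetical names, the journal lives in memory, and keying entries by step name is a simplification made for this sketch.

```python
from typing import Any, Callable


class ToyJournal:
    """Simplified stand-in for a durable journal (illustration only)."""

    def __init__(self) -> None:
        self.entries: dict[str, Any] = {}

    def run(self, name: str, action: Callable[[], Any]) -> Any:
        # On replay, return the recorded result instead of re-executing the step.
        if name in self.entries:
            return self.entries[name]
        result = action()
        self.entries[name] = result
        return result


calls = {"weather_api": 0}


def fetch_weather_once() -> str:
    # Counts how often the "external API" is really called.
    calls["weather_api"] += 1
    return "sunny"


def handler(journal: ToyJournal, crash: bool) -> str:
    weather = journal.run("get weather", fetch_weather_once)
    if crash:
        raise RuntimeError("process crashed after the weather call")
    return f"The weather is {weather}"


journal = ToyJournal()
try:
    handler(journal, crash=True)  # first attempt fails midway
except RuntimeError:
    pass
answer = handler(journal, crash=False)  # the retry replays the journaled step
```

In this sketch the second attempt reuses the recorded weather result, so the external call happens only once even though the handler ran twice.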
To serve the agent over HTTP with Restate, you create a Restate Service and define handlers. Here, the agent logic is called from the run handler. The endpoint that serves the agents of this tour over HTTP is defined in __main__.py. The agent can now be called at http://localhost:8080/WeatherAgent/run.
Ask for the weather in Denver:
curl localhost:8080/WeatherAgent/run \
--json '{"message": "What is the weather like in Denver?"}'
On the invocation page in the UI, click on the invocation ID of the failing invocation. You can see that your request is retrying because the weather API is down.
To fix the problem, remove the line fail_on_denver from the fetch_weather function in the app/utils/utils.py file:
utils/utils.py
async def fetch_weather(req: WeatherRequest) -> WeatherResponse:
    fail_on_denver(req.city)
    weather_data = await call_weather_api(req.city)
    return parse_weather_data(weather_data)
Once you restart the service, the workflow finishes successfully.

Observing your Agent

As you saw in the previous section, the Restate UI comes in handy when monitoring and debugging your agents. The Invocations tab shows all agent executions with detailed traces of every LLM call, tool execution, and state change.
Restate supports OpenTelemetry for exporting traces to external systems like Langfuse, DataDog, or Jaeger. Have a look at the tracing docs to set this up.
Now that you know how to build and debug an agent, let’s look at more advanced patterns.

Human-in-the-Loop Agent

Many AI agents need human oversight for high-risk decisions or gathering additional input. Restate makes it easy to pause agent execution and wait for human input. Benefits with Restate:
  • If the agent crashes while waiting for human input, Restate continues waiting and recovers the promise on another process.
  • If the agent runs on function-as-a-service platforms, the Restate SDK lets the function suspend while it’s waiting. Once the approval comes in, the Restate Server invokes the function again and lets it resume where it left off. This way, you don’t pay for idle waiting time (Learn more).
Here’s a tool that asks for human approval for high-value claims:
human_approval_agent.py
@durable_function_tool
async def human_approval(claim: InsuranceClaim) -> str:
    """Ask for human approval for high-value claims."""

    # Create an awakeable for human approval
    approval_id, approval_promise = restate_context().awakeable(type_hint=str)

    # Request human review
    await restate_context().run_typed(
        "Request review", request_human_review, claim=claim, awakeable_id=approval_id
    )

    # Wait for human approval
    return await approval_promise
To implement human approval steps, you can use Restate’s awakeables. An awakeable is a promise that can be resolved externally via an API call by providing its ID. When you create the awakeable, you get back an ID and a promise. You can send the ID to the human approver, and then wait for the promise to be resolved.
You can also use awakeables outside of tools, for example, to implement human approval steps in between agent iterations.
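To make the ID-plus-promise contract concrete, here is a minimal in-memory analogue built on asyncio. It is only a sketch: ToyAwakeables and its methods are hypothetical names, and unlike a Restate awakeable, nothing here is persisted or survives a restart.

```python
import asyncio
import uuid


class ToyAwakeables:
    """In-memory registry of externally resolvable promises (illustration only)."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def create(self) -> tuple[str, asyncio.Future]:
        # Hand out an ID for the external party and a future for the waiter.
        awakeable_id = f"toy_{uuid.uuid4().hex}"
        future = asyncio.get_running_loop().create_future()
        self._pending[awakeable_id] = future
        return awakeable_id, future

    def resolve(self, awakeable_id: str, value: object) -> None:
        # Called by whoever received the ID, for example an approval endpoint.
        self._pending.pop(awakeable_id).set_result(value)


async def demo() -> str:
    registry = ToyAwakeables()
    approval_id, approval = registry.create()
    # Simulate the human approver calling back a little later with the ID.
    asyncio.get_running_loop().call_later(
        0.01, registry.resolve, approval_id, "approved"
    )
    return await approval


outcome = asyncio.run(demo())
```

The waiting code never needs to know who resolves the promise; it only shares the ID and awaits the result.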
Start a request for a high-value claim that needs human approval. Use the playground or curl with /send to start the claim asynchronously, without waiting for the result.
curl localhost:8080/HumanClaimApprovalAgent/run/send \
--json '{"message": "Process my hospital bill of 3000USD for a broken leg."}'
You can restart the service to see how Restate continues waiting for the approval. If you wait for more than a minute, the invocation will get suspended.
Simulate approving the claim by executing the curl request that was printed in the service logs, similar to:
curl localhost:8080/restate/awakeables/sign_1M28aqY6ZfuwBmRnmyP/resolve --json 'true'
See in the UI how the workflow resumes and finishes after the approval.
Add timeouts to human approval steps to prevent workflows from hanging indefinitely. Restate persists the timer and the approval promise, so if the service crashes or is restarted, it will continue waiting with the correct remaining time:
human_approval_agent_with_timeout.py
# Wait for human approval for at most 3 hours to reach our SLA
match await restate.select(
    approval=approval_promise,
    timeout=restate_context().sleep(timedelta(hours=3)),
):
    case ["approval", approved]:
        return "Approved" if approved else "Rejected"
    case _:
        return "Approval timed out - Evaluate with AI"
Try it out by sending a request to the service:
curl localhost:8080/HumanClaimApprovalWithTimeoutsAgent/run/send \
--json '{"message": "Process my hospital bill of 3000USD for a broken leg."}'
Restart the service and check in the UI how the process blocks for the remaining time without starting over. You can also lower the timeout to a few seconds to see how the timeout path is taken.
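The race between an approval and a timer can also be illustrated with plain asyncio. This is an analogue of the pattern, not the Restate API: wait_for_approval is a hypothetical helper, and the timer here is an ordinary in-process sleep that would be lost on a crash, which is exactly what the durable version avoids.

```python
import asyncio


async def wait_for_approval(approval: asyncio.Future, timeout_seconds: float) -> str:
    # Race the approval against a timer and act on whichever finishes first.
    timer = asyncio.ensure_future(asyncio.sleep(timeout_seconds))
    done, _pending = await asyncio.wait(
        {approval, timer}, return_when=asyncio.FIRST_COMPLETED
    )
    if approval in done:
        timer.cancel()
        return "Approved" if approval.result() else "Rejected"
    return "Approval timed out"


async def demo() -> tuple[str, str]:
    loop = asyncio.get_running_loop()
    fast = loop.create_future()
    loop.call_later(0.01, fast.set_result, True)  # approval arrives in time
    slow = loop.create_future()  # approval never arrives
    return (
        await wait_for_approval(fast, timeout_seconds=1.0),
        await wait_for_approval(slow, timeout_seconds=0.05),
    )


approved, timed_out = asyncio.run(demo())
```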

Resilient workflows as tools

You can pull out complex parts of your tool logic into separate workflows. This lets you break down complex agents into smaller, reusable components that can be developed, deployed, and scaled independently. The Restate SDK gives you clients to call these workflows durably from your agent logic. All calls are proxied via Restate. Restate persists the call and takes care of retries and recovery. For example, let’s implement the human approval tool as a separate service:
sub_workflow_agent.py
# Sub-workflow service for human approval
human_approval_workflow = restate.Service("HumanApprovalWorkflow")


@human_approval_workflow.handler()
async def review(ctx: restate.Context, claim: InsuranceClaim) -> str:
    """Request human approval for a claim and wait for response."""
    # Create an awakeable that can be resolved via HTTP
    approval_id, approval_promise = ctx.awakeable(type_hint=str)

    # Request human review
    await ctx.run_typed(
        "Request review", request_human_review, claim=claim, awakeable_id=approval_id
    )

    # Wait for human approval
    return await approval_promise
This can now be called from the main agent via a service client:
sub_workflow_agent.py
@durable_function_tool
async def human_approval(claim: InsuranceClaim) -> str:
    """Ask for human approval for high-value claims."""
    return await restate_context().service_call(review, claim)
These workflows have access to all Restate SDK features, including durable execution, state management, awakeables, and observability. They can be developed, deployed, and scaled independently.
Start a request for a high-value claim that needs human approval. Use /send to start the claim asynchronously, without waiting for the result.
curl localhost:8080/SubWorkflowClaimApprovalAgent/run/send \
--json '{"message": "Process my hospital bill of 3000USD for a broken leg."}'
In the UI, you can see that the agent called the workflow service and is waiting for the response. You can see the trace of the sub-workflow in the timeline. Once you approve the claim, the workflow returns, and the agent continues.
Follow the Tour of Workflows to learn more about implementing resilient workflows with Restate.

Durable Sessions

The next ingredient we need to build AI agents is the ability to maintain context and memory across multiple interactions. To implement stateful entities like chat sessions, or stateful agents, Restate provides a special service type called Virtual Objects. When you send a message to a Virtual Object, you provide a unique key that identifies the object instance (for example, a chat session ID or user ID). Each instance of a Virtual Object maintains isolated state. The handlers of the Virtual Object can read and modify the object’s state via the Restate ObjectContext.

Virtual Objects for stateful agents

Restate’s DurableRunner includes a configuration setting to automatically store your agent’s events and state in Restate. Here is an example of a stateful, durable agent represented as a Virtual Object:
chat.py
chat = VirtualObject("Chat")


@chat.handler()
async def message(_ctx: ObjectContext, req: ChatMessage) -> dict:
    # Set use_restate_session=True to store the session in Restate's key-value store
    # Make sure you use a VirtualObject to enable this feature
    result = await DurableRunner.run(
        Agent(name="Assistant", instructions="You are a helpful assistant."),
        req.message,
        use_restate_session=True,
    )
    return result.final_output
This automatically persists the agent events (LLM calls, tool calls, etc.) and the conversation history in Restate. It uses Restate as the session provider for the OpenAI Agent SDK.
Ask the agent to do some task and provide a session ID as the object key:
curl localhost:8080/Chat/session123/message \
--json '{"message": "Make a poem about durable execution."}'
Continue the conversation with the same session ID. The agent will remember previous context:
curl localhost:8080/Chat/session123/message \
--json '{"message": "Shorten it to 2 lines."}'
Go to the state tab of the UI to view the conversation history.
Virtual Objects are ideal for implementing stateful agents because they provide:
  • Long-lived state: K/V state is stored permanently. It has no automatic expiry. Clear it via ctx.clear().
  • Durable state changes: State changes are logged with Durable Execution, so they survive failures and are consistent with code execution
  • State is queryable via the state tab in the UI.

Built-in concurrency control

Restate’s Virtual Objects have built-in queuing and consistency guarantees per object key. You provide the unique key when invoking the Virtual Object, for example, the chat session ID or user ID. When multiple requests come in for the same object key, Restate automatically queues them and ensures consistency. The semantics are as follows:
  • Handlers either have read-write access (ObjectContext) or read-only access (shared object context).
  • Only one handler with write access can run at a time per object key to prevent concurrent/lost writes or race conditions (for example message()).
  • Handlers with read-only access can run concurrently to the write-access handlers (for example get_history()).
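The effect of these rules for write-access handlers can be sketched with one asyncio lock per key. This is a toy model only: the locks dictionary and the message function are hypothetical stand-ins, and Restate's queue is durable and managed by the server rather than by in-process locks.

```python
import asyncio
from collections import defaultdict

# One lock per object key: handlers for the same key run one at a time.
locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
events: list[str] = []


async def message(session_id: str, text: str) -> None:
    async with locks[session_id]:
        events.append(f"start {session_id}:{text}")
        await asyncio.sleep(0.01)  # stands in for the LLM call
        events.append(f"end {session_id}:{text}")


async def demo() -> None:
    await asyncio.gather(
        message("session123", "a"),
        message("session123", "b"),  # same key: waits for "a" to finish
        message("session456", "c"),  # different key: runs alongside "a"
    )


asyncio.run(demo())
```

In the recorded events, the second message for session123 starts only after the first one has ended, while session456 starts without waiting.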
Let’s send several messages concurrently to different chat sessions:
curl localhost:8080/Chat/session123/message/send --json '{"message": "make a poem about durable execution"}' &
curl localhost:8080/Chat/session456/message/send --json '{"message": "what are the benefits of durable execution?"}' &
curl localhost:8080/Chat/session789/message/send --json '{"message": "how does workflow orchestration work?"}' &
curl localhost:8080/Chat/session123/message/send --json '{"message": "can you make it rhyme better?"}' &
curl localhost:8080/Chat/session456/message/send --json '{"message": "what about fault tolerance in distributed systems?"}' &
curl localhost:8080/Chat/session789/message/send --json '{"message": "give me a practical example"}' &
curl localhost:8080/Chat/session101/message/send --json '{"message": "explain event sourcing in simple terms"}' &
curl localhost:8080/Chat/session202/message/send --json '{"message": "what is the difference between async and sync processing?"}'
The UI shows how Restate queues the requests per session to ensure consistency.
You can run Virtual Objects on serverless platforms like Modal, Render, or AWS Lambda. When the request comes in, Restate attaches the correct state to the request, so your handler can access it locally. This way, you can implement stateful, serverless agents without managing any external state store and without worrying about concurrency issues.

Virtual Objects for storing context

You can store any context information in Virtual Objects, for example, user preferences or the last agent they interacted with. Use ctx.set and ctx.get in your handler to store and retrieve state. We will show an example of this in the next section when we orchestrate multiple agents.

Resilient multi-agent coordination

As your agents grow more complex, you may want to break them down into smaller, specialized agents that can delegate tasks to each other. Similar to sub-workflows, you can break down complex agents into multiple specialized agents. All agents can run in the same process or be deployed independently.

Agents as tools/handoffs

If you want to share context between agents, run the agents in the same process and use handoffs or tools. You don’t need to do anything special to make this work with Restate. Use Virtual Object state to maintain context between runs. For example, store the last agent that was called in the object state, so the user can connect back seamlessly on the next interaction:
multi_agent.py
medical_agent = Agent(
    name="MedicalSpecialist",
    handoff_description="I handle medical insurance claims from intake to final decision.",
    instructions="Review medical claims for coverage and necessity. Approve/deny up to $50,000.",
)

car_agent = Agent(
    name="CarSpecialist",
    handoff_description="I handle car insurance claims from intake to final decision.",
    instructions="Assess car claims for liability and damage. Approve/deny up to $25,000.",
)


intake_agent = Agent(
    name="IntakeAgent",
    instructions="Route insurance claims to the appropriate specialist",
    handoffs=[medical_agent, car_agent],
)

agent_dict = {
    "IntakeAgent": intake_agent,
    "MedicalSpecialist": medical_agent,
    "CarSpecialist": car_agent,
}

agent_service = restate.VirtualObject("MultiAgentClaimApproval")


@agent_service.handler()
async def run(ctx: restate.ObjectContext, claim: InsuranceClaim) -> str:
    # Store context in Restate's key-value store
    last_agent_name = await ctx.get("last_agent_name", type_hint=str) or "IntakeAgent"
    last_agent = agent_dict.get(last_agent_name, intake_agent)

    result = await DurableRunner.run(
        last_agent, f"Claim: {claim.model_dump_json()}", session=RestateSession()
    )

    ctx.set("last_agent_name", result.last_agent.name)
    return result.final_output
The execution trace in the Restate UI will allow you to see the full chain of calls between agents and their individual steps.
Start a request for a claim that needs to be analyzed by multiple agents.
curl localhost:8080/MultiAgentClaimApproval/session123/run --json '{
    "date":"2024-10-01",
    "category":"orthopedic",
    "reason":"hospital bill for a broken leg",
    "amount":3000,
    "placeOfService":"General Hospital"
}'
In the UI, you can see that the agent called the sub-agents and is waiting for their responses. You can see the trace of the sub-agents in the timeline. Once all sub-agents return, the main agent continues and makes a decision.
The state now contains the last agent that was called, so you can continue the conversation directly with the same agent.

Remote agents as tools

If you want to run agents independently, for example, to scale them separately, run them on different platforms, or let them get developed by different teams, then you can call them as tools via service calls. Restate will proxy all calls, persist them, and will guarantee that they complete successfully. Your main agent can suspend and save resources while waiting for the remote agent to finish. Restate invokes your main agent again once the remote agent returns.
multi_agent.py
# Durable service call to the fraud agent; persisted and retried by Restate
@durable_function_tool
async def check_fraud(claim: InsuranceClaim) -> str:
    """Analyze the probability of fraud."""
    return await restate_context().service_call(run_fraud_agent, claim)


agent = Agent(
    name="ClaimApprovalCoordinator",
    instructions="You are a claim approval engine. Analyze the claim and use your tools to decide whether to approve it.",
    tools=[check_eligibility, check_fraud],
)

agent_service = restate.Service("RemoteMultiAgentClaimApproval")


@agent_service.handler()
async def run(_ctx: restate.Context, claim: InsuranceClaim) -> str:
    result = await DurableRunner.run(agent, f"Claim: {claim.model_dump_json()}")
    return result.final_output
Note, any shared context between agents needs to be passed explicitly via the input. The execution trace in the Restate UI will allow you to see the full chain of calls between agents and their individual steps.
Start a request for a claim that needs to be analyzed by multiple agents.
curl localhost:8080/RemoteMultiAgentClaimApproval/run --json '{
    "date":"2024-10-01",
    "category":"orthopedic",
    "reason":"hospital bill for a broken leg",
    "amount":3000,
    "placeOfService":"General Hospital"
}'
In the UI, you can see that the agent called the sub-agents and is waiting for their responses. You can see the trace of the sub-agents in the timeline. Once all sub-agents return, the main agent continues and makes a decision.
You cannot put both agents within the same Virtual Object, because this leads to deadlocks. The main agent would block on the call to the sub-agent, preventing the sub-agent from executing, because only one handler can run at a time per object key.
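The deadlock can be reproduced in miniature with a single non-reentrant lock standing in for the exclusive access to one object key. The names below are hypothetical, and the timeout exists only so the sketch terminates; it is not part of the scenario described above.

```python
import asyncio


async def demo() -> str:
    key_lock = asyncio.Lock()  # stands in for exclusive access to one object key

    async def sub_agent() -> str:
        async with key_lock:  # needs the same key, so it has to wait
            return "sub-agent result"

    async def main_agent() -> str:
        async with key_lock:
            try:
                # The main agent holds the key while waiting for the sub-agent.
                return await asyncio.wait_for(sub_agent(), timeout=0.05)
            except asyncio.TimeoutError:
                return "stuck: sub-agent never got to run"

    return await main_agent()


outcome = asyncio.run(demo())
```

Giving the sub-agent its own service or its own object key removes the shared lock, which is why the remote agent above is called through a separate service.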

Parallel Work

Now that our agents are broken down into smaller parts, let’s have a look at how to run different parts of our agent logic in parallel to speed up execution. Restate provides primitives that allow you to run tasks concurrently while maintaining deterministic execution during replays. Most actions on the Restate Context can be composed using restate.gather to gather their results or restate.select to wait for the first one to complete.

Parallel Tool Steps

When using the OpenAI SDK with Restate, tool calls are forced to be executed sequentially to ensure deterministic execution during replays. When multiple tools execute in parallel and use the Restate Context, the order of operations might differ between the original execution and the replay, leading to inconsistencies. Therefore, the only way to run multiple tool steps in parallel is to implement an orchestrator tool that uses durable execution to run multiple steps in parallel. Here is an insurance claim agent tool that runs multiple analyses in parallel:
parallel_tools_agent.py
@durable_function_tool
async def calculate_metrics(claim: InsuranceClaim) -> list[str]:
    """Calculate claim metrics."""
    ctx = restate_context()

    # Run tools/steps in parallel with durable execution
    results_done = await restate.gather(
        ctx.run_typed("eligibility", check_eligibility, claim=claim),
        ctx.run_typed("cost", compare_to_standard_rates, claim=claim),
        ctx.run_typed("fraud", check_fraud, claim=claim),
    )
    return [await result for result in results_done]
Restate makes sure that all parallel tasks are retried and recovered until they succeed.
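For readers less familiar with the fan-out and fan-in shape of this tool, here is the same pattern in plain asyncio. It is an analogue under stated assumptions, not the Restate primitive: analysis and calculate_metrics_sketch are hypothetical, nothing is journaled or retried, and asyncio.gather returns the results directly rather than handles that are awaited afterwards.

```python
import asyncio


async def analysis(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for an API call
    return f"{name}: ok"


async def calculate_metrics_sketch() -> list[str]:
    # All three analyses run concurrently. Results come back in the order
    # the awaitables were passed in, not in the order they finished.
    return await asyncio.gather(
        analysis("eligibility", 0.03),
        analysis("cost", 0.01),
        analysis("fraud", 0.02),
    )


results = asyncio.run(calculate_metrics_sketch())
```

A stable result order, independent of completion order, is one ingredient of keeping the step deterministic from the agent's point of view.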
Start a request for a claim that needs to be analyzed by multiple tools in parallel.
curl localhost:8080/ParallelToolClaimAgent/run --json '{
    "date":"2024-10-01",
    "category":"orthopedic",
    "reason":"hospital bill for a broken leg",
    "amount":3000,
    "placeOfService":"General Hospital"
}'
In the UI, you can see that the agent ran the tool steps in parallel. Their traces all start at the same time. Once all tools return, the agent continues and makes a decision.

Parallel Agents

You can use the same durable execution primitives to run multiple agents in parallel. For example, you can race agents against each other and use the first result that returns while cancelling the others, or let a main orchestrator agent combine the results of multiple specialized agents that run in parallel:
parallel_agents.py
@agent_service.handler()
async def run(restate_context: restate.Context, claim: InsuranceClaim) -> str:
    # Start multiple agents in parallel with auto retries and recovery
    eligibility = restate_context.service_call(run_eligibility_agent, claim)
    cost = restate_context.service_call(run_rate_comparison_agent, claim)
    fraud = restate_context.service_call(run_fraud_agent, claim)

    # Wait for all responses
    await restate.gather(eligibility, cost, fraud)

    # Run decision agent on outputs
    result = await DurableRunner.run(
        Agent(
            name="ClaimApprovalAgent", instructions="You are a claim decision engine."
        ),
        input=f"Decide about claim: {claim.model_dump_json()}. "
        "Base your decision on the following analyses:"
        f"Eligibility: {await eligibility} Cost {await cost} Fraud: {await fraud}",
    )
    return result.final_output
Start a request for a claim that needs to be analyzed by multiple agents in parallel.
curl localhost:8080/ParallelAgentClaimApproval/run --json '{
    "date":"2024-10-01",
    "category":"orthopedic",
    "reason":"hospital bill for a broken leg",
    "amount":3000,
    "placeOfService":"General Hospital"
}'
In the UI, you can see that the handler called the sub-agents in parallel. Once all sub-agents return, the main agent makes a decision.

Error Handling

LLM calls are costly, so you can configure retry behavior in both Restate and your AI SDK to avoid infinite loops and high costs. Restate distinguishes between two types of errors:
  • Transient errors: Temporary issues like network failures or rate limits. Restate automatically retries these until they succeed or the retry policy is exhausted.
  • Terminal errors: Permanent failures like invalid input or business rule violations. Restate does not retry these. The invocation fails permanently. You can catch these errors and handle them gracefully.
You can throw a terminal error via:
from restate import TerminalError

raise TerminalError("This tool is not allowed to run for this input.")
You can catch and handle terminal errors in your agent logic if needed. Many AI SDKs also have their own retry behavior for LLM calls and tool executions, so let’s look at how these interact.
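The difference between the two error types can be sketched as a small retry loop. This is a simplified model: ToyTerminalError and run_with_retries are hypothetical names rather than Restate APIs, and the loop omits the waiting between attempts that a real retry policy would apply.

```python
class ToyTerminalError(Exception):
    """Stand-in for a non-retryable error (hypothetical)."""


def run_with_retries(action, max_attempts: int):
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action(attempt)
        except ToyTerminalError:
            raise  # permanent failure: do not retry
        except Exception as error:  # everything else is treated as transient
            last_error = error
    # Retry budget exhausted: surface a permanent failure to the caller.
    raise ToyTerminalError(f"retries exhausted: {last_error}")


attempts_seen = []


def flaky(attempt: int) -> str:
    attempts_seen.append(attempt)
    if attempt < 3:
        raise ConnectionError("rate limited")
    return "ok"


def invalid(attempt: int) -> str:
    attempts_seen.append(attempt)
    raise ToyTerminalError("invalid input")


flaky_result = run_with_retries(flaky, max_attempts=5)
flaky_attempts = list(attempts_seen)

attempts_seen.clear()
try:
    run_with_retries(invalid, max_attempts=5)
    terminal_raised = False
except ToyTerminalError:
    terminal_raised = True
terminal_attempts = list(attempts_seen)
```

The transient failure is retried until the third attempt succeeds, while the terminal error stops after a single attempt.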

Retries of LLM calls

Restate’s DurableRunner lets you specify the retry behavior for LLM calls as follows:
error_handling.py
try:
    result = await DurableRunner.run(
        agent,
        req.message,
        llm_retry_opts=LlmRetryOpts(
            max_attempts=3, initial_retry_interval=timedelta(seconds=2)
        ),
    )
except restate.TerminalError as e:
    # Handle terminal errors gracefully
    return f"The agent couldn't complete the request: {e.message}"
By default, the runner retries ten times with an initial interval of one second. Once Restate’s retries are exhausted, the invocation fails with a TerminalError and won’t be retried further.

Tool execution errors

By default, the Restate OpenAI integration will raise any terminal errors in tool executions and will let you handle them in your handler, similar to what we did above for model calls. If you use Restate Context actions in your tool execution, Restate retries any transient errors in these actions until they succeed. So for all operations that might suffer from transient errors (like network calls, database interactions, etc.), you should use Context actions to make them resilient.
You can set custom retry policies for .run actions in your tool executions.
The OpenAI Agent SDK also allows setting failure_error_function to None, which will rethrow any error in the agent execution as-is. This includes, for example, invalid LLM responses (e.g. a tool call with invalid arguments or to a tool that doesn’t exist). The error will then lead to Restate retries. Restate will recover the invocation by replaying the journal entries. This can lead to infinite retries if the error is not transient. Therefore, be careful when using this option and handle errors appropriately in your agent logic. You also might want to set a retry policy at the service or handler level to avoid infinite retries.

Advanced patterns

If you need more control over the agent loop, you can implement it manually using Restate’s durable primitives. This allows you to:
  • Parallelize tool calls with restate.select and restate.gather
  • Implement custom stopping conditions
  • Implement custom logic between steps (e.g. human approval)
  • Interact with external systems between steps
  • Handle errors in a custom way
Learn more from the composable patterns guides.
Sometimes you need to undo previous agent actions when a later step fails. Restate makes it easy to implement compensation patterns (Sagas) for AI agents. Just track the rollback actions as you go, let the agent raise terminal tool errors, and execute the rollback actions in reverse order.
Here is an example of a travel booking agent that first reserves a hotel, flight, and car, and then either confirms them or rolls back if any step fails with a terminal error (e.g. car type not available). We let tools add rollback actions to the agent context for each booking step they do. The run handler catches any terminal errors and runs all the rollback actions.
advanced/rollback_agent.py
class BookingContext(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    booking_id: str
    on_rollback: list[Callable] = Field(default=[])


# Functions raise terminal errors instead of feeding them back to the agent
@durable_function_tool
async def book_hotel(
    wrapper: RunContextWrapper[BookingContext], booking: HotelBooking
) -> BookingResult:
    """Book a hotel"""
    ctx = restate_context()
    booking_ctx, booking_id = wrapper.context, wrapper.context.booking_id
    # Register a rollback action for each step, in case of failures further on in the workflow
    booking_ctx.on_rollback.append(
        lambda: ctx.run_typed("Cancel hotel", cancel_hotel, id=booking_id)
    )

    # Execute the workflow step
    return await ctx.run_typed(
        "Book hotel", reserve_hotel, id=booking_id, booking=booking
    )


@durable_function_tool
async def book_flight(
    wrapper: RunContextWrapper[BookingContext], booking: FlightBooking
) -> BookingResult:
    """Book a flight"""
    ctx = restate_context()
    booking_ctx, booking_id = wrapper.context, wrapper.context.booking_id
    booking_ctx.on_rollback.append(
        lambda: ctx.run_typed("Cancel flight", cancel_flight, id=booking_id)
    )
    return await ctx.run_typed(
        "Book flight", reserve_flight, id=booking_id, booking=booking
    )


# ... Do the same for cars ...


agent = Agent[BookingContext](
    name="BookingWithRollbackAgent",
    instructions="Book a complete travel package with the requirements in the prompt."
    "Use tools to first book the hotel, then the flight.",
    tools=[book_hotel, book_flight],
)


agent_service = restate.Service("BookingWithRollbackAgent")


@agent_service.handler()
async def book(_ctx: restate.Context, req: BookingPrompt) -> str:
    booking_ctx = BookingContext(booking_id=req.booking_id)
    try:
        result = await DurableRunner.run(agent, req.message, context=booking_ctx)
    except restate.TerminalError as e:
        # Run all the rollback actions on terminal errors
        for compensation in reversed(booking_ctx.on_rollback):
            await compensation()
        raise e

    return result.final_output
Follow the instructions in the readme to try it out, and see how the agent rolls back previous bookings if a later step fails.
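Stripped of the agent and the SDK, the compensation logic reduces to a list of undo actions that is replayed in reverse. The following sketch uses hypothetical names throughout (ToyTerminalError, the book_* functions, book_trip) and has no durability, so it only illustrates the ordering of bookings and rollbacks.

```python
from typing import Callable

log: list[str] = []


class ToyTerminalError(Exception):
    """Stand-in for a non-retryable booking failure (hypothetical)."""


def book_hotel(on_rollback: list[Callable[[], None]]) -> None:
    # Register the compensation before doing the step, as in the tools above.
    on_rollback.append(lambda: log.append("cancel hotel"))
    log.append("book hotel")


def book_flight(on_rollback: list[Callable[[], None]]) -> None:
    on_rollback.append(lambda: log.append("cancel flight"))
    log.append("book flight")


def book_car(on_rollback: list[Callable[[], None]]) -> None:
    on_rollback.append(lambda: log.append("cancel car"))
    raise ToyTerminalError("car type not available")


def book_trip() -> str:
    on_rollback: list[Callable[[], None]] = []
    try:
        for step in (book_hotel, book_flight, book_car):
            step(on_rollback)
    except ToyTerminalError:
        # Undo in reverse order: the most recent step is compensated first.
        for compensation in reversed(on_rollback):
            compensation()
        return "rolled back"
    return "booked"


outcome = book_trip()
```

Because each compensation is registered before its step runs, a step that fails halfway is also covered, which means the cancel actions need to tolerate a booking that was never completed.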
Restate supports implementing scheduling and timer logic in your agents. This allows you to build agents that run periodically, wait for specific times, or implement complex scheduling logic. Agents can either be long-running or reschedule themselves for later execution. Have a look at the scheduling docs to learn more.
Have a look at the pub-sub example.
Have a look at the interruptible coding agent.

Summary

Durable Execution, paired with your existing SDKs, gives your agents a powerful upgrade:
  • Durable Execution: Automatic recovery from failures without losing progress
  • Persistent memory and context: Persistent conversation history and context
  • Observability by default across your agents and workflows
  • Human-in-the-Loop: Seamless approval workflows with timeouts
  • Multi-Agent Coordination: Reliable orchestration of specialized agents
  • Suspensions to save costs on function-as-a-service platforms when agents need to wait
  • Advanced Patterns: Real-time progress updates, interruptions, and long-running workflows
Consult the Restate AI Agents documentation to learn more about building agents with Restate.