How does Restate help?
The benefits of using Restate for parallel agent and tool execution are:
- Guaranteed execution: Restate lets you schedule tasks asynchronously and guarantees that all tasks will run, with retries and recovery on failures.
- Durable coordination: Restate turns Promises/Futures into durable, distributed constructs that are persisted in Restate and can be recovered and awaited on another process.
- Serverless scaling: You can deploy the subtask executors on serverless infrastructure, like AWS Lambda, to let them scale automatically. The main task, which is idle while waiting on the subtasks, gets suspended until it can make progress.
- Independent failure handling: Failed operations are automatically retried without affecting successful ones.
- Flexibility: Works with any LLM SDK (Vercel AI, LangChain, LiteLLM, etc.) and any programming language supported by Restate (TypeScript, Python, Go, etc.).
Parallelizing tool calls
When an LLM decides to call multiple tools, you can execute all tool calls in parallel instead of sequentially, which significantly reduces latency when the tools are independent. Wrap tool executions in ctx.run() to ensure durability, and use RestatePromise.all() (TypeScript) or restate.gather() (Python) to coordinate the parallel execution.
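Here is a minimal TypeScript sketch of the pattern, using the ctx.run() and RestatePromise.all() APIs named above. The planToolCalls and executeTool helpers are hypothetical placeholders for your LLM SDK and tool implementations:

```typescript
import * as restate from "@restatedev/restate-sdk";
import { RestatePromise } from "@restatedev/restate-sdk";

// Hypothetical shape of a tool call planned by the LLM.
type ToolCall = { name: string; arguments: string };

// Hypothetical helpers: plan tool calls with your LLM SDK, then dispatch them.
async function planToolCalls(prompt: string): Promise<ToolCall[]> {
  // ... call your LLM SDK and extract the tool calls it requests
  return [];
}
async function executeTool(call: ToolCall): Promise<string> {
  // ... dispatch to the matching tool implementation
  return `result of ${call.name}`;
}

const parallelToolAgent = restate.service({
  name: "ParallelToolAgent",
  handlers: {
    run: async (ctx: restate.Context, prompt: string) => {
      // The LLM call is journaled, so retries don't re-invoke the model.
      const toolCalls = await ctx.run("plan tools", () => planToolCalls(prompt));

      // Fan out: each tool call is its own durable ctx.run step, and
      // RestatePromise.all awaits them all in parallel.
      const results = await RestatePromise.all(
        toolCalls.map((call, i) =>
          ctx.run(`tool-${i}-${call.name}`, () => executeTool(call))
        )
      );
      return results;
    },
  },
});

restate.endpoint().bind(parallelToolAgent).listen(9080);
```

If one tool call fails, Restate retries only that step; the results of the other ctx.run steps are replayed from the journal.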

Run the example
1. Requirements
- An AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, or LiteLLM) to make LLM calls.
- An API key for your model provider.
2. Download the example
3. Start the Restate Server
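If the Restate Server is installed locally (for example, via Homebrew or npm), you can start it with:

```shell
restate-server
```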
4. Start the Service
Export the API key of your model provider as an environment variable and then start the agent. For example, for OpenAI:
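The exact start command depends on the downloaded example; the script name below is a hypothetical placeholder:

```shell
export OPENAI_API_KEY=<your-api-key>
npm run dev   # hypothetical script; use the run command from the example's README
```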
5. Register the services
You can register the service deployment either in the UI or with the CLI.
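With the CLI, assuming the service listens on the SDK's default port 9080:

```shell
restate deployments register http://localhost:9080
```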

6. Send a request
In the UI (http://localhost:9070), click on the run handler of the ParallelToolAgent service to open the playground and send a request that triggers multiple tool calls.
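Alternatively, assuming the run handler accepts a plain prompt string, you can send the request through the Restate ingress (default port 8080):

```shell
curl localhost:8080/ParallelToolAgent/run --json '"What is the weather in Berlin, Paris, and Rome right now?"'
```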
7. Check the Restate UI
In the UI, you can see how the multiple tool calls are executed concurrently, with all operations completing in parallel.

Parallelizing agents
Execute multiple independent agents simultaneously, such as analyzing different aspects of the same input or processing multiple requests concurrently. This pattern is useful when you need to perform multiple analysis tasks that don’t depend on each other, like sentiment analysis, key point extraction, and summarization.
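A minimal TypeScript sketch of this fan-out, assuming a hypothetical askLlm helper around your LLM SDK; each analysis is a durable ctx.run step, and the three are combined with RestatePromise.all:

```typescript
import * as restate from "@restatedev/restate-sdk";
import { RestatePromise } from "@restatedev/restate-sdk";

// Hypothetical helper around your LLM SDK of choice.
async function askLlm(instruction: string, text: string): Promise<string> {
  // ... call OpenAI, LangChain, etc. with the instruction and input text
  return "";
}

const parallelAgentsService = restate.service({
  name: "ParallelAgentsService",
  handlers: {
    analyze_text: async (ctx: restate.Context, text: string) => {
      // Each analysis runs as its own durable step; a failure in one
      // is retried without re-running the ones that already succeeded.
      const [sentiment, keyPoints, summary] = await RestatePromise.all([
        ctx.run("sentiment", () => askLlm("Analyze the sentiment.", text)),
        ctx.run("key points", () => askLlm("Extract the key points.", text)),
        ctx.run("summary", () => askLlm("Summarize the text.", text)),
      ]);
      return { sentiment, keyPoints, summary };
    },
  },
});

restate.endpoint().bind(parallelAgentsService).listen(9080);
```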
Run the example
1. Requirements
- An AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, or LiteLLM) to make LLM calls.
- An API key for your model provider.
2. Download the example
3. Start the Restate Server
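As in the previous example, a local installation can be started with:

```shell
restate-server
```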
4. Start the Service
Export the API key of your model provider as an environment variable and then start the agent. For example, for OpenAI:
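As before, the exact start command depends on the example (the script below is a hypothetical placeholder):

```shell
export OPENAI_API_KEY=<your-api-key>
npm run dev   # hypothetical script; use the run command from the example's README
```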
5. Register the services
You can register the service deployment either in the UI or with the CLI.
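With the CLI, again assuming the default SDK port:

```shell
restate deployments register http://localhost:9080
```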

6. Send a request
In the UI (http://localhost:9070), click on the analyze_text handler of the ParallelAgentsService service to open the playground and send a text for analysis.
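Alternatively, assuming analyze_text takes the input text as a JSON string, you can invoke it through the ingress:

```shell
curl localhost:8080/ParallelAgentsService/analyze_text --json '"Restate turns promises into durable, distributed constructs that survive failures..."'
```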
7. Check the Restate UI
In the UI, you can see how the multiple analysis tasks are executed in parallel, with each task having its own execution trace and retry policy.
