Parallelizing Tools and Agents

Execute multiple AI tools or agent tasks simultaneously to improve performance and efficiency. Restate ensures that all parallel operations are durably logged and automatically coordinated, enabling recovery from failures while maintaining consistency across concurrent executions.

How does Restate help?

The benefits of using Restate for parallel agent and tool execution are:

Guaranteed execution: Restate lets you schedule tasks asynchronously and guarantees that all tasks will run, with retries and recovery on failures.
Durable coordination: Restate turns Promises/Futures into durable, distributed constructs that are persisted in Restate and can be recovered and awaited on another process.
Serverless scaling: You can deploy the subtask executors on serverless infrastructure, like AWS Lambda, to let them scale automatically. The main task, which is idle while waiting on the subtasks, is suspended until it can make progress.
Independent failure handling: Failed operations are automatically retried without affecting successful ones.
Works with any LLM SDK (Vercel AI, LangChain, LiteLLM, etc.) and any programming language supported by Restate (TypeScript, Python, Go, etc.).
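
To make "independent failure handling" concrete, here is a minimal sketch that loosely emulates the behavior with plain promises. It is not how Restate is implemented: `withRetries` and `flakyTask` are hypothetical helpers introduced only for illustration, and unlike `ctx.run()` they do not persist anything. The sketch only shows that a task failing and retrying does not cause an already successful sibling task to run again.

```typescript
// Hypothetical helper: retries a task up to maxAttempts times.
// In Restate, ctx.run() plays a similar role and also journals the result durably.
async function withRetries<T>(
  name: string,
  task: () => Promise<T>,
  maxAttempts: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw new Error(`${name} failed after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Hypothetical task that fails a fixed number of times before succeeding,
// and counts how often it was executed.
function flakyTask(failures: number, value: string) {
  let calls = 0;
  const task = async () => {
    calls += 1;
    if (calls <= failures) throw new Error("transient failure");
    return value;
  };
  return { task, callCount: () => calls };
}

async function demo() {
  const stable = flakyTask(0, "sentiment: positive");
  const flaky = flakyTask(2, "summary: ok");

  // Both tasks start immediately; each one retries on its own.
  const results = await Promise.all([
    withRetries("stable", stable.task, 3),
    withRetries("flaky", flaky.task, 3),
  ]);

  return { results, stableCalls: stable.callCount(), flakyCalls: flaky.callCount() };
}
```

Calling `demo()` runs the stable task once and the flaky task three times, so the retries of one task never re-execute the other.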

Parallelizing tool calls

When an LLM decides to call multiple tools, you can execute all tool calls in parallel instead of sequentially. This significantly reduces latency when tools are independent. Wrap tool executions in ctx.run() to ensure durability, and use RestatePromise.all() (TypeScript) or restate.gather() (Python) to coordinate parallel execution.

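import { Context, RestatePromise } from "@restatedev/restate-sdk";
import { tool, type ModelMessage } from "ai";
import { z } from "zod";

// llmCall, toolResult, and fetchWeather are helper functions assumed to be
// defined elsewhere in the example project.

// Define your tools as your AI SDK requires (here Vercel AI SDK)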
const tools = {
  get_weather: tool({
    description: "Get the current weather for a location",
    inputSchema: z.object({ city: z.string() }),
  }),
};

async function run(ctx: Context, { message }: { message: string }) {
  const history: ModelMessage[] = [{ role: "user", content: message }];

  while (true) {
    // Use your preferred LLM SDK here
    let { text, toolCalls, messages } = await ctx.run(
      "LLM call",
      async () => llmCall(history, tools),
      { maxRetryAttempts: 3 },
    );
    history.push(...messages);

    if (!toolCalls || toolCalls.length === 0) {
      return text;
    }

    // Run all tool calls in parallel
    let toolPromises = [];
    for (let { toolCallId, toolName, input } of toolCalls) {
      const { city } = input as { city: string };
      const promise = ctx.run(`Get weather ${city}`, () => fetchWeather(city));
      toolPromises.push({ toolCallId, toolName, promise });
    }

    // Wait for all tools to complete in parallel
    await RestatePromise.all(toolPromises.map(({ promise }) => promise));

    // Append all results to the history so the next LLM call can see them
    for (const { toolCallId, toolName, promise } of toolPromises) {
      history.push(toolResult(toolCallId, toolName, await promise));
    }
  }
}
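
The loop above assumes that every tool call targets `get_weather`. When the model can choose between several tools, one option is to look up the implementation by name before starting the calls. The following is a minimal sketch of that idea using plain promises so that it runs on its own. The `get_time` tool, the `toolImpls` map, and `runToolCalls` are hypothetical names, and the tool bodies are stubs. Inside a Restate handler, each call would be wrapped in `ctx.run()` and joined with `RestatePromise.all()` instead of `Promise.all()`.

```typescript
// Shape of a tool call, using the same fields as the loop above.
type ToolCall = { toolCallId: string; toolName: string; input: unknown };

// Hypothetical tool implementations; real ones would call external APIs.
type ToolImpl = (input: any) => Promise<string>;
const toolImpls = new Map<string, ToolImpl>([
  ["get_weather", async ({ city }: { city: string }) => `Sunny in ${city}`],
  ["get_time", async ({ zone }: { zone: string }) => `12:00 in ${zone}`],
]);

// Start every tool call before awaiting any of them, then join the results.
async function runToolCalls(toolCalls: ToolCall[]) {
  const pending = toolCalls.map(({ toolCallId, toolName, input }) => {
    const impl = toolImpls.get(toolName);
    // Unknown tools yield a message for the model instead of failing the batch.
    const promise = impl
      ? impl(input)
      : Promise.resolve(`Unknown tool: ${toolName}`);
    return { toolCallId, toolName, promise };
  });

  // Promise.all preserves input order, so outputs[i] belongs to pending[i].
  const outputs = await Promise.all(pending.map((p) => p.promise));
  return pending.map(({ toolCallId, toolName }, i) => ({
    toolCallId,
    toolName,
    output: outputs[i],
  }));
}
```

Each returned entry keeps its `toolCallId`, so the results can be appended to the message history in the same way as in the loop above.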

When you run the example below, you can see how multiple tool calls are executed in parallel, significantly reducing the total execution time compared to sequential processing.

Run the example

Requirements

AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, or LiteLLM) to make LLM calls.
API key for your model provider.

Download the example

git clone https://github.com/restatedev/ai-examples.git &&
cd ai-examples/typescript-patterns &&
npm install

Start the Restate Server

restate-server

Start the Service

Export the API key of your model provider as an environment variable and then start the agent. For example, for OpenAI:

export OPENAI_API_KEY=your_openai_api_key
npm run dev

Send a request

In the UI (http://localhost:9070), click on the run handler of the ParallelToolAgent service to open the playground and send a request that triggers multiple tool calls.
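
As an alternative to the playground, the handler can also be invoked over HTTP. The sketch below assumes that the Restate ingress listens on its default port 8080 and that the handler accepts a JSON body of the form `{ message }`, as the `run` function above suggests. `buildInvocation` and `invoke` are hypothetical helper names.

```typescript
// Assumption: the Restate ingress listens on its default port 8080.
const INGRESS_URL = "http://localhost:8080";

// Build the HTTP request for invoking a handler: POST /<service>/<handler>.
function buildInvocation(service: string, handler: string, payload: unknown) {
  return {
    url: `${INGRESS_URL}/${service}/${handler}`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and return the handler's JSON response.
async function invoke(service: string, handler: string, payload: unknown) {
  const { url, init } = buildInvocation(service, handler, payload);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Invocation failed: ${response.status} ${await response.text()}`);
  }
  return response.json();
}
```

For example, `await invoke("ParallelToolAgent", "run", { message: "What is the weather in Paris, Berlin, and Tokyo?" })` would send a prompt that is likely to trigger several weather lookups.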

Check the Restate UI

In the UI, you can see how the tool calls are executed concurrently rather than one after another.

Parallelizing agents

Execute multiple independent agents simultaneously, such as analyzing different aspects of the same input or processing multiple requests concurrently. This pattern is useful when you need to perform multiple analysis tasks that don’t depend on each other, like sentiment analysis, key point extraction, and summarization.

async function analyze(ctx: Context, { message }: { message: string }) {
  // Create parallel tasks - each runs independently
  const tasks = [
    ctx.run(
      "Analyze sentiment",
      // Use your preferred LLM SDK here
      async () => llmCall(`Analyze sentiment: ${message}`),
      { maxRetryAttempts: 3 },
    ),
    ctx.run(
      "Extract key points",
      async () => llmCall(`Extract 3 key points as bullets: ${message}`),
      { maxRetryAttempts: 3 },
    ),
    ctx.run(
      "Summarize",
      async () => llmCall(`Summarize in one sentence: ${message}`),
      { maxRetryAttempts: 3 },
    ),
  ];

  // Wait for all tasks to complete and return the results
  const results = await RestatePromise.all(tasks);
  return results.map((res) => res.text);
}
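
The handler above returns the three results as an array. Because the combined promise resolves in the order in which the tasks were listed, not the order in which they finish, the results can also be destructured and returned under descriptive keys. The following sketch shows this with plain promises. `fakeLlmCall` is a hypothetical stand-in for a real LLM call, and the delays only simulate tasks finishing in a different order than they were started.

```typescript
// Hypothetical stand-in for an LLM call. It resolves after delayMs and returns
// the same { text } shape that the handler above reads from each result.
function fakeLlmCall(prompt: string, delayMs: number): Promise<{ text: string }> {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ text: `response to: ${prompt}` }), delayMs),
  );
}

async function analyzeKeyed(message: string) {
  // The first task is the slowest, yet it still ends up in the first slot:
  // results follow the order of the input array, not the completion order.
  const [sentiment, keyPoints, summary] = await Promise.all([
    fakeLlmCall(`Analyze sentiment: ${message}`, 30),
    fakeLlmCall(`Extract 3 key points as bullets: ${message}`, 10),
    fakeLlmCall(`Summarize in one sentence: ${message}`, 20),
  ]);

  return {
    sentiment: sentiment.text,
    keyPoints: keyPoints.text,
    summary: summary.text,
  };
}
```

Returning an object instead of an array makes the response easier to consume, since callers do not need to know the order of the tasks.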

In the Restate UI, you can see how multiple analysis tasks are executed concurrently, each with independent retry policies and failure handling.

Run the example

Requirements

AI SDK of your choice (e.g., OpenAI, LangChain, Pydantic AI, or LiteLLM) to make LLM calls.
API key for your model provider.

Download the example

git clone https://github.com/restatedev/ai-examples.git &&
cd ai-examples/typescript-patterns &&
npm install

Start the Restate Server

restate-server

Start the Service

Export the API key of your model provider as an environment variable and then start the agent. For example, for OpenAI:

export OPENAI_API_KEY=your_openai_api_key
npm run dev

Send a request

In the UI (http://localhost:9070), click on the analyze_text handler of the ParallelAgentsService service to open the playground and send a text for analysis.

Check the Restate UI

In the UI, you can see how multiple analysis tasks are executed in parallel, with each task having its own execution trace and retry policy.
