When an LLM decides to call multiple tools, executing them in parallel instead of sequentially can significantly reduce latency.
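As a rough illustration of the latency difference, the sketch below compares the two strategies with plain JavaScript promises. Everything in it (`fakeTool`, `Tracker`, the city names, the 20 ms delay) is a hypothetical stand-in, and plain `Promise.all` is not durable; inside a Restate handler you would use the Restate primitives described in the next section.

```typescript
// Hypothetical stand-in for a tool call: resolves after `ms` milliseconds.
// The tracker only exists to make the degree of concurrency observable.
type Tracker = { inFlight: number; maxInFlight: number };

function fakeTool(tracker: Tracker, value: string, ms: number): Promise<string> {
  tracker.inFlight += 1;
  tracker.maxInFlight = Math.max(tracker.maxInFlight, tracker.inFlight);
  return new Promise((resolve) =>
    setTimeout(() => {
      tracker.inFlight -= 1;
      resolve(value);
    }, ms),
  );
}

// Sequential: each call starts only after the previous one has finished,
// so the total time is roughly the sum of all call durations.
async function runSequentially(tracker: Tracker): Promise<string[]> {
  const results: string[] = [];
  for (const city of ["Paris", "Tokyo", "Lima"]) {
    results.push(await fakeTool(tracker, city, 20));
  }
  return results;
}

// Parallel: all calls start immediately, so the total time is roughly
// the duration of the slowest call.
async function runInParallel(tracker: Tracker): Promise<string[]> {
  const cities = ["Paris", "Tokyo", "Lima"];
  return Promise.all(cities.map((city) => fakeTool(tracker, city, 20)));
}
```

Both functions return the same results in the same order; only the number of calls in flight at once differs.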

Use Restate’s parallelization primitives

Agent SDKs natively support parallel tool calls, but this is disabled when integrating with Restate.
Parallel tool calls that use the Restate Context can execute in a different order during replays, breaking Restate’s deterministic execution guarantees.
Instead, use Restate’s durable execution primitives (RestatePromise.all() in TypeScript, restate.gather() in Python) to parallelize work. There are two patterns for this:
  1. With an Agent SDK, use an orchestrator tool: create a single tool that internally fans out multiple steps in parallel using Restate. The Agent SDK sees one tool call, but that tool runs work concurrently.
  2. With only Restate, write a custom agent loop: manage the agentic loop yourself with the Restate SDK directly. You control the tool execution step and can run all tool calls in parallel.
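To see why a changed execution order matters during replay, consider the toy model below. It is a deliberately simplified sketch, not Restate's implementation or API: `ToyJournal`, `step`, `startReplay`, and `toyHandler` are all hypothetical names. The journal records steps in the order they ran and, on replay, expects the same step at the same position.

```typescript
// Toy model only: NOT Restate's implementation or API.
class ToyJournal {
  private entries: { name: string; result: string }[] = [];
  private cursor = 0;

  // First execution: run `fn` and record its result.
  // Replay: return the recorded result, but only if the step name matches
  // what was journaled at this position.
  step(name: string, fn: () => string): string {
    if (this.cursor < this.entries.length) {
      const recorded = this.entries[this.cursor];
      if (recorded.name !== name) {
        throw new Error(
          `journal mismatch at entry ${this.cursor}: expected "${recorded.name}", got "${name}"`,
        );
      }
      this.cursor += 1;
      return recorded.result;
    }
    const result = fn();
    this.entries.push({ name, result });
    this.cursor += 1;
    return result;
  }

  // Rewind the cursor to simulate a replay after a crash.
  startReplay(): void {
    this.cursor = 0;
  }
}

// A handler whose step order depends on `order`. This stands in for tool
// calls that a scheduler happens to start in a different order on each run.
function toyHandler(journal: ToyJournal, order: string[]): string[] {
  return order.map((toolName) => journal.step(toolName, () => `${toolName}-result`));
}
```

Replaying `toyHandler` with the same order returns the journaled results without re-running anything, while replaying with a swapped order throws a mismatch error. Restate's own parallelization primitives avoid this kind of divergence.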

With Agent SDK: Use orchestrator tool

⚠️ To ensure deterministic replay when using the Vercel AI SDK with Restate, set providerOptions: { openai: { parallelToolCalls: false } } for all your AI SDK agents.
To still run work in parallel with the Vercel AI SDK, create a tool that runs multiple analyses in parallel. The LLM calls one tool, and that tool fans out work internally using durable execution primitives.
Restate makes sure that all parallel tasks are retried and recovered until they succeed. If one step fails, only that step is retried while the successful results are preserved.
parallel-tools-agent.ts
import * as restate from "@restatedev/restate-sdk";
import { RestatePromise } from "@restatedev/restate-sdk";
import { durableCalls } from "@restatedev/vercel-ai-middleware";
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs, tool, wrapLanguageModel } from "ai";
// ClaimInput, InsuranceClaim, InsuranceClaimSchema, checkEligibility,
// compareToStandardRates, and checkFraud come from the example project's helpers.

const run = async (ctx: restate.Context, claim: ClaimInput) => {
  const model = wrapLanguageModel({
    model: openai("gpt-5.4"),
    middleware: durableCalls(ctx, { maxRetryAttempts: 3 }),
  });

  const { text } = await generateText({
    model,
    prompt: `Analyze the claim ${JSON.stringify(claim)}.
        Use your tools to calculate key metrics and decide whether to approve.`,
    tools: {
      calculateMetrics: tool({
        description: "Calculate claim metrics.",
        inputSchema: InsuranceClaimSchema,
        execute: async (claim: InsuranceClaim) => {
          // Execute each calculation as a parallel durable step
          return RestatePromise.all([
            ctx.run("eligibility", () => checkEligibility(claim)),
            ctx.run("cost", () => compareToStandardRates(claim)),
            ctx.run("fraud", () => checkFraud(claim)),
          ]);
        },
      }),
    },
    stopWhen: [stepCountIs(10)],
    providerOptions: { openai: { parallelToolCalls: false } },
  });
  return text;
};
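The recovery behavior described above (a failed step is retried while successful results are preserved) can be pictured with the toy model below. It is only a sketch under stated assumptions: a `Map` stands in for Restate's durable journal, and `durableStep` and `runAllWithRetry` are hypothetical helpers, not Restate APIs. In a real handler, `ctx.run` and `RestatePromise.all` provide this behavior and the runtime drives the retries.

```typescript
// Toy model only: a Map stands in for the durable journal.
type StepJournal = Map<string, unknown>;

// Runs `fn` unless a result for `name` has already been journaled.
async function durableStep<T>(
  journal: StepJournal,
  name: string,
  fn: () => Promise<T>,
): Promise<T> {
  if (journal.has(name)) return journal.get(name) as T;
  const result = await fn();
  journal.set(name, result);
  return result;
}

// Fans out all steps in parallel. If any step fails, the whole fan-out is
// attempted again, but steps that already succeeded are served from the
// journal instead of being re-executed.
async function runAllWithRetry(
  journal: StepJournal,
  steps: Record<string, () => Promise<string>>,
  maxAttempts: number,
): Promise<string[]> {
  for (let attemptNo = 1; attemptNo <= maxAttempts; attemptNo++) {
    // allSettled lets successful steps finish and be journaled
    // even when a sibling step rejects.
    const settled = await Promise.allSettled(
      Object.entries(steps).map(([name, fn]) => durableStep(journal, name, fn)),
    );
    const values: string[] = [];
    for (const outcome of settled) {
      if (outcome.status === "fulfilled") values.push(outcome.value);
    }
    if (values.length === settled.length) return values;
  }
  throw new Error(`steps still failing after ${maxAttempts} attempts`);
}
```

With three steps where only the third fails once, the first two run exactly one time each and the third runs twice.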
Install Restate and launch it:
npm install --global @restatedev/restate-server@latest @restatedev/restate@latest
restate-server
Get the example:
restate example typescript-vercel-ai-tour-of-agents && cd typescript-vercel-ai-tour-of-agents
npm install
Export your OpenAI API key and run the agent:
export OPENAI_API_KEY=sk-...
npx tsx ./src/parallel-tools-agent.ts
Register the agents with Restate:
restate deployments register http://localhost:9080 --force --yes # dev only: overrides previous registrations
Start a request:
curl localhost:8080/restate/call/ParallelToolClaimAgent/run --json '{
    "date":"2024-10-01",
    "category":"orthopedic",
    "reason":"hospital bill for a broken leg",
    "amount":3000,
    "placeOfService":"General Hospital"
}'
In the UI, you can see the tool steps running in parallel:
[Figure: Parallel tool execution trace]

Only Restate: custom agent loop with parallel tool calls

When you manage the agentic loop yourself with the Restate SDK, you have full control over tool execution. After the LLM returns multiple tool calls, you start all of them concurrently and wait for all to complete before feeding results back to the LLM.
parallel-tools-agent.ts
import { Context, RestatePromise } from "@restatedev/restate-sdk";
import { ModelMessage, tool } from "ai";
import { z } from "zod";
// llmCall, toolResult, and fetchWeather come from the example project's helpers.

// Define your tools as your AI SDK requires (here Vercel AI SDK)
const tools = {
  get_weather: tool({
    description: "Get the current weather for a location",
    inputSchema: z.object({ city: z.string() }),
  }),
};

async function run(ctx: Context, { message }: { message: string }) {
  const history: ModelMessage[] = [{ role: "user", content: message }];

  while (true) {
    // Use your preferred LLM SDK here
    let { text, toolCalls, messages } = await ctx.run(
      "LLM call",
      async () => llmCall(history, tools),
      { maxRetryAttempts: 3 },
    );
    history.push(...messages);

    if (!toolCalls || toolCalls.length === 0) {
      return text;
    }

    // Run all tool calls in parallel
    let toolPromises = [];
    for (let { toolCallId, toolName, input } of toolCalls) {
      const { city } = input as { city: string };
      const promise = ctx.run(`Get weather ${city}`, () => fetchWeather(city));
      toolPromises.push({ toolCallId, toolName, promise });
    }

    // Wait for all tools to complete in parallel
    await RestatePromise.all(toolPromises.map(({ promise }) => promise));

    // Append all results to messages
    for (const { toolCallId, toolName, promise } of toolPromises) {
      history.push(toolResult(toolCallId, toolName, await promise));
    }
  }
}
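The loop above handles a single tool. With several tools, the same start-all-then-wait pattern can be combined with a lookup by tool name, as in the sketch below. The types and the `executeToolCalls` helper are hypothetical and use plain promises to stay self-contained; in a Restate handler, each handler invocation would be wrapped in `ctx.run(...)` and awaited with `RestatePromise.all`, as in the example above.

```typescript
// Hypothetical shapes, loosely mirroring the fields used in the loop above.
type ToolCall = { toolCallId: string; toolName: string; input: unknown };
type ToolOutcome = { toolCallId: string; toolName: string; output: unknown };
type ToolHandler = (input: unknown) => Promise<unknown>;

async function executeToolCalls(
  registry: Record<string, ToolHandler>,
  toolCalls: ToolCall[],
): Promise<ToolOutcome[]> {
  // Start every call before awaiting any of them.
  const pending = toolCalls.map(({ toolCallId, toolName, input }) => {
    const handler = registry[toolName];
    const promise = handler
      ? handler(input)
      : Promise.reject(new Error(`unknown tool: ${toolName}`));
    return { toolCallId, toolName, promise };
  });

  // Wait for all calls to complete; outputs keep the order of `pending`.
  const outputs = await Promise.all(pending.map((p) => p.promise));

  // Pair each output with the id of the call that produced it, so the
  // results can be fed back to the LLM in the original order.
  return pending.map(({ toolCallId, toolName }, i) => ({
    toolCallId,
    toolName,
    output: outputs[i],
  }));
}
```

Because results are matched by position rather than by completion time, a slow tool does not change the order in which results are appended to the message history.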
Install Restate and launch it:
npm install --global @restatedev/restate-server@latest @restatedev/restate@latest
restate-server
Get the example:
restate example typescript-restate-tour-of-agents && cd typescript-restate-tour-of-agents
npm install
Export your OpenAI API key and run the agent:
export OPENAI_API_KEY=sk-...
npx tsx ./src/parallel-tools-agent.ts
Register the services with Restate:
restate deployments register http://localhost:9080 --force --yes # dev only: overrides previous registrations
Send a request:
curl localhost:8080/restate/call/ParallelToolAgent/run \
  --json '{"message": "What is the weather in San Francisco and New York?"}'
The Restate UI shows the tool calls executing concurrently:
[Figure: Parallel tool execution - UI]