Execute multiple AI approaches or strategies simultaneously and return the result of whichever one succeeds first. This pattern is ideal when you have several ways to solve the same problem and want to minimize latency by racing them against each other. It is useful for:
  • Querying multiple AI models (e.g., GPT-4, Claude, Gemini) and returning the fastest response
  • Running different agents, prompts or strategies in parallel and using the first successful outcome
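
To make "first successful outcome" concrete, here is a minimal sketch in plain TypeScript, without Restate and therefore without durability. It assumes an ES2021+ runtime for the standard `Promise.any`, and `fakeModel` and `raceStrategies` are hypothetical names that stand in for real model calls. A strategy that fails early does not win the race; the first one to succeed does.

```typescript
// Hypothetical stand-in for a model call; a real version would call an LLM SDK.
function fakeModel(name: string, delayMs: number, fail = false): Promise<string> {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (fail) reject(new Error(`${name} failed`));
      else resolve(`${name} answered`);
    }, delayMs);
  });
}

// Resolves with the first strategy that succeeds. Rejections are ignored
// unless every strategy fails, in which case Promise.any rejects with an
// AggregateError.
function raceStrategies(strategies: Promise<string>[]): Promise<string> {
  return Promise.any(strategies);
}

// The quickest strategy fails, so the next one to finish wins.
raceStrategies([
  fakeModel("fast-but-broken", 5, true),
  fakeModel("medium", 20),
  fakeModel("slow", 50),
]).then((winner) => console.log(winner)); // prints "medium answered"
```

This sketch loses all progress if the process crashes mid-race, which is the gap the durable coordination described in the next section is meant to close.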

How does Restate help?

The benefits of using Restate for competitive racing patterns are:
  • Durable coordination: Restate turns Promises/Futures into durable, distributed constructs that persist across failures and process restarts. Race multiple approaches and return the first successful result.
  • Cancel slow tasks: Once a winner is found, the slower approaches can be cancelled, preventing resource waste
  • Serverless scaling: Deploy racing strategies on serverless infrastructure for automatic scaling while the main process remains suspended
  • Flexible tooling: Works with any LLM SDK (Vercel AI, LangChain, LiteLLM, etc.) and any programming language supported by Restate (TypeScript, Python, Go, etc.).

Example

When you need a quick response and have access to multiple AI models, race them against each other to get the fastest result:
racing-agents.ts
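import { Context, RestatePromise } from "@restatedev/restate-sdk";

// `racingAgent` is the service definition that exposes the `thinkLonger`
// and `respondQuickly` handlers (not shown in this excerpt).
async function run(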
  ctx: Context,
  { message }: { message: string },
): Promise<string> {
  // Start both service calls concurrently
  const slowCall = ctx.serviceClient(racingAgent).thinkLonger({ message });
  const slowResponse = slowCall.map((res) => ({ tag: "slow", res }));

  const fastCall = ctx.serviceClient(racingAgent).respondQuickly({ message });
  const fastResponse = fastCall.map((res) => ({ tag: "fast", res }));

  const pending = [slowResponse, fastResponse];

  // Wait for the first call to complete successfully
  const { tag, res } = await RestatePromise.any(pending);

  if (tag === "fast") {
    console.log("Quick response won the race!");
    const slowInvocationId = await slowCall.invocationId;
    ctx.cancel(slowInvocationId);
  } else {
    console.log("Deep analysis won the race!");
    const fastInvocationId = await fastCall.invocationId;
    ctx.cancel(fastInvocationId);
  }

  return res ?? "LLM gave no response";
}
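
The tag-and-cancel logic above generalizes to more than two competitors. The following is a minimal sketch of that idea in plain TypeScript, without Restate, so it is not durable. It assumes an ES2021+ runtime, uses the standard `AbortController` as a stand-in for `ctx.cancel`, and all names (`Strategy`, `raceAndCancel`, `delayed`) are hypothetical.

```typescript
type Strategy = {
  tag: string;
  // Hypothetical contract: the strategy stops its work when the signal aborts.
  run: (signal: AbortSignal) => Promise<string>;
};

// Runs all strategies concurrently, returns the first successful result
// together with its tag, and aborts every strategy that did not win.
async function raceAndCancel(
  strategies: Strategy[],
): Promise<{ tag: string; res: string }> {
  const controllers = strategies.map(() => new AbortController());
  const pending = strategies.map((s, index) =>
    s.run(controllers[index].signal).then((res) => ({ tag: s.tag, res, index })),
  );

  const winner = await Promise.any(pending);

  // Cancel the losers so they stop consuming resources.
  controllers.forEach((controller, index) => {
    if (index !== winner.index) controller.abort();
  });

  return { tag: winner.tag, res: winner.res };
}

// Hypothetical strategy that resolves after delayMs unless it is aborted first.
function delayed(tag: string, delayMs: number): Strategy {
  return {
    tag,
    run: (signal) =>
      new Promise<string>((resolve, reject) => {
        const timer = setTimeout(() => resolve(`${tag} result`), delayMs);
        signal.addEventListener("abort", () => {
          clearTimeout(timer);
          reject(new Error(`${tag} cancelled`));
        });
      }),
  };
}
```

Calling `raceAndCancel([delayed("slow", 50), delayed("fast", 5)])` resolves with `{ tag: "fast", res: "fast result" }` and aborts the slow strategy. Carrying the index alongside the tag keeps the cancellation step independent of how many strategies take part.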
Install Restate and launch it:
restate-server
Get the example:
restate example typescript-restate-tour-of-agents && cd typescript-restate-tour-of-agents
npm install
Export your API key and start the service:
export OPENAI_API_KEY=sk-...
npx tsx ./src/racing-agents.ts
Register the services with Restate:
restate deployments register http://localhost:9080 --force --yes # dev only: overrides previous registrations
Send a request:
curl localhost:8080/restate/call/RacingAgent/run \
--json '{
    "message": "What is the best approach to learn machine learning?"
}'
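
The same request can also be sent from code. The sketch below is one way to do that with the standard `fetch` API (Node.js 18+ or a browser). The URL is copied from the curl command above; `buildRunRequest` and `askRacingAgent` are hypothetical helper names, and the sketch assumes the handler's string result comes back as a JSON-encoded body.

```typescript
// URL copied from the curl command; adjust host and port for your setup.
const RACING_AGENT_URL = "http://localhost:8080/restate/call/RacingAgent/run";

// Builds the same POST request that `curl --json` sends.
function buildRunRequest(message: string) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json",
    },
    body: JSON.stringify({ message }),
  };
}

// Sends the request and returns the agent's answer.
// Assumption: the response body is a JSON-encoded string.
async function askRacingAgent(message: string): Promise<string> {
  const response = await fetch(RACING_AGENT_URL, buildRunRequest(message));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return (await response.json()) as string;
}
```

With the service running and registered, `askRacingAgent("What is the best approach to learn machine learning?")` would return the answer from whichever handler won the race.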
This pattern is implementable with any of our SDKs and any AI SDK. If you need help with a specific SDK, please reach out to us via Discord or Slack.