Flow control is an opt-in feature and is disabled by default.
Its configuration and APIs may change in future releases.
Why flow control
Flow control gives you a lever over concurrent work, which helps with:
- Cost control: Cap how much expensive work runs at once. This is especially valuable for AI agents, where each concurrent invocation can translate directly into model or API spend. A concurrency limit puts a ceiling on that cost.
- Endpoint protection: Keep a burst of invocations from overwhelming a downstream service, database, or third-party API by bounding how many hit it concurrently.
- Fairness: Invocations flow through a scheduler that decides who goes next, so Restate ensures fairness between invocations running on the same partition.
What Restate supports today
Restate’s flow control primitives are built on a scheduler that decides which invocation runs next. The first capability built on this scheduler is concurrency limits: the maximum number of invocations that may run concurrently for a given scope. More flow-control capabilities will follow in later releases, all expressed through the same scope-based model. Planned follow-ups include throttling and rate limits, invocation priorities, and finite queue (backlog) limits.
Scopes
A scope is a namespace for concurrency control. Every invocation can carry a scope, and concurrency limits are applied per scope: all invocations sharing the same scope draw from the same concurrency budget. You choose what a scope represents. For example, you might scope by:
- A tenant or customer, to give each one a fair share of capacity.
- A downstream dependency, to bound how many invocations hit it at once.
- A class of work, such as checkout or ai-agent, to cap how much of it runs concurrently.
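To make the idea of a per-scope budget concrete, here is a minimal Python sketch. It is a toy model, not Restate's implementation: the class `ScopeBudgets`, its methods, and the scope names `tenant-a` and `tenant-b` are all hypothetical, and it assumes the same limit for every scope purely to keep the example short.

```python
from collections import defaultdict


class ScopeBudgets:
    """Toy model: one independent concurrency budget per scope."""

    def __init__(self, limit_per_scope: int) -> None:
        self.limit = limit_per_scope
        self.running = defaultdict(int)  # scope -> invocations currently running

    def try_start(self, scope: str) -> bool:
        """Take a slot from the scope's budget; False if the budget is used up."""
        if self.running[scope] >= self.limit:
            return False
        self.running[scope] += 1
        return True

    def finish(self, scope: str) -> None:
        """Return a slot to the scope's budget."""
        if self.running[scope] > 0:
            self.running[scope] -= 1


budgets = ScopeBudgets(limit_per_scope=2)
results = [
    budgets.try_start("tenant-a"),  # takes slot 1 of tenant-a's budget
    budgets.try_start("tenant-a"),  # takes slot 2 of tenant-a's budget
    budgets.try_start("tenant-a"),  # tenant-a's budget is exhausted
    budgets.try_start("tenant-b"),  # tenant-b has its own, untouched budget
]
```

The point of the sketch is that exhausting one scope's budget does not affect any other scope.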
Enabling flow control
Flow control is disabled by default. Enable it in your server configuration file, restate.toml.
Configuring concurrency limits
Concurrency limits are defined in a cluster-wide rule book. A rule pairs a pattern, which selects the scopes it applies to, with a set of limits. The only limit available today is concurrency: the maximum number of invocations that may run concurrently for a matching scope.
A pattern is either an exact scope or the wildcard *:
- * matches every scope and acts as a default for scopes without their own rule.
- checkout matches a single, specific scope.
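The matching behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Restate's code: the function `resolve_limit`, the dictionary representation of the rule book, and the limit values are hypothetical. It assumes an exact-scope rule takes precedence over the wildcard, which follows from the wildcard being a default for scopes without their own rule.

```python
from typing import Optional


def resolve_limit(rule_book: dict[str, int], scope: str) -> Optional[int]:
    """Return the concurrency limit that applies to a scope, or None if no rule matches."""
    if scope in rule_book:       # a rule for this exact scope wins
        return rule_book[scope]
    return rule_book.get("*")    # otherwise fall back to the wildcard default, if present


rule_book = {"*": 10, "checkout": 3}  # hypothetical limits
checkout_limit = resolve_limit(rule_book, "checkout")  # uses the checkout rule
other_limit = resolve_limit(rule_book, "ai-agent")     # falls back to the * rule
```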
Manage rules with the restate rules CLI commands.
set is idempotent: it creates a rule if it doesn’t exist, or merges into the existing values, preserving fields you don’t touch.
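The create-or-merge semantics of set can be modeled as a dictionary merge. The following Python sketch is only an analogy for the behavior described above: `set_rule` and the in-memory rule book are hypothetical stand-ins, not the CLI or its storage format, and the field names are taken from the rule attributes this page mentions.

```python
from typing import Any

RuleBook = dict[str, dict[str, Any]]


def set_rule(rule_book: RuleBook, pattern: str, **fields: Any) -> dict[str, Any]:
    """Create the rule if missing, otherwise merge the given fields into it."""
    merged = {**rule_book.get(pattern, {}), **fields}  # untouched fields are preserved
    rule_book[pattern] = merged
    return merged


rule_book: RuleBook = {}
set_rule(rule_book, "checkout", concurrency=5, description="checkout flow")  # creates the rule
set_rule(rule_book, "checkout", concurrency=3)  # updates concurrency, keeps description
after_first = dict(rule_book["checkout"])
set_rule(rule_book, "checkout", concurrency=3)  # repeating the same call changes nothing
after_second = dict(rule_book["checkout"])
```

Repeating an identical call leaves the rule unchanged, which is what makes the operation safe to re-run from scripts.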
Run restate rules --help for the full set of options.
Applying concurrency limits
To make an invocation count against a scope’s concurrency limit, send it through a scoped ingress endpoint under the reserved /restate/scope/ prefix:
Include the {key} segment for Virtual Objects and Workflows; omit it for basic Services.
For example, to invoke checkout of OrderService under the checkout scope:
Scoped invocations are limited by the matching rule’s concurrency and held in their queue until a slot frees up.
Invocations sent through the non-scoped endpoints (/restate/call/... and /restate/send/...) are not subject to any scope-based limit.
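The hold-until-a-slot-frees behavior for a single scope can be pictured with a small Python model. It is a sketch, not the Restate scheduler: `ScopedQueue` and the invocation IDs are hypothetical, and admitting waiters in first-in, first-out order is an assumption made for simplicity, since the real scheduler decides which invocation runs next.

```python
from collections import deque


class ScopedQueue:
    """Toy model of one scope: run up to `concurrency` invocations, hold the rest."""

    def __init__(self, concurrency: int) -> None:
        self.concurrency = concurrency
        self.running: set[str] = set()
        self.waiting: deque[str] = deque()

    def submit(self, invocation_id: str) -> None:
        """Start the invocation if a slot is free, otherwise hold it in the queue."""
        if len(self.running) < self.concurrency:
            self.running.add(invocation_id)
        else:
            self.waiting.append(invocation_id)

    def complete(self, invocation_id: str) -> None:
        """Free the invocation's slot and admit the next waiter (FIFO assumed)."""
        self.running.discard(invocation_id)
        if self.waiting and len(self.running) < self.concurrency:
            self.running.add(self.waiting.popleft())


queue = ScopedQueue(concurrency=2)
for invocation_id in ["inv-1", "inv-2", "inv-3"]:
    queue.submit(invocation_id)
held_before = list(queue.waiting)      # inv-3 is held while two slots are busy
queue.complete("inv-1")                # frees a slot, so inv-3 is admitted
running_after = sorted(queue.running)
```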
See HTTP invocation for the full set of ingress endpoints.
Observing flow control
When flow control is enabled, several SQL system tables let you inspect the scheduler, queues, and concurrency limits directly:
| Table | What it shows |
|---|---|
| sys_rules | The configured rule book: one row per rule with its pattern, concurrency limit, description, disabled flag, version, and last-modified time. |
| sys_user_limits | Per-scope concurrency counters: current usage, configured limit, available capacity, and the matching rule pattern. |
| sys_vqueues | One row per entry across all queue stages, with its status, attempt counters, and lifecycle timestamps. |
| sys_vqueue_meta | Aggregate statistics per queue: scope, service name, per-stage entry counts, and timing averages. |
| sys_scheduler | Real-time scheduler state for each queue’s head entry: queue depth, scheduler status, and what it is blocked on. |
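As one way to query these tables from a script, the Python sketch below builds an HTTP request carrying a SQL statement. Treat the details as assumptions to verify against your Restate version: it assumes the admin API listens on localhost:9070 and accepts SQL as JSON at POST /query, and the helper `build_sql_request` is hypothetical. The query uses SELECT * because exact column names are not listed here.

```python
import json
import urllib.request

ADMIN_URL = "http://localhost:9070"  # assumption: default admin API address


def build_sql_request(query: str, admin_url: str = ADMIN_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a SQL query as JSON."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{admin_url}/query",  # assumption: SQL introspection endpoint
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_sql_request("SELECT * FROM sys_user_limits")

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Swapping the table name for sys_rules or sys_scheduler gives the configured rule book or the current scheduler state, respectively.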