To understand the terminology used on this page, it might be helpful to read through the architecture reference.
Overview
A Restate cluster maintains three essential types of state:
- Metadata: cluster membership as well as log and partition configuration
- Logs: The Bifrost log disseminates all events and state changes to partition workers
- Partition store: Stores ongoing invocations and their journals, persisted state, timers, deployments and more, for each partition
Snapshots
Internal mechanism for cluster operations and state sharing between nodes:
- Goal: Enable fast bootstrap of new nodes and support log trimming in clusters
- Scope: A snapshot of the most recent state of a specific partition, produced by a fully caught up partition processor
- When: Essential for multi-node clusters; optional for single-node deployments
Data Backups
Full copies of all data stored by Restate for disaster recovery:
- Goal: Restore a Restate Server to a previous point in time
- Scope: Complete copy of the `restate-data` directory or storage volumes
- When: Currently only for single-node deployments due to timing coordination challenges
Multi-node backup challenges
Coordinating simultaneous backups across multiple nodes demands a high degree of timing precision. Even millisecond differences in backup timing can result in one node capturing state that has progressed further than another, creating inconsistent snapshots across the cluster. This timing skew leads to data inconsistencies that prevent successful cluster restoration from backup.

While atomically snapshotting `restate-data` at the volume level is still very useful as part of a broader disaster recovery and backup strategy, some manual repair work may be required to restore from such backups. There will also be some expected data loss between the latest LSN/point in time captured by the snapshot and the latest transaction accepted or processed by the cluster before it lost availability.

Since tooling for automated cluster restoration is not yet available, cluster-wide full node snapshots would currently require manual intervention to repair the system back into a workable state.
When to Use Each
Use Snapshots When:
- Operating a multi-node cluster (required)
- Adding or removing nodes from a cluster
- Enabling log trimming to manage storage
- Supporting fast partition processor failover (having warm standbys ready for near-instant takeover)
- Growing the cluster or replacing completely failed nodes (newly added nodes can bootstrap from snapshots)
Use Backups When:
- Doing point-in-time recovery of a single-node deployment
Snapshots
Snapshots are essential for multi-node cluster operations: they enable efficient state sharing between nodes, safe log trimming, and fast partition failover to a different cluster node. They are required for multi-node clusters and optional for single-node deployments. Restate partition processors can be configured to periodically publish snapshots of their partition state to a shared S3-compatible object store. Snapshots allow nodes that do not have an up-to-date local copy of a partition's state to quickly start a processor for that partition. Without snapshots, trimming the log could lead to data loss if all the nodes replicating a particular partition are lost, and starting new partition processors would require a full replay of the partition's log, which might take a long time. When a partition processor successfully publishes a snapshot, this is reflected in the archived log sequence number (LSN). This value is the safe point up to which Restate can trim the Bifrost log.

Configuring Automatic Snapshotting
Restate clusters should always be configured with a snapshot repository so that nodes can efficiently share partition state and so that new nodes can be added to the cluster in the future. Restate currently supports Amazon S3 (or an API-compatible object store) as a shared snapshot repository. To set up a snapshot destination, update your server configuration as follows:
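A minimal configuration sketch, assuming a `[worker.snapshots]` section with a `destination` URL and a record-count based snapshot interval (bucket name and values are illustrative and may vary by Restate version):

```toml
# restate.toml
[worker.snapshots]
# S3 bucket (plus optional prefix) that all cluster nodes can read and write
destination = "s3://my-snapshots-bucket/my-cluster"
# Publish a new snapshot after this many log records have been applied
snapshot-interval-num-records = 10000
```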
You can also trigger snapshot creation manually with `restatectl`, as shown in the sketch below. Once a snapshot has been successfully published, the repository contains, for each partition, a `latest.json` file pointing to the most recent snapshot.
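For example, a manual snapshot request might look like this (a sketch; the exact `restatectl` subcommand and flags are assumptions and may differ between versions):

```shell
# Ask the partition's processor to publish a snapshot to the repository
restatectl snapshots create-snapshot --partition-id 0
```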
No additional configuration is required to enable restoring snapshots.
When a partition processor starts up and finds no local partition state, it will attempt to restore the latest snapshot from the repository.
This allows for efficient bootstrapping of additional partition workers.
For testing purposes, you can also use the `file://` protocol to publish snapshots to a local directory. This is mostly useful when experimenting with multi-node configurations on a single machine.
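For example, a local-directory destination might look like the following sketch (the path is hypothetical):

```toml
[worker.snapshots]
# Only suitable for local experimentation on a single machine
destination = "file:///tmp/restate/snapshots"
snapshot-interval-num-records = 1000
```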
The `file` provider does not support conditional updates, which makes it unsuitable for potentially contended operation.

Object Store endpoint and access credentials
Restate supports Amazon S3 and S3-compatible object stores. In typical server deployments to AWS, the configuration will be automatically inferred. Object store locations are specified in the form of a URL where the scheme is `s3://` and the authority is the name of the bucket. Optionally, you may supply an additional path within the bucket, which will be used as a common prefix for all operations. If you need to specify a custom endpoint for S3-compatible stores, you can override the API endpoint using the `aws-endpoint-url` config key.
For typical server deployments in AWS using Amazon S3, you might not need to set a region or credentials at all beyond specifying the destination path. Restate's object store support uses the conventional AWS SDKs and Tools credentials discovery. We strongly recommend against putting long-lived credentials in configuration. For development, you can use short-term credentials provided by a profile.
Local development with Minio
Minio is a common target while developing locally. You can configure it as follows:
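A sketch assuming a local Minio instance listening on `http://localhost:9000` with the default `minioadmin` credentials and a bucket named `restate-snapshots` (credentials, bucket name, and the exact `aws-*` key names are assumptions to adapt to your setup):

```toml
[worker.snapshots]
destination = "s3://restate-snapshots"
snapshot-interval-num-records = 1000
# Point the S3 client at the local Minio endpoint instead of AWS
aws-endpoint-url = "http://localhost:9000"
aws-access-key-id = "minioadmin"
aws-secret-access-key = "minioadmin"
# The local endpoint is plain HTTP
aws-allow-http = true
# Any region value works for Minio
aws-region = "us-east-1"
```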
Local development with S3
Assuming you have a profile set up to assume a specific role granted access to your bucket, you can work with S3 directly using a configuration like the sketch below, provided that in `~/.aws/config` you have a profile similar to the one that follows.
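An illustrative sketch, assuming a profile-selection option (here `aws-profile`) is available; profile, role, and bucket names are hypothetical:

```toml
[worker.snapshots]
destination = "s3://my-snapshots-bucket/my-cluster"
snapshot-interval-num-records = 10000
# Resolve credentials via a named AWS profile
aws-profile = "restate-snapshots"
```

with a matching profile in `~/.aws/config` along these lines:

```ini
[profile restate-snapshots]
region = eu-central-1
# Assume a role that has read/write access to the snapshots bucket
role_arn = arn:aws:iam::111122223333:role/restate-snapshots-access
source_profile = default
```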
Log trimming and Snapshots
In a distributed environment, the shared log is the mechanism for replicating partition state among nodes. It is therefore critical that all cluster members, including newly built nodes that join the cluster in the future, can obtain all the relevant events recorded in the log. This requirement is at odds with an immutable log growing unboundedly. Snapshots enable log trimming, the process of removing older segments of the log. By default, Restate will attempt to trim logs once an hour, which you can override or disable in the server configuration:
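A sketch, assuming the trim check interval lives under the `[admin]` section (the exact key name and disable semantics may differ between Restate versions):

```toml
[admin]
# How often Restate checks whether logs can be safely trimmed
log-trim-check-interval = "1h"
```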
If you observe repeated `Shutting partition processor down because it encountered a trim gap in the log` errors in the Restate server log, it is an indication that a processor is failing to start up due to missing log records. To recover, you must ensure that a snapshot repository is correctly configured and accessible from the node reporting errors. You can still recover even if no snapshots were taken previously, as long as there is at least one healthy node with a copy of the partition data. In that case, you must first configure the existing node(s) to publish snapshots for the affected partition(s) to a shared destination. See the Handling missing snapshots section for detailed recovery steps.

Observing processor persisted state
You can use `restatectl` to see the progress of partition processors with the `list` subcommand:
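For example (a sketch; the exact subcommand layout may vary between `restatectl` versions):

```shell
# Show partition processors along with their applied, durable, and archived LSNs
restatectl partitions list
```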
Pay particular attention to the following fields in `restatectl`'s partition list output:
- Applied LSN - the latest log record applied by this processor
- Durable LSN - the log position of the latest partition store flushed to local node storage; by default processors optimize performance by relying on Bifrost for durability and only periodically flush partition store to disk
- Archived LSN - if a snapshot repository is configured, this LSN represents the latest published snapshot; this determines the log safe trim point in multi-node clusters
Pruning the snapshot repository
Restate does not currently support pruning older snapshots from the snapshot repository. We recommend implementing an object lifecycle policy directly in the object store to manage retention.
Data Backups
Data backups are primarily used for single-node Restate deployments.

What does a backup contain?
The Restate server persists both metadata (such as the details of deployed services and in-flight invocations) and data (e.g., virtual object and workflow state keys) in its data store, which is located in its base directory (by default, the `restate-data` path relative to the startup working directory). Restate performs write-ahead logging with fsync enabled to ensure that effects are fully persisted before being acknowledged to participating services.
Backing up the full contents of the Restate base directory ensures that you can recover this state in the event of a server failure. We recommend placing the data directory on fast block storage that supports atomic snapshots, such as Amazon EBS with EBS volume snapshots. Alternatively, you can stop the restate-server process, archive the base directory contents, and then restart the process. This ensures that the backup contains an atomic view of the persisted state.
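A minimal sketch of the stop-and-archive approach, assuming the server runs as a systemd unit named `restate-server` and its base directory is `./restate-data` (unit name and paths are illustrative):

```shell
# Stop the server so the data directory is quiescent
sudo systemctl stop restate-server

# Archive the base directory; the server is stopped, so this is an atomic view
tar -czf restate-backup-$(date +%F).tar.gz restate-data/

# Restart the server
sudo systemctl start restate-server
```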
In addition to the data store, you should also make sure you have a backup of the effective Restate server configuration. Be aware that this may be spread across command-line arguments, environment variables, and the server configuration file.
Restoring Backups
To restore from backup, ensure the following:
- Use a Restate server release that is compatible with the version that produced the data store snapshot. See the Upgrading section.
- Use an equivalent Restate server configuration. In particular, ensure that the `cluster-name` and `node-name` attributes match those of the previous Restate server operating on this data store.
- Ensure exclusive access to a data store restored from the most recent atomic snapshot of the previous Restate installation.
Restate cannot guarantee that it is the only instance of the given node. You must ensure that only one instance of any given Restate node is running when restoring the data store from a backup. Running multiple instances could lead to a “split-brain” scenario where different servers process invocations for the same set of services, causing state divergence.
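Putting this together, a restore might look like the following sketch, under the same assumptions as the backup example above (archive name, unit name, and paths are hypothetical):

```shell
# Make sure no other instance of this node is running
sudo systemctl stop restate-server

# Replace the data directory with the most recent atomic backup
rm -rf restate-data
tar -xzf restate-backup-2025-01-01.tar.gz

# Start the server with the same cluster-name and node-name as before
sudo systemctl start restate-server
```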