Clustering Overview
A Stratum cluster is a group of nodes that share network state and run a VXLAN overlay mesh so that every managed network is present on every host. Shared state is replicated through Raft consensus: one node is the elected leader and accepts writes, the others are followers that replicate the log. The VXLAN overlay stretches L2 segments across physical hosts, so a workload's endpoint keeps its IP and MAC when it moves between nodes.
Both node modes participate in the same cluster. Compute nodes host workloads; Gateway nodes handle north-south traffic. A cluster can have any mix of the two.
Ports used by clustering
Cluster communication happens only on the management bridge (cnv-mgmt-br0). Open these ports between all cluster members on the management network:
| Port | Protocol | Purpose |
|---|---|---|
| 7071 | TCP (gRPC) | Node-to-node control plane |
| 7073 | TCP | Raft consensus log replication |
| 7074 | UDP | Gateway HA heartbeat |
Do not expose these ports to untrusted networks. The management bridge is separate from the workload bridge for exactly this reason — see Networking Overview.
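Before forming a cluster, it can help to confirm that the two TCP ports are reachable between members. The following is a minimal sketch, not a Stratum tool: the function names are hypothetical, and it only proves that a TCP connection can be opened, not that the service behind it is healthy. The UDP heartbeat port (7074) cannot be probed this way and is left out.

```python
import socket

# TCP ports from the table above. 7074 is UDP, so a TCP connect cannot test it.
CLUSTER_TCP_PORTS = {7071: "control plane (gRPC)", 7073: "Raft replication"}


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_peer(host: str, timeout: float = 2.0) -> dict:
    """Probe every clustering TCP port on one peer's management IP."""
    return {port: tcp_port_open(host, port, timeout) for port in CLUSTER_TCP_PORTS}


def report(peers: list) -> list:
    """Build one human-readable line per peer and port."""
    lines = []
    for peer in peers:
        for port, ok in check_peer(peer).items():
            state = "open" if ok else "unreachable"
            lines.append(f"{peer}:{port} ({CLUSTER_TCP_PORTS[port]}): {state}")
    return lines
```

Run it from each member against the management IPs of the others, for example `print("\n".join(report(["10.0.0.11", "10.0.0.12"])))`.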
Raft consensus
Stratum uses Raft to replicate cluster state: the list of networks, endpoint bindings, firewall rules, load balancer VIPs, and cluster membership. The Raft log lives on disk at /var/lib/cenvero-str/raft/.
Raft runs only when clustering is enabled (cluster.enabled: true) and requires mutual TLS between members — the transport refuses to run unauthenticated and there is no silent plaintext downgrade. Provision the cluster member certificates before enabling clustering.
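To make "mutual TLS" concrete: the server side of a connection must demand a client certificate and reject peers that do not present a valid one. The sketch below shows that idea with Python's standard `ssl` module. It is an illustration of the concept only, not how Stratum's transport is implemented, and the function names and file arguments are hypothetical.

```python
import ssl


def strict_server_context() -> ssl.SSLContext:
    """Server-side TLS context that refuses peers without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED is what makes the TLS "mutual": the handshake fails
    # unless the client presents a certificate signed by a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def load_member_identity(ctx: ssl.SSLContext, cert_file: str,
                         key_file: str, ca_file: str) -> None:
    """Load this member's certificate and the CA used to verify its peers.

    The three paths are placeholders; use wherever your certificates live.
    """
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
```

There is no fallback branch in this sketch: if the certificates are missing, `load_member_identity` raises and nothing listens, which mirrors the "no silent plaintext downgrade" behaviour described above.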
Key properties:
- A cluster of N nodes tolerates (N-1)/2 simultaneous failures, rounded down, and still makes progress.
- A 3-node cluster (2 Compute + 1 Gateway, for example) tolerates 1 failure.
- A 5-node cluster tolerates 2 simultaneous failures.
- A cluster of 2 nodes has no fault tolerance — losing one node stalls writes.
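The figures in this list follow from Raft's majority rule. A short sketch of the arithmetic, with hypothetical helper names:

```python
def quorum(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can fail while a majority remains: (n - 1) / 2, rounded down."""
    return (n - 1) // 2


def summary(n: int) -> str:
    """One line describing quorum and fault tolerance for a cluster size."""
    return f"{n} nodes: quorum {quorum(n)}, tolerates {tolerated_failures(n)} failure(s)"
```

Note that going from 3 to 4 nodes raises the quorum from 2 to 3 without raising the number of tolerated failures, which is why odd cluster sizes are the usual choice.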
A 1-node deployment runs Raft in single-node mode: the agent is always the leader and there is no replication. This is fine for development and testing.
VXLAN overlay mesh
Each network you create is stretched across all cluster members via VXLAN, using real kernel VXLAN devices with the management interface as the VTEP source. When an endpoint on node A sends a frame to an endpoint on node B, the overlay encapsulates the frame in a VXLAN packet and sends it to node B's management IP. The receiving node decapsulates it and delivers it to the destination endpoint — transparently, as if both endpoints were on the same physical switch.
The overlay uses a unique VNI (VXLAN Network Identifier) per Stratum network, so networks remain isolated even though they share the same underlay. MAC-to-node mappings are maintained in the cluster's shared state and updated in real time as endpoints attach, detach, and move between nodes.
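Stratum relies on the kernel's VXLAN devices for encapsulation, so none of this runs in user space. Still, the wire format is easy to picture. The sketch below builds the 8-byte VXLAN header defined in RFC 7348 and models the MAC-to-node lookup as a plain dictionary. The `fdb` table, its keys, and the function names are hypothetical, chosen only to illustrate the mechanism; the outer UDP/IP headers are omitted.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag in RFC 7348: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits + 24 reserved bits.
    # Word 2: 24-bit VNI + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(packet: bytes) -> int:
    """Extract the VNI from the start of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


def encapsulate(vni: int, dst_mac: str, frame: bytes, fdb: dict) -> tuple:
    """Return (remote VTEP IP, VXLAN payload) for one Ethernet frame.

    fdb maps (vni, destination MAC) to the management IP of the node that
    currently hosts the endpoint. It stands in for the shared cluster state.
    """
    remote_vtep = fdb[(vni, dst_mac)]
    return remote_vtep, vxlan_header(vni) + frame
```

Because the VNI is part of the lookup key, the same MAC address on two different networks resolves independently, which is the isolation property described above.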
Forming a cluster
Start with one node (it becomes the bootstrap leader), then join the others one at a time:
# On each additional node
sudo cenvero-str-ctl cluster join --peer 10.0.0.11:7073
Where 10.0.0.11 is the management IP of any existing cluster member. The joining node fetches the full Raft log and applies it before participating in elections.
Check the cluster from any member:
cenvero-str-ctl cluster status
ROLE      NODE-ID  PEER            TERM  COMMIT-INDEX  STATE
leader    cmp-01   10.0.0.11:7073  3     1042          healthy
follower  cmp-02   10.0.0.12:7073  3     1042          healthy
follower  gw-01    10.0.0.13:7073  3     1042          healthy
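If you want to check this output from a script, the columns can be split on whitespace. The sketch below assumes the output looks exactly like the sample above (one header row, whitespace-separated columns, no spaces inside values); the real command may offer other formats that this page does not describe. The function names are hypothetical.

```python
SAMPLE = """\
ROLE      NODE-ID  PEER            TERM  COMMIT-INDEX  STATE
leader    cmp-01   10.0.0.11:7073  3     1042          healthy
follower  cmp-02   10.0.0.12:7073  3     1042          healthy
follower  gw-01    10.0.0.13:7073  3     1042          healthy
"""


def parse_cluster_status(text: str) -> list:
    """Turn the status table into a list of dicts keyed by column header."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def check_cluster(rows: list) -> list:
    """Return a list of problems; an empty list means nothing looks wrong."""
    problems = []
    leaders = [r["NODE-ID"] for r in rows if r["ROLE"] == "leader"]
    if len(leaders) != 1:
        problems.append(f"expected exactly one leader, found {len(leaders)}")
    if len({r["TERM"] for r in rows}) > 1:
        problems.append("members report different terms")
    for r in rows:
        if r["STATE"] != "healthy":
            problems.append(f"{r['NODE-ID']} is {r['STATE']}")
    return problems


def commit_lag(rows: list) -> dict:
    """Entries each node is behind the highest commit index in the table."""
    highest = max(int(r["COMMIT-INDEX"]) for r in rows)
    return {r["NODE-ID"]: highest - int(r["COMMIT-INDEX"]) for r in rows}
```

A small, short-lived lag on a follower is normal while it catches up, so treat `commit_lag` as a signal to watch over time rather than an error on its own.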
Leader election
If the current leader becomes unreachable, followers start an election after the Raft election timeout elapses with no leader heartbeat. A candidate becomes the new leader once it collects votes from a majority of members, and a member only votes for a candidate whose log is at least as up to date as its own. During the election window, writes are paused, but existing traffic continues uninterrupted because the data plane is in-kernel and does not depend on the leader being available.
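The voting rule can be sketched in a few lines. This follows the standard Raft comparison (last log term first, then last log index) and is a simplification: it ignores election terms, timeouts, and the rule that each member votes at most once per term. Whether Stratum's implementation matches it in every detail is an assumption, and all names here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogPosition:
    last_term: int   # term of the last entry in the node's log
    last_index: int  # index of the last entry in the node's log


def at_least_as_up_to_date(candidate: LogPosition, voter: LogPosition) -> bool:
    """Standard Raft check: compare last term first, then last index."""
    return (candidate.last_term, candidate.last_index) >= (
        voter.last_term,
        voter.last_index,
    )


def wins_election(candidate: LogPosition, reachable_voters: list,
                  cluster_size: int) -> bool:
    """True if the candidate gathers a majority of the whole cluster."""
    votes = 1  # a candidate always votes for itself
    votes += sum(at_least_as_up_to_date(candidate, v) for v in reachable_voters)
    return votes >= cluster_size // 2 + 1
```

The majority is counted against the full membership, not just the reachable nodes, which is why a partitioned minority can never elect its own leader.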
You can see the current leader and term at any time:
cenvero-str-ctl cluster status --leader
Joining and leaving
Add a new node at any time by running cluster join on it. The cluster rebalances: if the new node is a Gateway, it begins participating in HA once it has caught up with the log.
To remove a node gracefully — for maintenance or decommission:
# On the node being removed
sudo cenvero-str-ctl cluster leave
This notifies the leader, which commits a membership-change entry to the log and adjusts the quorum size. The leaving node shuts down its Raft participant cleanly. Workloads on a Compute node being removed should be moved to another node first — see Moving Workloads Between Nodes.
Do not hard-power-off a node without running cluster leave first. The cluster will continue to function (assuming quorum remains), but it will count the node as a failed member until it is explicitly removed.
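The reason this matters is that quorum is computed from the recorded membership, not from the nodes that happen to be reachable. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def remaining_failure_budget(members: int, reachable: int) -> int:
    """Further failures the cluster can absorb before writes stall.

    members:   nodes recorded in the cluster membership
    reachable: nodes that are actually up and connected
    A negative result means quorum is already lost.
    """
    quorum = members // 2 + 1
    return reachable - quorum
```

Take a 4-node cluster where one node was powered off without leaving: `remaining_failure_budget(4, 3)` is 0, so one more failure stalls writes. Had that node left cleanly, the membership would be 3 and `remaining_failure_budget(3, 3)` would be 1.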
Configuration in node.yaml
cluster:
  enabled: true
  peers:
    - 10.0.0.11:7073
    - 10.0.0.12:7073
    - 10.0.0.13:7073
Apply with cenvero-str-ctl config apply --file node.yaml. The peers list is used for initial discovery; once the node is part of the cluster, membership is managed by the Raft log.
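A typo in the peers list is an easy way to end up with a node that cannot find its cluster. The sketch below checks entries before you apply the file. It assumes every entry is a literal IP address followed by a port, as in the example above; if your deployment accepts hostnames, this check is too strict. The function name is hypothetical and the check is not part of cenvero-str-ctl.

```python
import ipaddress


def validate_peers(peers: list) -> list:
    """Return a list of error strings; an empty list means all entries parse."""
    errors = []
    seen = set()
    for entry in peers:
        host, sep, port = entry.rpartition(":")
        if not sep:
            errors.append(f"{entry!r}: expected IP:port")
            continue
        try:
            ipaddress.ip_address(host)
        except ValueError:
            errors.append(f"{entry!r}: {host!r} is not an IP address")
        # isascii() guards against digit-like Unicode that int() rejects.
        if not (port.isascii() and port.isdigit() and 1 <= int(port) <= 65535):
            errors.append(f"{entry!r}: port must be between 1 and 65535")
        if entry in seen:
            errors.append(f"{entry!r}: listed more than once")
        seen.add(entry)
    return errors
```

For the example configuration, `validate_peers(["10.0.0.11:7073", "10.0.0.12:7073", "10.0.0.13:7073"])` returns an empty list.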
See also
- Gateway High Availability — redundant Gateways and sub-second failover.
- Moving Workloads Between Nodes — keeping an endpoint's IP/MAC when it moves between Compute nodes.
- Networking Overview — how the VXLAN overlay extends L2 networks.
- Configuration — the cluster block in node.yaml.