15.3 Updating a StatefulSet
In addition to declarative scaling, StatefulSets also provide declarative updates, similar to Deployments. When you update the Pod template in a StatefulSet, the controller recreates the Pods with the updated template.
You may recall that the Deployment controller can perform the update in two ways, depending on the strategy specified in the Deployment object. You can also specify the update strategy in the `updateStrategy` field in the `spec` section of the StatefulSet manifest, but the available strategies are different from those in a Deployment, as you can see in the following table.
Table 15.2 The supported StatefulSet update strategies
| Value | Description |
|---|---|
| `RollingUpdate` | In this update strategy, the Pods are replaced one by one. The Pod with the highest ordinal number is deleted first and replaced with a Pod created with the new template. When this new Pod is ready, the Pod with the next highest ordinal number is replaced. The process continues until all Pods have been replaced. This is the default strategy. |
| `OnDelete` | The StatefulSet controller waits for each Pod to be manually deleted. When you delete the Pod, the controller replaces it with a Pod created with the new template. With this strategy, you can replace Pods in any order and at any rate. |
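To give you a sense of where this is configured, here's a minimal sketch of the relevant part of a StatefulSet manifest; if you omit the `updateStrategy` field entirely, `RollingUpdate` is used:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quiz
spec:
  updateStrategy:
    type: RollingUpdate    # or OnDelete; RollingUpdate is the default
  ...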
The following figure shows how the Pods are updated over time for each update strategy.
Figure 15.8 How the Pods are updated over time with different update strategies
The `RollingUpdate` strategy, which you can find in both Deployments and StatefulSets, is similar between the two objects, but differs in the parameters you can set. The `OnDelete` strategy lets you replace Pods at your own pace and in any order. It’s different from the `Recreate` strategy found in Deployments, which automatically deletes and replaces all Pods at once.
15.3.1 Using the RollingUpdate strategy
The `RollingUpdate` strategy in a StatefulSet behaves similarly to the `RollingUpdate` strategy in Deployments, but only one Pod is replaced at a time. You may recall that you can configure the Deployment to replace multiple Pods at once using the `maxSurge` and `maxUnavailable` parameters. The rolling update strategy in StatefulSets has no such parameters.
You may also recall that you can slow down the rollout in a Deployment by setting the `minReadySeconds` field, which causes the controller to wait a certain amount of time after the new Pods are ready before replacing the other Pods. You’ve already learned that StatefulSets also provide this field and that it affects updates as well as scaling.
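As a reminder, here's a minimal sketch of how the two fields fit together in a StatefulSet manifest; the ten-second delay is an arbitrary value chosen for illustration:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quiz
spec:
  minReadySeconds: 10      # wait 10s after each new Pod is ready before replacing the next
  updateStrategy:
    type: RollingUpdate    # note: no maxSurge or maxUnavailable parameters exist here
  ...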
Let’s update the `quiz-api` container in the `quiz` StatefulSet to version 0.2. Since `RollingUpdate` is the default update strategy type, you can omit the `updateStrategy` field in the manifest. To trigger the update, use `kubectl edit` to change the value of the `ver` label and the image tag in the `quiz-api` container to `0.2`. You can also apply the manifest file `sts.quiz.0.2.yaml` with `kubectl apply` instead.
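If you use `kubectl edit`, the two changes amount to something like the following fragment of the Pod template; the image name is abbreviated here, since the exact repository depends on where your cluster pulls the quiz-api image from:

...
template:
  metadata:
    labels:
      app: quiz
      ver: "0.2"                  # changed from "0.1"
  spec:
    containers:
    - name: quiz-api
      image: .../quiz-api:0.2     # tag changed from :0.1
...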
You can track the rollout with the `kubectl rollout status` command as in the previous chapter. The full command and its output are as follows:
$ kubectl rollout status sts quiz
Waiting for partitioned roll out to finish: 0 out of 3 new pods have been updated...
Waiting for 1 pods to be ready...
Waiting for partitioned roll out to finish: 1 out of 3 new pods have been updated...
Waiting for 1 pods to be ready...
...
Because the Pods are replaced one at a time and the controller waits until each replica is ready before moving on to the next, the `quiz` Service remains accessible throughout the process. If you list the Pods as they’re updated, you’ll see that the Pod with the highest ordinal number, `quiz-2`, is updated first, followed by `quiz-1`, as shown here:
$ kubectl get pods -l app=quiz -L controller-revision-hash,ver
NAME     READY   STATUS        RESTARTS   AGE   CONTROLLER-REVISION-HASH   VER
quiz-0   2/2     Running       0          50m   quiz-6c48bdd8df            0.1
quiz-1   2/2     Terminating   0          10m   quiz-6c48bdd8df            0.1
quiz-2   2/2     Running       0          20s   quiz-6945968d9             0.2
The update process is complete when the Pod with the lowest ordinal number, `quiz-0`, is updated. At this point, the `kubectl rollout status` command reports the following status:
$ kubectl rollout status sts quiz
partitioned roll out complete: 3 new pods have been updated...
Updates with Pods that aren’t ready
If the StatefulSet is configured with the `RollingUpdate` strategy and you trigger the update when not all Pods are ready, the rollout is held back. The `kubectl rollout status` command indicates that the controller is waiting for one or more Pods to be ready.
If a new Pod fails to become ready during the update, the update is also paused, just like a Deployment update. The rollout resumes when the Pod is ready again. So, if you deploy a faulty version whose readiness probe never succeeds, the update is blocked after the first Pod is replaced. Since only one Pod is down at a time, the service provided by the Pods in the StatefulSet remains unaffected, provided the StatefulSet has a sufficient number of replicas.
Displaying the revision history
You may recall that Deployments keep a history of recent revisions. Each revision is represented by the ReplicaSet that the Deployment controller created when that revision was active. StatefulSets also keep a revision history. You can use the `kubectl rollout history` command to display it as follows.
$ kubectl rollout history sts quiz
statefulset.apps/quiz
REVISION   CHANGE-CAUSE
1          <none>
2          <none>
You may wonder where this history is stored, because unlike Deployments, a StatefulSet manages Pods directly. And if you look at the object manifest of the `quiz` StatefulSet, you’ll notice that it contains only the current Pod template and no previous revisions. So where is the revision history of the StatefulSet stored?
The revision history of StatefulSets and DaemonSets, which you’ll learn about in the next chapter, is stored in ControllerRevision objects. A ControllerRevision is a generic object that represents an immutable snapshot of the state of an object at a particular point in time. You can list ControllerRevision objects as follows:
$ kubectl get controllerrevisions
NAME              CONTROLLER              REVISION   AGE
quiz-6945968d9    statefulset.apps/quiz   2          1m
quiz-6c48bdd8df   statefulset.apps/quiz   1          50m
Since these objects are used internally, you don’t need to know anything more about them. However, if you want to learn more, you can use the `kubectl explain` command.
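For example, here's how you could inspect them; the object name is taken from the listing above, so the hash in your cluster will differ:

$ kubectl explain controllerrevisions
$ kubectl get controllerrevision quiz-6945968d9 -o yaml

The object's `data` field holds the snapshot of the Pod template, and its `revision` field holds the number you saw in the `kubectl rollout history` output.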
Rolling back to a previous revision
If you’re updating the StatefulSet and the rollout hangs, or if the rollout was successful, but you want to revert to the previous revision, you can use the `kubectl rollout undo` command, as described in the previous chapter. You’ll update the `quiz` StatefulSet again in the next section, so please reset it to the previous version as follows:
$ kubectl rollout undo sts quiz
statefulset.apps/quiz rolled back
You can also use the `--to-revision` option to return to a specific revision. As with Deployments, Pods are rolled back using the update strategy configured in the StatefulSet. If the strategy is `RollingUpdate`, the Pods are reverted one at a time.
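For example, to return to the first revision, you could run the following commands; this is just a sketch, so check `kubectl rollout history` first to see which revision numbers actually exist in your cluster:

$ kubectl rollout history sts quiz --revision=1    # inspect the revision before rolling back
$ kubectl rollout undo sts quiz --to-revision=1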
15.3.2 RollingUpdate with partition
Unlike Deployments, StatefulSets don’t have a `paused` field that you can use to prevent a rollout from being triggered, or to pause it halfway. If you try to pause the StatefulSet with the `kubectl rollout pause` command, you receive the following error message:
$ kubectl rollout pause sts quiz
error: statefulsets.apps "quiz" pausing is not supported
In a StatefulSet you can achieve the same result and more with the `partition` parameter of the `RollingUpdate` strategy. The value of this field specifies the ordinal number at which the StatefulSet should be partitioned. As shown in the following figure, Pods with an ordinal number lower than the `partition` value aren’t updated.
Figure 15.9 Partitioning a rolling update
If you set the `partition` value appropriately, you can implement a canary deployment, control the rollout manually, or stage an update instead of triggering it immediately.
Staging an update
To stage a StatefulSet update without actually triggering it, set the `partition` value to the number of replicas or higher, as in the manifest file `sts.quiz.0.2.partition.yaml` shown in the following listing.
Listing 15.7 Staging a StatefulSet update with the partition field
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quiz
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 3
  replicas: 3
  ...
Apply this manifest file and confirm that the rollout doesn’t start even though the Pod template has been updated. If you set the `partition` value this way, you can make several changes to the StatefulSet without triggering the rollout. Now let’s look at how you can trigger the update of a single Pod.
Deploying a canary
To deploy a canary, set the `partition` value to the number of replicas minus one. Since the `quiz` StatefulSet has three replicas, you set the `partition` to `2`. You can do this with the `kubectl patch` command as follows:
$ kubectl patch sts quiz -p '{"spec": {"updateStrategy": {"rollingUpdate": {"partition": 2 }}}}'
statefulset.apps/quiz patched
If you now look at the list of `quiz` Pods, you’ll see that only the Pod `quiz-2` has been updated to version `0.2`:
$ kubectl get pods -l app=quiz -L controller-revision-hash,ver
NAME     READY   STATUS    RESTARTS   AGE   CONTROLLER-REVISION-HASH   VER
quiz-0   2/2     Running   0          8m    quiz-6c48bdd8df            0.1
quiz-1   2/2     Running   0          8m    quiz-6c48bdd8df            0.1
quiz-2   2/2     Running   0          20s   quiz-6945968d9             0.2
The Pod `quiz-2` is the canary that you use to check if the new version behaves as expected before rolling out the changes to the remaining Pods.
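How you verify the canary depends on your application. One option is to bypass the Service and talk to the canary Pod directly, for example with `kubectl port-forward`; the port number below is an assumption, so use whatever port the quiz-api container actually listens on:

$ kubectl port-forward quiz-2 8080:8080    # forward local port 8080 to the canary Pod
$ curl localhost:8080                      # send test requests to the canary only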
At this point I’d like to draw your attention to the `status` section of the StatefulSet object. It contains information about the total number of replicas, the number of replicas that are ready and available, the number of current and updated replicas, and their revision hashes. To display the status, run the following command:
$ kubectl get sts quiz -o yaml
...
status:
  availableReplicas: 3
  collisionCount: 0
  currentReplicas: 2
  currentRevision: quiz-6c48bdd8df
  observedGeneration: 8
  readyReplicas: 3
  replicas: 3
  updateRevision: quiz-6945968d9
  updatedReplicas: 1
As you can see from the `status`, the StatefulSet is now split into two partitions. If a Pod is deleted at this time, the StatefulSet controller will create it with the correct template. For example, if you delete one of the Pods with version 0.1, the replacement Pod will be created with the previous template and will run version 0.1 again. If you delete the Pod that’s already been updated, it’ll be recreated with the new template. Feel free to try this out for yourself. You can’t break anything.
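For example, here's a sketch of one such experiment:

$ kubectl delete pod quiz-0    # quiz-0 is below the partition value, so it still runs 0.1
$ kubectl get pods -l app=quiz -L controller-revision-hash,ver

The replacement `quiz-0` Pod should appear with the old revision hash and version 0.1, because its ordinal number is below the `partition` value.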
Completing a partitioned update
When you’re confident the canary is fine, you can let the StatefulSet update the remaining Pods by setting the `partition` value to zero as follows:
$ kubectl patch sts quiz -p '{"spec": {"updateStrategy": {"rollingUpdate": {"partition": 0 }}}}'
statefulset.apps/quiz patched
When the `partition` field is set to zero, the StatefulSet updates all the remaining Pods. First, the Pod `quiz-1` is updated, followed by `quiz-0`. If you had more Pods, you could also use the `partition` field to update the StatefulSet in phases. In each phase, you decide how many Pods you want to update and set the `partition` value accordingly.
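For example, starting from a staged update with `partition: 3`, a two-phase rollout of the three quiz replicas could look like this sketch:

$ kubectl patch sts quiz -p '{"spec": {"updateStrategy": {"rollingUpdate": {"partition": 1 }}}}'
$ kubectl rollout status sts quiz    # wait until quiz-2 and quiz-1 are updated
$ kubectl patch sts quiz -p '{"spec": {"updateStrategy": {"rollingUpdate": {"partition": 0 }}}}'
$ kubectl rollout status sts quiz    # wait until quiz-0 is updated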
At the time of writing, `partition` is the only parameter of the `RollingUpdate` strategy. You’ve seen how you can use it to control the rollout. If you want even more control, you can use the `OnDelete` strategy, which you’ll try next. Before you continue, please reset the StatefulSet to the previous revision as follows:
$ kubectl rollout undo sts quiz
statefulset.apps/quiz rolled back
15.3.3 OnDelete strategy
If you want to have full control over the rollout process, you can use the `OnDelete` update strategy. To configure the StatefulSet with this strategy, use `kubectl apply` to apply the manifest file `sts.quiz.0.2.onDelete.yaml`. The following listing shows how the update strategy is set.
Listing 15.8 Setting the OnDelete update strategy
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quiz
spec:
  updateStrategy:
    type: OnDelete
  ...
This manifest updates the `quiz-api` container in the Pod template to use the `:0.2` image tag. However, because it sets the update strategy to `OnDelete`, nothing happens when you apply the manifest.
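You can confirm this by comparing the two revision hashes in the StatefulSet's `status` section, which you examined earlier. The controller records a new `updateRevision`, but `currentRevision` and the Pods themselves remain on the old one:

$ kubectl get sts quiz -o jsonpath='{.status.currentRevision}{"\n"}{.status.updateRevision}{"\n"}'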
If you use the `OnDelete` strategy, the rollout is semi-automatic. You manually delete each Pod, and the StatefulSet controller then creates the replacement Pod with the new template. With this strategy, you can decide which Pod to update and when. You don’t necessarily have to delete the Pod with the highest ordinal number first. Try deleting the Pod `quiz-0`. When its containers exit, a new `quiz-0` Pod with version `0.2` appears:
$ kubectl get pods -l app=quiz -L controller-revision-hash,ver
NAME     READY   STATUS    RESTARTS   AGE   CONTROLLER-REVISION-HASH   VER
quiz-0   2/2     Running   0          53s   quiz-6945968d9             0.2
quiz-1   2/2     Running   0          11m   quiz-6c48bdd8df            0.1
quiz-2   2/2     Running   0          12m   quiz-6c48bdd8df            0.1
To complete the rollout, you need to delete the remaining Pods, either in the order your workload requires or simply in the order you prefer.
Rolling back with the OnDelete strategy
Since the update strategy also applies when you use the `kubectl rollout undo` command, the rollback process is also semi-automatic. You must delete each Pod yourself if you want to roll it back to the previous revision.
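In other words, a full rollback with the `OnDelete` strategy is a sequence like the following sketch:

$ kubectl rollout undo sts quiz    # points the StatefulSet back at the previous template
$ kubectl delete pod quiz-0        # each Pod is rolled back only when you delete it
$ kubectl delete pod quiz-1
$ kubectl delete pod quiz-2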
Updates with Pods that aren’t ready
Since you control the rollout and the controller replaces any Pod you delete, the Pod’s readiness status is irrelevant. If you delete a Pod that’s not ready, the controller updates it.
If you delete a Pod and the new Pod isn’t ready, but you still delete the next Pod, the controller will update that second Pod as well. It’s your responsibility to consider Pod readiness.