6.2 Keeping containers healthy
The pods you created in the previous chapter ran without any problems. But what if one of the containers dies? What if all the containers in a pod die? How do you keep the pods healthy and their containers running? That’s the focus of this section.
Understanding container auto-restart
When a pod is scheduled to a node, the Kubelet on that node starts its containers and from then on keeps them running for as long as the pod object exists. If the main process in the container terminates for any reason, the Kubelet restarts the container. If an error in your application causes it to crash, Kubernetes automatically restarts it, so even without doing anything special in the application itself, running it in Kubernetes automatically gives it the ability to heal itself. Let’s see this in action.
Observing a container failure
In the previous chapter, you created the kubia-ssl pod, which contains the Node.js and the Envoy containers. Create the pod again and enable communication with the pod by running the following two commands:
$ kubectl apply -f kubia-ssl.yaml
$ kubectl port-forward kubia-ssl 8080 8443 9901
You’ll now cause the Envoy container to terminate to see how Kubernetes deals with the situation. Run the following command in a separate terminal so you can see how the pod’s status changes when one of its containers terminates:
$ kubectl get pods -w
You’ll also want to watch events in another terminal using the following command:
$ kubectl get events -w
You could emulate a crash of the container’s main process by sending it the KILL signal, but you can’t do this from inside the container because the Linux kernel doesn’t let you kill the root process (the process with PID 1). You would have to SSH to the pod’s host node and kill the process from there. Fortunately, Envoy’s administration interface allows you to stop the process via its HTTP API.
To terminate the envoy container, open the URL http://localhost:9901 in your browser and click the quitquitquit button, or run the following curl command in another terminal:
$ curl -X POST http://localhost:9901/quitquitquit
OK
To see what happens with the container and the pod it belongs to, examine the output of the kubectl get pods -w command you ran earlier. It’s shown in the next listing.
Listing 6.4 Pod state transitions when a container terminates
$ kubectl get po -w
NAME        READY   STATUS     RESTARTS   AGE
kubia-ssl   2/2     Running    0          1s
kubia-ssl   1/2     NotReady   0          9m33s
kubia-ssl   2/2     Running    1          9m34s
The listing shows that the pod’s STATUS changes from Running to NotReady, while the READY column indicates that only one of the two containers is ready. Immediately thereafter, Kubernetes restarts the container and the pod’s status returns to Running. The RESTARTS column indicates that one container has been restarted.
NOTE
If one of the pod’s containers fails, the other containers continue to run.
Now examine the output of the kubectl get events -w command you ran earlier. It is shown in the next listing.
Listing 6.6 Events emitted when a container terminates
$ kubectl get ev -w
LAST SEEN   TYPE     REASON    OBJECT          MESSAGE
0s          Normal   Pulled    pod/kubia-ssl   Container image already
                                               present on machine
0s          Normal   Created   pod/kubia-ssl   Created container envoy
0s          Normal   Started   pod/kubia-ssl   Started container envoy
The events show that the new envoy container has been started. You should be able to access the application via HTTPS again. Please confirm with your browser or curl.
The events in the listing also expose an important detail about how Kubernetes restarts containers. The second event indicates that the entire envoy container has been recreated. Kubernetes never restarts a container, but instead discards it and creates a new container. Regardless, we call this restarting a container.
NOTE
Any data that the process writes to the container’s filesystem is lost when the container is recreated. This behavior is sometimes undesirable. To persist data, you must add a storage volume to the pod, as explained in the next chapter.
NOTE
If init containers are defined in the pod and one of the pod’s regular containers is restarted, the init containers are not executed again.
Configuring the pod’s restart policy
By default, Kubernetes restarts the container regardless of whether the process in the container exits with a zero or non-zero exit code - in other words, whether the container completes successfully or fails. This behavior can be changed by setting the restartPolicy field in the pod’s spec.
Three restart policies exist. They are explained in the following figure.
Figure 6.4 The pod’s restartPolicy determines whether its containers are restarted or not
The following table describes the three restart policies.
| Restart Policy | Description |
|---|---|
| Always | Container is restarted regardless of the exit code the process in the container terminates with. This is the default restart policy. |
| OnFailure | The container is restarted only if the process terminates with a non-zero exit code, which by convention indicates failure. |
| Never | The container is never restarted - not even when it fails. |
Table 6.4 Pod restart policies
NOTE
Surprisingly, the restart policy is configured at the pod level and applies to all its containers. It can’t be configured for each container individually.
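For example, to restart the containers of this pod only when they fail, you would set the field as follows. This is a minimal sketch; the rest of the pod spec stays the same as before.

spec:
  restartPolicy: OnFailure   # applies to all containers in the pod; the default is Always
  containers:
  - name: kubia
    image: luksa/kubia:1.0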
Understanding the time delay inserted before a container is restarted
If you call Envoy’s /quitquitquit endpoint several times, you’ll notice that each time it takes longer to restart the container after it terminates. The pod’s status is displayed as either NotReady or CrashLoopBackOff. Here’s what it means.
As shown in the following figure, the first time a container terminates, it is restarted immediately. The next time, however, Kubernetes waits ten seconds before restarting it again. This delay is then doubled to 20, 40, 80 and then to 160 seconds after each subsequent termination. From then on, the delay is kept at five minutes. This delay that doubles between attempts is called exponential back-off.
Figure 6.5 Exponential back-off between container restarts
In the worst case, a container can therefore be prevented from starting for up to five minutes.
NOTE
The delay is reset to zero when the container has run successfully for 10 minutes. If the container must be restarted later, it is restarted immediately.
As you can see in the following listing, the container is in the Waiting state while it waits to be restarted, and the reason is shown as CrashLoopBackOff. The message field indicates how long it will take for the container to be restarted.
Listing 6.7 The state of a container that’s waiting to be restarted
$ kubectl get po kubia-ssl -o json | jq .status.containerStatuses
...
"state": {
"waiting": {
"message": "back-off 40s restarting failed container=envoy ...",
"reason": "CrashLoopBackOff"
NOTE
When you tell Envoy to terminate, it terminates with exit code zero, which means it hasn’t crashed. The CrashLoopBackOff status can therefore be misleading.
Checking the container’s health using liveness probes
In the previous section, you learned that Kubernetes keeps your application healthy by restarting it when its process terminates. But applications can also become unresponsive without terminating. For example, a Java application with a memory leak eventually starts spewing out OutOfMemoryErrors, but its JVM process continues to run. Ideally, Kubernetes should detect this kind of error and restart the container.
The application could catch these errors by itself and immediately terminate, but what about the situations where your application stops responding because it gets into an infinite loop or deadlock? What if the application can’t detect this? To ensure that the application is restarted in such cases, it may be necessary to check its state from the outside.
Introducing liveness probes
Kubernetes can be configured to check whether an application is still alive by defining a liveness probe. You can specify a liveness probe for each container in the pod. Kubernetes runs the probe periodically to ask the application if it’s still alive and well. If the application doesn’t respond, an error occurs, or the response is negative, the container is considered unhealthy and is terminated. The container is then restarted if the restart policy allows it.
NOTE
Liveness probes can only be used in the pod’s regular containers. They can’t be defined in init containers.
Types of liveness probes
Kubernetes can probe a container with one of the following three mechanisms:
- An HTTP GET probe sends a GET request to the container’s IP address, on the network port and path you specify. If the probe receives a response, and the response code doesn’t represent an error (in other words, if the HTTP response code is 2xx or 3xx), the probe is considered successful. If the server returns an error response code, or if it doesn’t respond in time, the probe is considered to have failed.
- A TCP Socket probe attempts to open a TCP connection to the specified port of the container. If the connection is successfully established, the probe is considered successful. If the connection can’t be established in time, the probe is considered failed.
- An Exec probe executes a command inside the container and checks the exit code it terminates with. If the exit code is zero, the probe is successful. A non-zero exit code is considered a failure. The probe is also considered to have failed if the command fails to terminate in time.
NOTE
In addition to a liveness probe, a container may also have a startup probe, which is discussed in section 6.2.6, and a readiness probe, which is explained in chapter 10.
Creating an HTTP GET liveness probe
Let’s look at how to add a liveness probe to each of the containers in the kubia-ssl pod. Because they both run applications that understand HTTP, it makes sense to use an HTTP GET probe in each of them. The Node.js application doesn’t provide any endpoints to explicitly check the health of the application, but the Envoy proxy does. In real-world applications, you’ll encounter both cases.
Defining liveness probes in the pod manifest
The following listing shows an updated manifest for the pod, which defines a liveness probe for each of the two containers, with different levels of configuration.
Listing 6.8 Adding a liveness probe to a pod: kubia-liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-liveness
spec:
  containers:
  - name: kubia
    image: luksa/kubia:1.0
    ports:
    - name: http
      containerPort: 8080
    livenessProbe:            #A
      httpGet:                #A
        path: /               #A
        port: 8080            #A
  - name: envoy
    image: luksa/kubia-ssl-proxy:1.0
    ports:
    - name: https
      containerPort: 8443
    - name: admin
      containerPort: 9901
    livenessProbe:            #B
      httpGet:                #B
        path: /ready          #B
        port: admin           #B
      initialDelaySeconds: 10 #B
      periodSeconds: 5        #B
      timeoutSeconds: 2       #B
      failureThreshold: 3     #B
#A The liveness probe definition for the container running Node.js
#B The liveness probe for the Envoy proxy
These liveness probes are explained in the next two sections.
Defining a liveness probe using the minimum required configuration
The liveness probe for the kubia container is the simplest version of a probe for HTTP-based applications. The probe simply sends an HTTP GET request for the path / on port 8080 to determine if the container can still serve requests. If the application responds with an HTTP status between 200 and 399, the application is considered healthy.
The probe doesn’t specify any other fields, so the default settings are used. The first request is sent 10s after the container starts and is repeated every 10s. If the application doesn’t respond within one second, the probe attempt is considered failed. If it fails three times in a row, the container is considered unhealthy and is terminated.
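If you prefer to spell out these defaults, the kubia container’s probe written with explicit values would look roughly like the following sketch. The values shown are the standard Kubernetes defaults; successThreshold is included for completeness and must be 1 for liveness probes.

livenessProbe:
  httpGet:
    path: /
    port: 8080
  periodSeconds: 10      # the probe is executed every 10 seconds (default)
  timeoutSeconds: 1      # each attempt must respond within one second (default)
  failureThreshold: 3    # three consecutive failures restart the container (default)
  successThreshold: 1    # must be 1 for liveness probes (default)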
Understanding liveness probe configuration options
The administration interface of the Envoy proxy provides the special endpoint /ready through which it exposes its health status. Instead of targeting port 8443, which is the port through which Envoy forwards HTTPS requests to Node.js, the liveness probe for the envoy container targets this special endpoint on the admin port, which is port number 9901.
NOTE
As you can see in the envoy container’s liveness probe, you can specify the probe’s target port by name instead of by number.
The liveness probe for the envoy container also contains additional fields. These are best explained with the following figure.
Figure 6.6 The configuration and operation of a liveness probe
The parameter initialDelaySeconds determines how long Kubernetes should delay the execution of the first probe after starting the container. The periodSeconds field specifies the amount of time between the execution of two consecutive probes, whereas the timeoutSeconds field specifies how long to wait for a response before the probe attempt counts as failed. The failureThreshold field specifies how many times the probe must fail in a row for the container to be considered unhealthy and potentially restarted. With the values used for the envoy container, for example, an unresponsive proxy is detected and restarted after roughly periodSeconds * failureThreshold, or about 15 seconds of consecutive probe failures.
Observing the liveness probe in action
To see Kubernetes restart a container when its liveness probe fails, create the pod from the kubia-liveness.yaml manifest file using kubectl apply, and run kubectl port-forward to enable communication with the pod. You’ll need to stop the kubectl port-forward command still running from the previous exercise. Confirm that the pod is running and is responding to HTTP requests.
Observing a successful liveness probe
The liveness probes for the pod’s containers start firing soon after each individual container starts. Since the processes in both containers are healthy, the probes continuously report success. Because this is the normal state, the fact that the probes are successful is not explicitly indicated anywhere, neither in the status of the pod nor in its events.
The only indication that Kubernetes is executing the probe is found in the container logs. The Node.js application in the kubia container prints a line to the standard output every time it handles an HTTP request. This includes the liveness probe requests, so you can display them using the following command:
$ kubectl logs kubia-liveness -c kubia -f
The liveness probe for the envoy container is configured to send HTTP requests to Envoy’s administration interface, which doesn’t log HTTP requests to the standard output, but to the file /var/log/envoy.admin.log in the container’s filesystem. To display the log file, use the following command:
$ kubectl exec kubia-liveness -c envoy -- tail -f /var/log/envoy.admin.log
Observing the liveness probe fail
A successful liveness probe isn’t interesting, so let’s cause Envoy’s liveness probe to fail. To see what will happen behind the scenes, start watching events by executing the following command in a separate terminal:
$ kubectl get events -w
Using Envoy’s administration interface, you can configure its health-check endpoint to succeed or fail. To make it fail, open the URL http://localhost:9901 in your browser and click the healthcheck/fail button, or use the following curl command:
$ curl -X POST localhost:9901/healthcheck/fail
Immediately after executing the command, observe the events that are displayed in the other terminal. When the probe fails, a Warning event is recorded, indicating the error and the HTTP status code returned:
Warning Unhealthy Liveness probe failed: HTTP probe failed with code 503
Because the probe’s failureThreshold is set to three, a single failure is not enough to consider the container unhealthy, so it continues to run. You can make the liveness probe succeed again by clicking the healthcheck/ok button in Envoy’s admin interface, or by using curl as follows:
$ curl -X POST localhost:9901/healthcheck/ok
If you are fast enough, the container won’t be restarted.
Observing the liveness probe reach the failure threshold
If you let the liveness probe fail multiple times, you should see events like the ones in the next listing (note that some columns are omitted due to page width constraints).
Listing 6.9 Events recorded when a liveness probe fails
$ kubectl get events -w
TYPE REASON MESSAGE
Warning Unhealthy Liveness probe failed: HTTP probe failed with code 503
Warning Unhealthy Liveness probe failed: HTTP probe failed with code 503
Warning Unhealthy Liveness probe failed: HTTP probe failed with code 503
Normal Killing Container envoy failed liveness probe, will be
restarted
Normal Pulled Container image already present on machine
Normal Created Created container envoy
Normal Started Started container envoy
Remember that the probe failure threshold is set to three, so when the probe fails three times in a row, the container is stopped and restarted. This is indicated by the events in the listing.
The kubectl get pods command shows that the container has been restarted:
$ kubectl get po kubia-liveness
NAME             READY   STATUS    RESTARTS   AGE
kubia-liveness   2/2     Running   1          5m
The RESTARTS column shows that one container restart has taken place in the pod.
Understanding how a container that fails its liveness probe is restarted
If you’re wondering whether the main process in the container was gracefully stopped or killed forcibly, you can check the pod’s status by retrieving the full manifest using kubectl get or using kubectl describe as shown in the following listing.
Listing 6.10 Inspecting the restarted container’s last state with kubectl describe
$ kubectl describe po kubia-liveness
Name:         kubia-liveness
...
Containers:
  ...
  envoy:
    ...
    State:          Running                              #A
      Started:      Sun, 31 May 2020 21:33:13 +0200      #A
    Last State:     Terminated                           #B
      Reason:       Completed                            #B
      Exit Code:    0                                    #B
      Started:      Sun, 31 May 2020 21:16:43 +0200      #B
      Finished:     Sun, 31 May 2020 21:33:13 +0200      #B
...
#A This is the state of the new container.
#B The previous container terminated with exit code 0.
The exit code zero shown in the listing implies that the application process gracefully exited on its own. If it had been killed, the exit code would have been 137.
NOTE
Exit code 128+n indicates that the process exited due to external signal n. Exit code 137 is 128+9, where 9 represents the KILL signal. You’ll see this exit code whenever the container is killed. Exit code 143 is 128+15, where 15 is the TERM signal. You’ll typically see this exit code when the container runs a shell that has terminated gracefully.
Let’s examine Envoy’s log to confirm that it caught the TERM signal and has terminated by itself. You must use the kubectl logs command with the --container or the shorter -c option to specify what container you’re interested in.
Also, because the container has been replaced with a new one due to the restart, you must request the log of the previous container using the --previous or -p flag. The next listing shows the full command and the last four lines of its output.
Listing 6.11 The last few lines of Envoy’s log when killed due to a failed liveness probe
$ kubectl logs kubia-liveness -c envoy -p
...
...[warning][main] [source/server/server.cc:493] caught SIGTERM
...[info][main] [source/server/server.cc:613] shutting down server instance
...[info][main] [source/server/server.cc:560] main dispatch loop exited
...[info][main] [source/server/server.cc:606] exiting
The log confirms that Kubernetes sent the TERM signal to the process, allowing it to shut down gracefully. Had it not terminated by itself, Kubernetes would have killed it forcibly.
After the container is restarted, its health-check endpoint responds with HTTP status 200 OK again, indicating that the container is healthy.
Using the exec and the tcpSocket liveness probe types
For applications that don’t expose HTTP health-check endpoints, the tcpSocket or the exec liveness probes should be used.
Adding a tcpSocket liveness probe
For applications that accept non-HTTP TCP connections, a tcpSocket liveness probe can be configured. Kubernetes tries to open a socket to the TCP port; if the connection is established, the probe is considered a success, otherwise it’s considered a failure.
An example of a tcpSocket liveness probe is shown in the following listing.
Listing 6.12 An example of a tcpSocket liveness probe
livenessProbe:
  tcpSocket:            #A
    port: 1234          #A
  periodSeconds: 2      #B
  failureThreshold: 1   #C
#A This tcpSocket probe uses TCP port 1234
#B The probe runs every 2s
#C A single probe failure is enough to restart the container
The probe in the listing is configured to check if the container’s network port 1234 is open. An attempt to establish a connection is made every two seconds and a single failed attempt is enough to consider the container as unhealthy.
Adding an exec liveness probe
Applications that do not accept TCP connections may provide a command to check their status. For these applications, an exec liveness probe is used. As shown in the next figure, the command is executed inside the container and must therefore be available on the container’s file system.
Figure 6.7 The exec liveness probe runs the command inside the container
The following listing shows an example of a probe that runs /usr/bin/healthcheck every two seconds to determine if the application running in the container is still alive.
Listing 6.13 An example of an exec liveness probe
livenessProbe:
  exec:
    command:                 #A
    - /usr/bin/healthcheck   #A
  periodSeconds: 2           #B
  timeoutSeconds: 1          #C
  failureThreshold: 1        #D
#A The command to run and its arguments
#B The probe runs every 2 seconds
#C The command must return within one second
#D A single probe failure is enough to restart the container
If the command returns exit code zero, the container is considered healthy. If it returns a non-zero exit code or fails to complete within one second as specified in the timeoutSeconds field, the container is terminated immediately, as configured in the failureThreshold field, which indicates that a single probe failure is sufficient to consider the container as unhealthy.
Using a startup probe when an application is slow to start
The default liveness probe settings give the application between 20 and 30 seconds to start responding to liveness probe requests. If the application takes longer to start, it is restarted and must start again. If the second start also takes as long, it is restarted again. If this continues, the container never reaches the state where the liveness probe succeeds and gets stuck in an endless restart loop.
To prevent this, you can increase the initialDelaySeconds, periodSeconds or failureThreshold settings to account for the long start time, but this will have a negative effect on the normal operation of the application. The higher the result of periodSeconds * failureThreshold, the longer it takes to restart the application if it becomes unhealthy. For applications that take minutes to start, increasing these parameters enough to prevent the application from being restarted prematurely may not be a viable option.
Introducing startup probes
To deal with the discrepancy between the start and the steady-state operation of an application, Kubernetes also provides startup probes.
If a startup probe is defined for a container, only the startup probe is executed when the container is started. The startup probe can be configured to take into account the slow start of the application. When the startup probe succeeds, Kubernetes switches to using the liveness probe, which is configured to quickly detect when the application becomes unhealthy.
Adding a startup probe to a pod’s manifest
Imagine that the kubia Node.js application needs more than a minute to warm up, but you want it to be restarted within 10 seconds after it has become unhealthy during normal operation. The following listing shows how you’d configure the startup and the liveness probes.
Listing 6.14 Using a combination of startup and liveness probes
containers:
- name: kubia
  image: luksa/kubia:1.0
  ports:
  - name: http
    containerPort: 8080
  startupProbe:
    httpGet:
      path: /              #A
      port: http           #A
    periodSeconds: 10      #B
    failureThreshold: 12   #B
  livenessProbe:
    httpGet:
      path: /              #A
      port: http           #A
    periodSeconds: 5       #C
    failureThreshold: 2    #C
#A The startup and the liveness probes typically use the same endpoint
#B The application gets 120 seconds to start
#C After startup, the application’s health is checked every 5 seconds, and is restarted when it fails the liveness probe twice
When the container defined in the listing starts, the application has 120 seconds to start responding to requests. Kubernetes performs the startup probe every 10 seconds and makes a maximum of 12 attempts.
As shown in the following figure, unlike liveness probes, it’s perfectly normal for a startup probe to fail. A failure only indicates that the application hasn’t yet been completely started. A successful startup probe indicates that the application has started successfully, and Kubernetes should switch to the liveness probe. The liveness probe is then typically executed using a shorter period of time, which allows for faster detection of non-responsive applications.
Figure 6.8 Fast detection of application health problems using a combination of startup and liveness probe
NOTE
If the startup probe fails often enough to reach the failureThreshold, the container is terminated as if the liveness probe had failed.
Usually, the startup and liveness probes are configured to use the same HTTP endpoint, but different endpoints can be used. You can also configure the startup probe as an exec or tcpSocket probe instead of an httpGet probe.
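For example, a startup probe that uses the exec mechanism instead of httpGet might look like the following sketch, which reuses the hypothetical /usr/bin/healthcheck command from listing 6.13 and the same 120-second startup budget as listing 6.14:

startupProbe:
  exec:
    command:
    - /usr/bin/healthcheck   # hypothetical status-check command
  periodSeconds: 10          # probe every 10 seconds during startup
  failureThreshold: 12       # gives the application up to 120 seconds to start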
Creating effective liveness probe handlers
You should define a liveness probe for all your pods. Without one, Kubernetes has no way of knowing whether your app is still alive or not, apart from checking whether the application process has terminated.
Causing unnecessary restarts with badly implemented liveness probe handlers
When you implement a handler for the liveness probe, either as an HTTP endpoint in your application or as an additional executable command, be very careful to implement it correctly. If a poorly implemented probe returns a negative response even though the application is healthy, the application will be restarted unnecessarily. Many Kubernetes users learn this the hard way. If you can make sure that the application process terminates by itself when it becomes unhealthy, it may be safer not to define a liveness probe.
What a liveness probe should check
The liveness probe for the kubia container isn’t configured to call an actual health-check endpoint, but only checks that the Node.js server responds to simple HTTP requests for the root URI. This may seem overly simple, but even such a liveness probe works wonders, because it causes a restart of the container if the server no longer responds to HTTP requests, which is its main task. If no liveness probe were defined, the pod would remain in an unhealthy state where it doesn’t respond to any requests and would have to be restarted manually. A simple liveness probe like this is better than nothing.
To provide a better liveness check, web applications typically expose a specific health-check endpoint, such as /healthz. When this endpoint is called, the application performs an internal status check of all the major components running within the application to ensure that none of them have died or are no longer doing what they should.
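If your application exposes such an endpoint, pointing the liveness probe at it instead of the root URI is a small change to the manifest. A minimal sketch, assuming the endpoint is served on the application’s regular HTTP port:

livenessProbe:
  httpGet:
    path: /healthz   # application-specific health-check endpoint
    port: http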
TIP
Make sure that the /healthz HTTP endpoint doesn’t require authentication; otherwise the probe will always fail, causing your container to be restarted indefinitely.
Make sure that the application checks only the operation of its internal components and nothing that is influenced by an external factor. For example, the health-check endpoint of a frontend service should never respond with failure when it can’t connect to a backend service. If the backend service fails, restarting the frontend will not solve the problem. Such a liveness probe will fail again after the restart, so the container will be restarted repeatedly until the backend is repaired. If many services are interdependent in this way, the failure of a single service can result in cascading failures across the entire system.
Keeping probes light
The handler invoked by a liveness probe shouldn’t use too many computing resources and shouldn’t take too long to complete. By default, probes are executed relatively often and are only given one second to complete.
Using a handler that consumes a lot of CPU or memory can seriously affect the main process of your container. Later in the book you’ll learn how to limit the CPU time and total memory available to a container. The CPU and memory consumed by the probe handler invocation count towards the resource quota of the container, so using a resource-intensive handler will reduce the CPU time available to the main process of the application.
TIP
When running a Java application in your container, you may want to use an HTTP GET probe instead of an exec liveness probe that starts an entire JVM. The same applies to commands that require considerable computing resources.
Avoiding retry loops in your probe handlers
You’ve learned that the failure threshold for the probe is configurable. Instead of implementing a retry loop in your probe handlers, keep it simple and set the failureThreshold field to a higher value so that the probe must fail several times before the application is considered unhealthy. Implementing your own retry mechanism in the handler is a waste of effort and represents another potential point of failure.
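For example, instead of a handler that internally retries the check several times, let the probe itself tolerate three consecutive failures. A sketch with illustrative values:

livenessProbe:
  exec:
    command:
    - /usr/bin/healthcheck   # hypothetical check command with no internal retries
  periodSeconds: 5
  failureThreshold: 3        # the probe must fail three times in a row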