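Resource limits
- spec.containers[].resources.limits.cpu: CPU limit. A container that tries to use more is throttled; it is not stopped.
- spec.containers[].resources.limits.memory: memory limit. It must not be exceeded; a container that exceeds it may be terminated (OOM-killed) and is then restarted according to its restart policy.
- spec.containers[].resources.limits.ephemeral-storage: limit on ephemeral storage (the container's writable layer, logs, emptyDir volumes, and so on). If it is exceeded, the Pod is evicted.
- spec.containers[].resources.requests.cpu: CPU request, which the scheduler uses when placing the Pod. Actual usage may exceed it.
- spec.containers[].resources.requests.memory: memory request, which the scheduler uses when placing the Pod. Actual usage may exceed it, but a container that does so may be evicted when the Node runs short of memory.
- spec.containers[].resources.requests.ephemeral-storage: request for ephemeral storage (the container's writable layer, logs, emptyDir volumes, and so on), which the scheduler uses when placing the Pod.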
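As a minimal sketch, the fields above could be combined in one Pod manifest as follows. The Pod name, container name, and all quantities are illustrative assumptions, not values from these notes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo          # hypothetical name
spec:
  containers:
  - name: app                  # hypothetical name
    image: nginx
    resources:
      requests:                # used by the scheduler to pick a Node
        cpu: 250m
        memory: 64Mi
        ephemeral-storage: 1Gi
      limits:                  # enforced at runtime
        cpu: 500m              # throttled above this
        memory: 128Mi          # may be OOM-killed above this
        ephemeral-storage: 2Gi # Pod is evicted above this
```

Requests should not be larger than the matching limits; setting them equal makes usage more predictable at the cost of less headroom for bursts.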
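Health checks
liveness probe: determines when to restart a container. For example, when an application is running but can no longer make progress, the liveness probe catches the deadlock and restarts the container, so the application keeps running despite the bug.
readiness probe: determines whether a container is ready to accept traffic. The kubelet considers a Pod ready only when all of its containers are ready. This signal controls which Pods serve as backends for a Service: Pods that are not ready are removed from the Service's load balancers.
startupProbe (startup probe): indicates whether the application in the container has started. If a startup probe is provided, all other probes are disabled until it succeeds. If the startup probe fails, the kubelet kills the container, and the container is restarted according to its restart policy. If a container does not provide a startup probe, the default state is Success.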
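The notes give no manifest for a startup probe, so here is a minimal sketch of one guarding a slow-starting container. The Pod name, the probed path and port, and the thresholds are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: startup-demo           # hypothetical name
spec:
  containers:
  - name: app                  # hypothetical name
    image: nginx
    startupProbe:              # liveness checks wait until this succeeds
      httpGet:
        path: /
        port: 80
      failureThreshold: 30     # 30 attempts x 10 s = up to 300 s to start
      periodSeconds: 10
    livenessProbe:             # takes over once the startup probe has passed
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
```

With these values the application gets up to about five minutes to start before the kubelet kills the container, while the liveness probe can still react quickly afterwards.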
Liveness probes
1. Define a liveness exec probe
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        # The kubelet runs `cat /tmp/healthy` inside the container. If the command
        # succeeds and returns 0, the kubelet considers the container alive and healthy.
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
Once the probe fails, the container is restarted:
# kubectl describe pod liveness-exec
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m2s default-scheduler Successfully assigned default/liveness-exec to k8s-node1
Normal Pulled 104s kubelet Successfully pulled image "busybox" in 15.638033484s
Warning Unhealthy 62s (x3 over 72s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 62s kubelet Container liveness failed liveness probe, will be restarted
Normal Pulling 32s (x2 over 119s) kubelet Pulling image "busybox"
Normal Created 29s (x2 over 104s) kubelet Created container liveness
Normal Started 29s (x2 over 103s) kubelet Started container liveness
Normal Pulled 29s kubelet Successfully pulled image "busybox" in 2.736225925s
2. Define a liveness HTTP probe
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - name: liveness
    image: fkconsultin/liveness-testing
    args:
    - /server
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 3
      periodSeconds: 3
When the HTTP liveness probe fails, the container is restarted:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m47s default-scheduler Successfully assigned default/liveness-http to k8s-node1
Normal Pulled 117s kubelet Successfully pulled image "fkconsultin/liveness-testing" in 48.941841972s
Normal Pulled 61s kubelet Successfully pulled image "fkconsultin/liveness-testing" in 40.836616911s
Normal Created 36s (x3 over 117s) kubelet Created container liveness
Normal Started 36s (x3 over 117s) kubelet Started container liveness
Normal Pulled 36s kubelet Successfully pulled image "fkconsultin/liveness-testing" in 8.912067487s
Warning Unhealthy 26s (x9 over 113s) kubelet Liveness probe failed: HTTP probe failed with statuscode: 404
Normal Killing 26s (x3 over 107s) kubelet Container liveness failed liveness probe, will be restarted
Normal Pulling 21s (x4 over 2m46s) kubelet Pulling image "fkconsultin/liveness-testing"
3. Define a liveness TCP probe
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14-alpine
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket: # checks whether the given TCP port accepts connections
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20
After 15 seconds, check the Pod events to see the liveness probe at work:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 87s default-scheduler Successfully assigned default/nginx to k8s-node1
Normal Killing 28s kubelet Container nginx failed liveness probe, will be restarted
Normal Pulled 27s (x2 over 86s) kubelet Container image "nginx:1.14-alpine" already present on machine
Normal Created 27s (x2 over 86s) kubelet Created container nginx
Normal Started 27s (x2 over 86s) kubelet Started container nginx
Warning Unhealthy 8s (x9 over 78s) kubelet Readiness probe failed: dial tcp 10.244.36.89:8080: connect: connection refused
Warning Unhealthy 8s (x4 over 68s) kubelet Liveness probe failed: dial tcp 10.244.36.89:8080: connect: connection refused
Readiness probes
A readiness probe is configured much like a liveness probe. The only difference is that you use readinessProbe instead of livenessProbe.
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
Configure Probes
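Probes have a number of fields that let you control the behavior of liveness and readiness checks precisely:
- initialDelaySeconds: number of seconds to wait after the container starts before the first probe runs.
- periodSeconds: how often the probe runs. Defaults to 10 seconds; the minimum is 1 second.
- timeoutSeconds: number of seconds after which the probe times out. Defaults to 1 second; the minimum is 1 second.
- successThreshold: minimum number of consecutive successes for the probe to be considered successful after it has failed. Defaults to 1. Must be 1 for liveness. The minimum is 1.
- failureThreshold: minimum number of consecutive failures for the probe to be considered failed after it has succeeded. Defaults to 3. The minimum is 1.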
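To show how these fields fit together, here is a sketch of a liveness probe that sets all five explicitly. The path, port, and timing values are assumptions for illustration only.

```yaml
livenessProbe:
  httpGet:
    path: /healthz             # assumed health endpoint
    port: 8080
  initialDelaySeconds: 5       # wait 5 s after the container starts
  periodSeconds: 10            # then probe every 10 s
  timeoutSeconds: 2            # a probe with no answer within 2 s counts as a failure
  successThreshold: 1          # must be 1 for liveness
  failureThreshold: 3          # restart after 3 consecutive failures
```

With these values, a container that stops responding is restarted after three failed checks spaced 10 seconds apart.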
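For HTTP probes, httpGet accepts these additional fields:
- host: host name to connect to. Defaults to the Pod's IP. You will probably want to set "Host" in the HTTP headers instead of using this field.
- scheme: scheme used to connect. Defaults to HTTP.
- path: path to request on the HTTP server.
- httpHeaders: custom headers to set in the request. HTTP allows repeated headers.
- port: name or number of the container port to access. A port number must be between 1 and 65535.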
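The following container-spec fragment sketches these fields in use: an HTTPS readiness probe that refers to the port by name and sets a Host header. The port name, port number, path, and host value are hypothetical.

```yaml
ports:
- name: https-port             # hypothetical port name
  containerPort: 8443
readinessProbe:
  httpGet:
    scheme: HTTPS
    path: /healthz             # assumed health endpoint
    port: https-port           # refers to the named container port above
    httpHeaders:
    - name: Host
      value: example.internal  # hypothetical virtual host
```

Referring to the port by name means the probe keeps working if the port number changes later.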
Startup and shutdown actions
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: nginx
    lifecycle:
      postStart:
        exec:
          command:
          - "/bin/sh"
          - "-c"
          - "echo Hello from the postStart handler > /usr/share/message"
      preStop:
        exec:
          command:
          - "/bin/sh"
          - "-c"
          - "echo Hello from the preStop handler > /usr/share/message"