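Principle
In Kubernetes, the HPA is driven by a controller. The controller loops over all HPA objects at a fixed interval (15s by default) and checks whether the metrics monitored by each HPA meet a scaling condition. When a condition is met, the controller sends a request to the Kubernetes API to change the replica-count field in the scale subresource of the scaling target (Deployment, StatefulSet, ReplicationController, or ReplicaSet). Kubernetes handles the request, updates the Scale structure, and then refreshes the pod count of the scaling target. Once the target has been modified, its own controller picks up the change through the list/watch mechanism and adds or removes pods, which is how dynamic scaling is achieved.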
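The decision made on each loop can be illustrated with the formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with the result clamped to the minReplicas/maxReplicas range. The following is a minimal sketch, assuming integer metric values such as CPU utilization percentages. The function name `desired_replicas` is hypothetical, and the sketch leaves out the controller's tolerance band, pod readiness handling, and stabilization window.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_METRIC TARGET_METRIC MIN MAX
# Hypothetical helper: prints ceil(current * metric / target), clamped to [MIN, MAX].
desired_replicas() {
  current=$1; metric=$2; target=$3; min=$4; max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 1 replica at 200% utilization against a 50% target
desired_replicas 1 200 50 1 10
```

With 0% utilization the raw result is 0, which is below the minimum, so the sketch returns the minimum of 1.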
Prerequisite: install metrics-server
View the HPA configuration
kubectl get hpa
View node resource usage
kubectl top node
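2. Run a load test
Start a temporary busybox pod with an interactive shell:
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh
Inside that shell, request the service under test in a loop (10.105.5.65 is the service address used in this example):
while true; do wget -q -O- http://10.105.5.65; done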
3. Custom HPA
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2021-11-12T10:24:12Z","reason":"ReadyForNewScale","message":"recommended
      size matches current size"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2021-11-13T04:52:54Z","reason":"ValidMetricFound","message":"the
      HPA was able to successfully calculate a replica count from cpu resource utilization
      (percentage of request)"},{"type":"ScalingLimited","status":"True","lastTransitionTime":"2021-11-13T05:52:31Z","reason":"TooFewReplicas","message":"the
      desired replica count is less than the minimum replica count"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"1m"}}]'
  creationTimestamp: "2021-11-12T10:23:56Z"
  name: php-apache
  namespace: default
  resourceVersion: "57579"
  uid: 5bdf76cb-fcfd-4761-91a4-24e08cf88e23
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  targetCPUUtilizationPercentage: 50
  # Add other configuration limits here
status:
  currentCPUUtilizationPercentage: 0
  currentReplicas: 1
  desiredReplicas: 1
  lastScaleTime: "2021-11-13T05:50:00Z"
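The manifest above is a dump of a live object, so it carries server-populated fields (annotations, creationTimestamp, resourceVersion, uid, status) that are not needed when creating an HPA. The following is a minimal sketch that prints only the fields a user would write, assuming the same php-apache Deployment in the default namespace. The function name `render_hpa` and its argument order are hypothetical.

```shell
#!/bin/sh
# render_hpa NAME MIN MAX CPU_PERCENT
# Hypothetical helper: prints a minimal autoscaling/v1 HPA manifest
# targeting a Deployment that has the same name as the HPA.
render_hpa() {
  cat <<EOF
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: $1
  namespace: default
spec:
  minReplicas: $2
  maxReplicas: $3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  targetCPUUtilizationPercentage: $4
EOF
}

# Same values as the php-apache example
render_hpa php-apache 1 10 50
```

On a cluster, the output could be piped to `kubectl apply -f -`. The imperative form `kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10` should produce an equivalent object.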
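4. HPA best practices
- Configure CPU requests for containers
- Set the HPA target appropriately, e.g. 70%, which leaves 30% headroom for the container and the application
- Keep Pods and Nodes healthy (avoid frequent Pod recreation)
- Make sure user requests are load-balanced
- Use kubectl top node and kubectl top pod to check resource usage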
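The first two recommendations are linked: with a CPU utilization target, utilization is measured as a percentage of the container's CPU request, so without a request the HPA has nothing to compute the percentage against. The following is a small sketch of that arithmetic, assuming values in millicores. `cpu_utilization_pct` is a hypothetical helper, and the 500m request is an example value, not taken from the setup above.

```shell
#!/bin/sh
# cpu_utilization_pct USAGE_MILLICORES REQUEST_MILLICORES
# Hypothetical helper: prints usage as an integer percentage of the request.
cpu_utilization_pct() {
  usage_m=$1; request_m=$2
  if [ "$request_m" -le 0 ]; then
    # Without a CPU request there is no baseline for a percentage.
    echo "no CPU request set; utilization undefined" >&2
    return 1
  fi
  echo $(( usage_m * 100 / request_m ))
}

# 350m used out of a 500m request: exactly at a 70% target
cpu_utilization_pct 350 500
```

Under these assumptions, a 70% target on a 500m request means scale-out is considered once average usage per pod rises above roughly 350m.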