1 Taints
1.1 Introduction to Taints
The affinity scheduling approaches all work from the Pod's perspective: attributes are added to the Pod to steer it toward particular nodes. The same problem can also be approached from the Node's perspective, by setting attributes on the Node that decide whether Pods may be scheduled onto it. This is what taints do.
Once a taint is set on a Node, there is a repelling relationship between that Node and Pods: the Node can refuse new Pods, and can even evict Pods that are already running on it.
A taint has the format key=value:effect, where key and value label the taint and effect describes what the taint does. Three effects are supported (a sketch of how a taint is stored in the Node object follows the list):
- PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a Node with this taint, unless no other node is schedulable
- NoSchedule: Kubernetes will not schedule new Pods onto a Node with this taint, but Pods already running on the Node are unaffected
- NoExecute: Kubernetes will not schedule new Pods onto a Node with this taint, and it also evicts Pods already running on the Node
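Under the hood, a taint is stored in the Node object's spec.taints field. As a sketch, the name=nginx:PreferNoSchedule taint used in the example below appears there as:

spec:
  taints:
  - key: name               # taint key
    value: nginx            # taint value
    effect: PreferNoSchedule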
1.2 Taint Commands
# Set a taint
$ kubectl taint nodes node1 key=value:effect
# Remove the taint with the given key and effect
$ kubectl taint nodes node1 key:effect-
# Remove all taints with the given key (every effect)
$ kubectl taint nodes node1 key-
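After changing taints, it is handy to check what is currently set on a node by reading the spec.taints field directly; a one-liner, assuming the node is named node1:

# print the node's taints as JSON
$ kubectl get node node1 -o jsonpath='{.spec.taints}'

The output is an array similar to [{"effect":"PreferNoSchedule","key":"name","value":"nginx"}], depending on what is set.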
1.3 Taint Example
1) Add a taint to node1 so that the scheduler avoids it where possible
[root@master resource_manage]# kubectl taint nodes node1 name=nginx:PreferNoSchedule
node/node1 tainted
2) Create an nginx Pod
[root@master resource_manage]# kubectl run nginx --image=nginx:1.17.1 --port=80
pod/nginx created
3) Check which node the Pod was scheduled to
[root@master resource_manage]# kubectl get pod -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          7s    10.244.2.48   node2   <none>           <none>
As you can see, the Pod goes straight to node2 and stays off node1. Because PreferNoSchedule is only a soft rule, though, if node2 went down and node1 were the only node left alive, the Pod would still be scheduled onto node1.
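To contrast the effects: had node2 been given a NoExecute taint instead, the running nginx Pod would have been evicted rather than left alone. A sketch, not part of this walkthrough:

# NoExecute evicts running Pods that do not tolerate the taint
$ kubectl taint nodes node2 name=nginx:NoExecute
$ kubectl get pod -o wide   # nginx is terminated; as a bare Pod it is not recreated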
1.4 Viewing a Node's Taints
[root@master resource_manage]# kubectl describe node node1
Name:               node1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=node1
                    kubernetes.io/os=linux
                    nodeenv=test
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"ba:fe:1f:25:fe:26"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.16.41
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 14 Mar 2022 14:41:02 +0800
Taints:             name=nginx:PreferNoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  node1
  AcquireTime:     <unset>
  RenewTime:       Sat, 26 Mar 2022 00:00:54 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 14 Mar 2022 14:43:39 +0800   Mon, 14 Mar 2022 14:43:39 +0800   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Fri, 25 Mar 2022 23:58:57 +0800   Mon, 14 Mar 2022 14:41:02 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Fri, 25 Mar 2022 23:58:57 +0800   Mon, 14 Mar 2022 14:41:02 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Fri, 25 Mar 2022 23:58:57 +0800   Mon, 14 Mar 2022 14:41:02 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Fri, 25 Mar 2022 23:58:57 +0800   Mon, 14 Mar 2022 14:43:42 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.16.41
  Hostname:    node1
Capacity:
  cpu:                8
  ephemeral-storage:  208357992Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32882960Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  192022725110
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32780560Ki
  pods:               110
System Info:
  Machine ID:                 f9c2b25f57184e06b8855490b4be6013
  System UUID:                d1042642-3933-564f-4f2d-279b5e96cead
  Boot ID:                    8517c1cc-8935-452e-9efb-a34f396b98a5
  Kernel Version:             5.4.179-200.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.9
  Kubelet Version:            v1.21.2
  Kube-Proxy Version:         v1.21.2
PodCIDR:                      10.244.1.0/24
PodCIDRs:                     10.244.1.0/24
Non-terminated Pods:          (4 in total)
  Namespace             Name                                       CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------             ----                                       ------------  ----------  ---------------  -------------  ---
  kube-system           kube-flannel-ds-gg4jq                      100m (1%)     100m (1%)   50Mi (0%)        50Mi (0%)      11d
  kube-system           kube-proxy-tqzjl                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         11d
  kubernetes-dashboard  dashboard-metrics-scraper-c45b7869d-7ll25  0 (0%)        0 (0%)      0 (0%)           0 (0%)         11d
  kubernetes-dashboard  kubernetes-dashboard-79b5779bf4-t28b4      0 (0%)        0 (0%)      0 (0%)           0 (0%)         11d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (1%)  100m (1%)
  memory             50Mi (0%)  50Mi (0%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:              <none>
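The describe output is long; if you only care about the taints, filtering works just as well (same node):

[root@master resource_manage]# kubectl describe node node1 | grep Taints
Taints:             name=nginx:PreferNoSchedule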
1.5 Removing a Taint
$ kubectl taint nodes node1 name:PreferNoSchedule-
node/node1 untainted
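Confirm it is gone the same way as above; with no taints left, describe reports <none>:

$ kubectl describe node node1 | grep Taints
Taints:             <none>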
1.6 Why Are Pods Never Scheduled to the master Node?
The following command shows that the master node carries a default taint of type node-role.kubernetes.io/master:NoSchedule, which is why newly created Pods are never scheduled onto it.
[root@master resource_manage]# kubectl describe nodes master
Name:               master
Roles:              control-plane,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=master
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=
                    node-role.kubernetes.io/master=
                    node.kubernetes.io/exclude-from-external-load-balancers=
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"02:f6:8e:03:60:51"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.16.40
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 14 Mar 2022 14:38:03 +0800
Taints:             node-role.kubernetes.io/master:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  master
  AcquireTime:     <unset>
  RenewTime:       Sat, 26 Mar 2022 00:05:31 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 14 Mar 2022 14:42:58 +0800   Mon, 14 Mar 2022 14:42:58 +0800   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Sat, 26 Mar 2022 00:01:28 +0800   Mon, 14 Mar 2022 14:38:02 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Sat, 26 Mar 2022 00:01:28 +0800   Mon, 14 Mar 2022 14:38:02 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Sat, 26 Mar 2022 00:01:28 +0800   Mon, 14 Mar 2022 14:38:02 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Sat, 26 Mar 2022 00:01:28 +0800   Mon, 14 Mar 2022 14:43:03 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.16.40
  Hostname:    master
Capacity:
  cpu:                8
  ephemeral-storage:  208357992Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32882960Ki
  pods:               110
Allocatable:
  cpu:                8
  ephemeral-storage:  192022725110
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             32780560Ki
  pods:               110
System Info:
  Machine ID:                 f9c2b25f57184e06b8855490b4be6013
  System UUID:                c5d32642-f84c-61ef-ac7f-d65ae6880a51
  Boot ID:                    9cbc9b25-2cf2-42d8-aa89-1fdab687c447
  Kernel Version:             5.4.179-200.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.9
  Kubelet Version:            v1.21.2
  Kube-Proxy Version:         v1.21.2
PodCIDR:                      10.244.0.0/24
PodCIDRs:                     10.244.0.0/24
Non-terminated Pods:          (6 in total)
  Namespace    Name                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------    ----                             ------------  ----------  ---------------  -------------  ---
  kube-system  etcd-master                      100m (1%)     0 (0%)      100Mi (0%)       0 (0%)         11d
  kube-system  kube-apiserver-master            250m (3%)     0 (0%)      0 (0%)           0 (0%)         11d
  kube-system  kube-controller-manager-master   200m (2%)     0 (0%)      0 (0%)           0 (0%)         11d
  kube-system  kube-flannel-ds-n76xj            100m (1%)     100m (1%)   50Mi (0%)        50Mi (0%)      11d
  kube-system  kube-proxy-h27ms                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         11d
  kube-system  kube-scheduler-master            100m (1%)     0 (0%)      0 (0%)           0 (0%)         11d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                750m (9%)   100m (1%)
  memory             150Mi (0%)  50Mi (0%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:              <none>
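If you do want ordinary workloads on the master (common in single-node test clusters), the taint can simply be removed. A sketch:

# allow regular Pods to be scheduled onto the master
$ kubectl taint nodes master node-role.kubernetes.io/master:NoSchedule-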
2 Tolerations
2.1 Introduction to Tolerations
Sometimes a node has been tainted, but you still want certain Pods to be able to land on it. That is what tolerations are for: a taint means rejection, a toleration means ignoring/allowing that rejection. The Node uses a taint to refuse Pods, and a Pod uses a toleration to ignore the refusal, as the hands-on example below shows.
2.2 Tolerations in Practice
1) Give node1 a NoSchedule taint
For this demonstration, keep node1 as the only available node and shut the others down.
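(If powering nodes off is inconvenient, cordoning them keeps them out of scheduling just as well for this demo; hypothetical second node node2:)

# mark node2 unschedulable without shutting it down
$ kubectl cordon node2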
[root@master resource_manage]# kubectl taint nodes node1 name=nginx:NoSchedule
node/node1 tainted
2) Write a pod_toleration.yaml file that carries a toleration
apiVersion: v1
kind: Namespace
metadata:
  name: dev
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
  tolerations:              # tolerate the name=nginx:NoSchedule taint set above
  - key: "name"
    operator: "Equal"
    value: "nginx"
    effect: "NoSchedule"
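Note that with operator "Equal", the toleration applies only when key, value, and effect all match the taint exactly, which is why the three values above mirror name=nginx:NoSchedule.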
3) Create the resources
[root@master resource_manage]# kubectl apply -f pod_toleration.yaml
namespace/dev created
pod/nginx-pod created
4) Verify
Query the Pod with the following command; despite the taint, it is scheduled onto node1:
[root@master resource_manage]# kubectl get pod -n dev -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
nginx-pod   1/1     Running   0          13s   10.244.2.49   node1   <none>           <none>
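For contrast, the same Pod without the toleration would have nowhere to go and would stay Pending. A sketch of the expected behavior (hypothetical Pod name):

# this Pod has no toleration, so no node accepts it
[root@master resource_manage]# kubectl run nginx-no-toleration --image=nginx:1.17.1 -n dev
[root@master resource_manage]# kubectl get pod nginx-no-toleration -n dev
NAME                  READY   STATUS    RESTARTS   AGE
nginx-no-toleration   0/1     Pending   0          10s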
2.3 Toleration Field Reference
The documentation for each field can be viewed with the following command:
[root@master resource_manage]# kubectl explain pod.spec.tolerations
KIND:     Pod
VERSION:  v1

RESOURCE: tolerations <[]Object>

DESCRIPTION:
     If specified, the pod's tolerations.

     The pod this Toleration is attached to tolerates any taint that matches
     the triple <key,value,effect> using the matching operator <operator>.

FIELDS:
   effect       <string>
     Effect indicates the taint effect to match. Empty means match all taint
     effects. When specified, allowed values are NoSchedule, PreferNoSchedule
     and NoExecute.

   key          <string>
     Key is the taint key that the toleration applies to. Empty means match all
     taint keys. If the key is empty, operator must be Exists; this combination
     means to match all values and all keys.

   operator     <string>
     Operator represents a key's relationship to the value. Valid operators are
     Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for
     value, so that a pod can tolerate all taints of a particular category.

   tolerationSeconds    <integer>
     TolerationSeconds represents the period of time the toleration (which must
     be of effect NoExecute, otherwise this field is ignored) tolerates the
     taint. By default, it is not set, which means tolerate the taint forever
     (do not evict). Zero and negative values will be treated as 0 (evict
     immediately) by the system.

   value        <string>
     Value is the taint value the toleration matches to. If the operator is
     Exists, the value should be empty, otherwise just a regular string.
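Putting the less common fields together, here is a sketch (hypothetical values) of a toleration that uses Exists and tolerationSeconds:

tolerations:
- key: "name"              # match any taint whose key is "name", whatever its value
  operator: "Exists"       # Exists ignores value entirely
  effect: "NoExecute"
  tolerationSeconds: 3600  # put up with the NoExecute taint for 1 hour, then be evicted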