
Pod Scheduling (Node Assignment, Taints, Affinity)

Author: 小李飞刀_lql | Published 2021-11-15 06:43

The Workflow of Creating a Pod

Key Concepts

001 Kubernetes uses a controller architecture built on the list-watch mechanism, which decouples the interactions between components
002 Each component watches the resources it is responsible for; when those resources change, kube-apiserver notifies the component. The process works like publish/subscribe; the watch example below shows the same mechanism from the command line
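The watch half of list-watch is easy to observe from the command line: kubectl can hold a watch open against kube-apiserver in the same way the controllers do (a minimal sketch; any resource type works):

#List the current Pods, then keep the connection open and print
#a new line every time a watched Pod is added, modified, or deleted
kubectl get pods --watch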

Workflow Diagram

[Figure: Pod creation workflow]

Main Pod Properties That Affect Scheduling

Resource-based scheduling

resources: {}

Scheduling policies

schedulerName: default-scheduler
nodeName: ""
nodeSelector: {}
affinity: {}
tolerations: []
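For orientation, here is a sketch of where these fields sit in a Pod spec (the pod name is a placeholder); each field is covered in its own section below:

apiVersion: v1
kind: Pod
metadata:
  name: sched-demo                    # hypothetical name
spec:
  schedulerName: default-scheduler    # which scheduler is responsible for this Pod
  # nodeName: k8snode1                # would bind the Pod to a node directly, bypassing the scheduler
  nodeSelector: {}                    # exact node-label match
  affinity: {}                        # node affinity rules
  tolerations: []                     # taints this Pod is willing to tolerate
  containers:
  - name: web
    image: nginx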

How Resource Limits Affect Pod Scheduling

Relevant Parameters

001 Container resource limits
• resources.limits.cpu
• resources.limits.memory
002 The minimum resources a container needs, used as the basis for resource allocation when the container is scheduled
• resources.requests.cpu
• resources.requests.memory
003 CPU units: either millicores (m) or a decimal number, e.g. 0.5 = 500m, 1 = 1000m (see the equivalence sketch below)
004 K8s uses the request values to find a Node with enough free resources to schedule the Pod onto
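The two CPU notations are interchangeable; a minimal sketch (the pod and container names are hypothetical) with two containers that each request exactly half a core:

apiVersion: v1
kind: Pod
metadata:
  name: cpu-units-demo
spec:
  containers:
  - name: half-core-decimal
    image: nginx
    resources:
      requests:
        cpu: "0.5"            # decimal notation
  - name: half-core-millicores
    image: nginx
    resources:
      requests:
        cpu: "500m"           # millicore notation, identical to 0.5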

Example 1: Define a Pod That Cannot Be Scheduled

apiVersion: v1
kind: Pod
metadata:
  name: pod-resource2 
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests: 
        memory: "4Gi"
        cpu: "2000m"
        
------------------------------------------------------------------------------
[root@k8smaster pod]# kubectl apply -f pod-resource2.yaml 
pod/pod-resource2 created

#Pending: there are two possible causes
#001 no node has suitable resources to allocate
#002 the image is still being pulled
[root@k8smaster pod]# kubectl get pod
NAME            READY   STATUS    RESTARTS   AGE
pod-resource2   0/1     Pending   0          56s

#Check why it cannot be scheduled: insufficient memory
[root@k8smaster pod]# kubectl describe pod pod-resource2
0/2 nodes are available: 2 Insufficient memory. 

#Check the node's resource allocation
[root@k8smaster pod]# kubectl describe node k8snode1 

#Allocatable resources
Allocatable:
  cpu:                4
  ephemeral-storage:  15258982785
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2759772Ki
  pods:               110
#Resources already reserved by existing Pods
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                550m (13%)  100m (2%)
  memory             200Mi (7%)  400Mi (14%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)  
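Comparing the two blocks explains the Pending state: the node's allocatable memory is 2759772Ki (roughly 2.6Gi), so the Pod's 4Gi request can never fit, no matter how little is currently reserved.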

Example 2: Resource Limits

apiVersion: v1
kind: Pod
metadata:
  name: pod-resource 
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:   # minimum resource request for the container
        memory: "64Mi"
        cpu: "250m"
      limits:     # maximum resources the container may use
        memory: "128Mi"
        cpu: "500m"
        
-----------------------------------------------------------------------------
[root@k8smaster pod]# kubectl apply -f pod-resource.yaml 
pod/pod-resource created
[root@k8smaster pod]# kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
pod-resource   1/1     Running   0          50s

nodeSelector & nodeAffinity

nodeSelector Overview

001 Schedules a Pod onto Nodes whose labels match; if no node carries a matching label, scheduling fails
002 Constrains a Pod to run on particular nodes
003 Matches node labels exactly

nodeSelector Use Cases

001 Dedicated nodes: group Nodes by business line
002 Special hardware: some Nodes have SSDs or GPUs

nodeSelector example: ensure the Pod is scheduled onto a node with an SSD

Add a label to the node

Format: kubectl label nodes <node-name> <label-key>=<label-value>

[root@k8smaster pod]# kubectl label nodes k8snode1 disktype=ssd 
node/k8snode1 labeled

#Verify
[root@k8smaster pod]# kubectl get nodes --show-labels |grep k8snode1
k8snode1    Ready    node     14d   v1.19.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=k8snode1,kubernetes.io/os=linux,node-role.kubernetes.io/node=

#yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeselector
spec:
  nodeSelector:
    disktype: "ssd"
  containers:
  - name: web
    image: nginx
    
------------------------------------------------------------------
[root@k8smaster pod]# kubectl get pod
NAME               READY   STATUS    RESTARTS   AGE
pod-nodeselector   1/1     Running   0          18s

001 The Pod can only be scheduled onto nodes carrying the disktype: "ssd" label, as the -o wide check below confirms
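To confirm which node the Pod actually landed on, add -o wide for an extra NODE column (output omitted here):

kubectl get pod pod-nodeselector -o wide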


#yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeselector2
spec:
  nodeSelector:
    gpu: "nvidia"
  containers:
  - name: web
    image: nginx
-----------------------------------------------------------------------
[root@k8smaster pod]# kubectl apply -f pod-nodeselector2.yaml 
pod/pod-nodeselector2 created

#Cannot be scheduled
[root@k8smaster pod]# kubectl get pod
NAME                READY   STATUS    RESTARTS   AGE
pod-nodeselector2   0/1     Pending   0          10s

#Reason
[root@k8smaster pod]# kubectl describe pod pod-nodeselector2
0/2 nodes are available: 2 node(s) didn't match node selector.

nodeAffinity Overview

001 Node affinity: serves the same purpose as nodeSelector, but is more flexible
002 Prefers nodes that satisfy the rules; if none truly do, the Pod can still settle for another node (under the soft policy)
003 Supports richer matching logic than exact string equality
004 Rules come in soft and hard variants, rather than everything being a hard requirement
    • Hard (required): must be satisfied
    • Soft (preferred): satisfied if possible, but not guaranteed
005 Operators: In, NotIn, Exists, DoesNotExist, Gt, Lt (sketched below)
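The examples below only exercise In, so here is a sketch of the remaining operators (the disktype, gpu, and cpu-cores label keys are hypothetical); note that Gt and Lt read the label value as an integer:

#Fragment of spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution
nodeSelectorTerms:
- matchExpressions:          # all expressions in one term must match (AND)
  - key: disktype
    operator: Exists         # the node has the label, regardless of its value
  - key: gpu
    operator: NotIn          # the label value is none of those listed
    values:
    - amd
  - key: cpu-cores
    operator: Gt             # the label value, as an integer, is greater than 8
    values:
    - "8"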

nodeAffinity Example 1: Soft Policy

apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity
spec:
  affinity:
    nodeAffinity:
      #requiredDuringSchedulingIgnoredDuringExecution:
      #  nodeSelectorTerms:
      #  - matchExpressions:
      #    - key: gpu
      #      operator: In
      #      values:
      #      - nvidia-tesla
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: gpu
            operator: In
            values:
            - nvidia
  containers:
  - name: web
    image: nginx
    
---------------------------------------------------------------------------
#No node satisfies the preference, but because it is a soft policy the Pod runs anyway
[root@k8smaster pod]# kubectl apply -f pod-node-affinity.yaml 
pod/pod-node-affinity created
[root@k8smaster pod]# kubectl get pod
NAME                READY   STATUS    RESTARTS   AGE
pod-node-affinity   1/1     Running   0          30s

nodeAffinity Example 2: Hard Policy

apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity2
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu
            operator: In
            values:
            - nvidia
      #preferredDuringSchedulingIgnoredDuringExecution:
      #- weight: 1
      #  preference:
      #    matchExpressions:
      #    - key: gpu
      #      operator: In
      #      values:
      #      - nvidia
  containers:
  - name: web
    image: nginx
    
------------------------------------------------------------------------------
[root@k8smaster pod]# kubectl apply -f pod-node-affinity2.yaml 
pod/pod-node-affinity2 created
#Cannot be scheduled
[root@k8smaster pod]# kubectl get pod
NAME                 READY   STATUS    RESTARTS   AGE
pod-node-affinity2   0/1     Pending   0          5s
#Reason
[root@k8smaster pod]# kubectl describe pod pod-node-affinity2
0/2 nodes are available: 2 node(s) didn't match node selector.

Taints and Tolerations

Overview

Taints
001 Keep Pods off particular Nodes
002 Once a node is tainted, no Pod is scheduled onto it unless the Pod declares a matching toleration
Tolerations
001 Allow a Pod to be scheduled onto Nodes that carry Taints
002 A pessimistic model: a tainted node rejects every Pod by default, but Pods configured to tolerate the taint can still be placed there

Use Cases

001 Dedicated nodes: group Nodes by business line; by default no Pod is scheduled onto them, and only Pods with a matching toleration may be placed there
002 Special hardware: some Nodes have SSDs or GPUs; by default no Pod is scheduled onto them, and only Pods with a matching toleration may be placed there
003 Taint-based eviction

Add a Taint to a Node

Format: kubectl taint node [node] key=value:[effect]

Where [effect] can be:
• NoSchedule: Pods that do not tolerate the taint are never scheduled onto the node
• PreferNoSchedule: the scheduler tries to avoid the node, but tolerating the taint is not strictly required
• NoExecute: new Pods are not scheduled onto the node, and existing Pods without a matching toleration are evicted (see the sketch below)
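A NoExecute toleration may also carry tolerationSeconds, which lets an already-running Pod stay on a freshly tainted node for a bounded grace period before it is evicted. A minimal sketch (the maintenance key/value pair is hypothetical):

#Tainting a node with NoExecute evicts running Pods that do not tolerate it
kubectl taint node k8snode1 maintenance=true:NoExecute

#Pod spec fragment: tolerate the taint, but only for 60 seconds
tolerations:
- key: "maintenance"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"
  tolerationSeconds: 60      # evicted 60s after the taint appears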

#Add the taint
[root@k8smaster pod]# kubectl taint node k8snode1 gpu=yes:NoSchedule 
node/k8snode1 tainted
#View the taints
[root@k8smaster pod]# kubectl describe node k8snode1 |grep Taint  
Taints:             gpu=yes:NoSchedule

#Remove the taint (key:[effect]- removes a specific taint; key- alone removes all taints with that key):
kubectl taint node [node] key:[effect]-
[root@k8smaster pod]# kubectl taint node k8snode1 gpu-
node/k8snode1 untainted
[root@k8smaster pod]# kubectl describe node k8snode1 |grep Taint  
Taints:             <none>

Example: Test Taints and Tolerations

apiVersion: v1
kind: Pod
metadata:
  name: pod2
spec:
  containers:
  - name: web
    image: nginx
    
-------------------------------------------------------------------------------------
#Cannot be scheduled
[root@k8smaster pod]# kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
pod2   0/1     Pending   0          7s
[root@k8smaster pod]# kubectl describe pod pod2
0/2 nodes are available: 1 node(s) had taint {gpu: yes}, that the pod didn't tolerate, 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.


#Configure a toleration (note: the value below does not match the taint)
apiVersion: v1
kind: Pod
metadata:
  name: pod3
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "nvidia"
    effect: "NoSchedule"
  containers:
  - name: web
    image: nginx
---------------------------------------------------------------------
#Cannot be scheduled, because the toleration's value does not match the taint's
[root@k8smaster pod]# kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
pod3   0/1     Pending   0          4s

#Configure a toleration whose key, value, and effect all match the taint
apiVersion: v1
kind: Pod
metadata:
  name: pod5
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "yes"
    effect: "NoSchedule"
  containers:
  - name: web
    image: nginx

-------------------------------------------------------------------
[root@k8smaster pod]# kubectl apply -f pod5.yaml  
pod/pod5 created
#Scheduled successfully
[root@k8smaster pod]# kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
pod5   1/1     Running   0          29s
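Both tolerations above use operator: "Equal", which must match the taint's key, value, and effect exactly. With operator: "Exists" the value field is omitted and any value for the key is tolerated; a minimal sketch (the pod name is hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: pod-tolerate-any-gpu
spec:
  tolerations:
  - key: "gpu"
    operator: "Exists"       # tolerates gpu=<any value>:NoSchedule
    effect: "NoSchedule"
  containers:
  - name: web
    image: nginx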

nodeName

Key Concepts

001 Specifies a node by name; the Pod is placed onto that Node directly, bypassing the scheduler
002 Even if the node carries taints, the Pod is still scheduled there

Example

apiVersion: v1
kind: Pod
metadata:
  name: pod6
spec:
  nodeName: k8snode1
  containers:
  - name: web
    image: nginx
-------------------------------------------------------------
#k8snode1 carries a taint
[root@k8smaster pod]# kubectl describe node k8snode1 |grep Taint  
Taints:             gpu=yes:NoSchedule

[root@k8smaster pod]# kubectl apply -f pod6.yaml 
pod/pod6 created
#Scheduled successfully
[root@k8smaster pod]# kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
pod6   1/1     Running   0          5s
