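Kubernetes (k8s) probes support three check mechanisms

  1. HTTP GET: send a GET request to the container's IP address, port, and path, and judge the application's state from the result (the status code, or whether it responds at all)
  2. TCP: try to open a TCP connection to a port on the container
  3. Exec: run a command inside the container and judge the result by the command's exit status code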
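
The three mechanisms listed above can be sketched side by side in one manifest. This is only an illustration under stated assumptions: the Pod name, the nginx image, the port, and the probed file are hypothetical and not part of the Jenkins example. A container can carry only one probe of each kind, so the three handler types are spread across the liveness, readiness, and startup probes here.

```yaml
# Hypothetical Pod used only to show the three probe handler types.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo            # assumed name, not from the original post
spec:
  containers:
    - name: app
      image: nginx:stable     # assumed image; any HTTP server on port 80 would do
      ports:
        - containerPort: 80
      livenessProbe:          # 1. HTTP GET: a 2xx or 3xx status code counts as success
        httpGet:
          path: /
          port: 80
      readinessProbe:         # 2. TCP: success if the port accepts a connection
        tcpSocket:
          port: 80
      startupProbe:           # 3. exec: success if the command exits with status 0
        exec:
          command: ["cat", "/etc/nginx/nginx.conf"]
```

Any of the three handlers can be used with any probe kind; the pairing above is arbitrary.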

https://kubernetes.io/zh-cn/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

This post uses the official Jenkins YAML manifest as the running example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: devops-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins-server
  template:
    metadata:
      labels:
        app: jenkins-server
    spec:
      securityContext:
        fsGroup: 1000 #supplementary group 1000
        runAsUser: 1000 #all processes in the container run as user ID 1000
      serviceAccountName: jenkins-admin #assign the service account created earlier to this application
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          resources: #resource requests and limits
            limits:
              memory: "2Gi"
              cpu: "1000m"
            requests:
              memory: "500Mi"
              cpu: "500m"
          ports:
            - name: httpport
              containerPort: 8080
            - name: jnlpport
              containerPort: 50000
          livenessProbe: #liveness probe
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 90
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe: #readiness probe
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: jenkins-data
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-data
          persistentVolumeClaim:
              claimName: jenkins-pv-claim

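Liveness probe

The liveness probe periodically checks whether the application inside the container is still alive; if it is dead, Kubernetes tries to restart the container.
Restarts triggered by a probe are handled by the kubelet on the node; the control plane on the master is not involved.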

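          livenessProbe: #liveness probe
            httpGet: #use an HTTP GET request
              path: "/login" #path
              port: 8080     #port
            initialDelaySeconds: 90 #wait 90s after the container starts before the first probe
            periodSeconds: 10 #probe every 10 seconds
            timeoutSeconds: 5 #a probe that gets no response within 5s counts as failed
            failureThreshold: 5 #number of consecutive failures before the container is restarted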
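
To see how these fields interact, the same fragment is repeated below with the rough timing worked out in comments. The numbers are approximations that follow from the field values rather than measurements, and `successThreshold` is shown only to make its default explicit.

```yaml
# Fragment of a container spec; timing notes are approximate.
livenessProbe:
  httpGet:
    path: "/login"
    port: 8080
  initialDelaySeconds: 90   # no probe runs during the first ~90s after the container starts
  periodSeconds: 10         # afterwards, one probe roughly every 10s
  timeoutSeconds: 5         # each probe may wait up to 5s for a response
  failureThreshold: 5       # 5 consecutive failures x 10s period = about 50s until restart
  successThreshold: 1       # default; a single success resets the failure count
```

So an application that stops answering on /login is restarted roughly 50 seconds after the first failed probe, not immediately.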

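Let's test this manually by killing the Jenkins process inside the container.
Note that killing the Java process here most likely makes tini (PID 1) exit as well, so the container itself terminates and the kubelet restarts it right away under the Pod's restart policy, without waiting for the liveness probe to fail five times. Either way, the restart is carried out by the kubelet.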

[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS   AGE
jenkins-559d8cd85c-qg9zx   1/1     Running   0          15d

[root@master ~]# kubectl exec -it -n devops-tools jenkins-559d8cd85c-qg9zx -c jenkins -- bash
jenkins@jenkins-559d8cd85c-qg9zx:/$ ls -l /proc/*/exe
lrwxrwxrwx 1 jenkins jenkins 0 Jan 13 08:51 /proc/1/exe -> /sbin/tini
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 03:06 /proc/3465/exe -> /bin/bash
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 05:08 /proc/3515/exe -> /bin/bash
lrwxrwxrwx 1 jenkins jenkins 0 Jan 13 08:51 /proc/7/exe -> /opt/java/openjdk/bin/java
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 05:08 /proc/self/exe -> /bin/ls

jenkins@jenkins-559d8cd85c-qg9zx:/$ kill -9 /opt/java/openjdk/bin/java
bash: kill: /opt/java/openjdk/bin/java: arguments must be process or job IDs
jenkins@jenkins-559d8cd85c-qg9zx:/$ kill -9 7
jenkins@jenkins-559d8cd85c-qg9zx:/$ command terminated with exit code 137
[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS      AGE
jenkins-559d8cd85c-qg9zx   0/1     Running   1 (19s ago)   15d

# wait a while

[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS       AGE
jenkins-559d8cd85c-qg9zx   1/1     Running   1 (119s ago)   15d

[root@master ~]# kubectl describe pod jenkins-559d8cd85c-qg9zx -n devops-tools
Name:         jenkins-559d8cd85c-qg9zx
Namespace:    devops-tools
Priority:     0
Node:         worker-node01/172.30.254.88
Start Time:   Fri, 13 Jan 2023 16:51:36 +0800
Labels:       app=jenkins-server
              pod-template-hash=559d8cd85c
Annotations:  <none>
Status:       Running
IP:           
IPs:
  IP:           
Controlled By:  ReplicaSet/jenkins-559d8cd85c
Containers:
  jenkins:
    Container ID:   docker://d7342bb100c43d330bfb84b042b4a13317d6fe5ec82c6c44c90dd90502d1dacf
    Image:          jenkins/jenkins:lts
    Image ID:       docker-pullable://jenkins/jenkins@sha256:c1d02293a08ba69483992f541935f7639fb10c6c322785bdabaf7fa94cd5e732
    Ports:          8080/TCP, 50000/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Sun, 29 Jan 2023 13:12:38 +0800
    Last State:     Terminated #terminated
      Reason:       Error
      Exit Code:    137        #137 = 128 + x, where x is the signal number; here x is 9 (SIGKILL), meaning the process was forcibly killed
      Started:      Fri, 13 Jan 2023 16:51:37 +0800
      Finished:     Sun, 29 Jan 2023 13:12:37 +0800
    Ready:          True
    Restart Count:  1

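Readiness probe

The readiness probe periodically checks the container; when the check succeeds, the container is considered ready to accept requests.
What "ready" means for a given program is up to the programmer who designed it; Kubernetes simply pokes the configured path in the container and looks at the returned status.
When starting a container, you can set an initial delay; Kubernetes runs the first probe only after that delay has passed.
Unlike the liveness probe, a failed readiness probe does not restart the container. Instead, a Pod whose readiness probe fails is taken out of the Service endpoints and stops receiving requests. This behavior also suits services that take a long time to become ready: if such a service received traffic right after being deployed, before it had finished starting, clients would get a pile of errors.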

          readinessProbe: #readiness probe
            httpGet:
              path: "/login"
              port: 8080
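            initialDelaySeconds: 60 #initial delay before the first probe
            periodSeconds: 10 #interval between probes
            timeoutSeconds: 5 #probe timeout
            failureThreshold: 3 #consecutive failures before the Pod is marked not ready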
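
The readiness result matters once a Service sits in front of the Pod. Below is a minimal sketch of such a Service, assuming the name `jenkins-service` (hypothetical, not from the manifest above); the selector and port are taken from the Deployment.

```yaml
# Sketch of a Service in front of the Jenkins Pod.
apiVersion: v1
kind: Service
metadata:
  name: jenkins-service        # assumed name
  namespace: devops-tools
spec:
  selector:
    app: jenkins-server        # matches the Pod label set in the Deployment
  ports:
    - name: http
      port: 8080
      targetPort: 8080         # the same port the readiness probe checks
```

Only Pods that match the selector and currently report Ready are listed as endpoints of this Service, so traffic starts flowing only after the /login probe has succeeded.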
