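k8s probes
k8s probes support three main check mechanisms:
- An HTTP GET probe sends a GET request to the container's IP address, port, and path, and judges the application's state from the result (the status code, or whether it responds at all). A status code of at least 200 and below 400 counts as success.
- A TCP socket probe tries to open a TCP connection to a given port of the container; the probe succeeds if the connection can be established.
- An exec probe runs a command inside the container and judges the result by the command's exit status code (0 means success).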
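To make the three mechanisms concrete, here is a minimal Python sketch of what each check boils down to. It is only an illustration, not how the kubelet is implemented; the function names (`http_get_check`, `tcp_check`, `exec_check`), the default 5-second timeout, and the local addresses in the demo are all assumptions made for this example.

```python
import http.client
import socket
import subprocess
import sys
import urllib.request
from typing import Sequence


def http_get_check(url: str, timeout: float = 5.0) -> bool:
    """HTTP GET check: success means a status code of at least 200 and below 400."""
    try:
        # urllib follows redirects itself and raises HTTPError (an OSError) for 4xx/5xx.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, error status or malformed reply: check failed.
        return False


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """TCP check: success means the connection could be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def exec_check(command: Sequence[str], timeout: float = 5.0) -> bool:
    """Exec check: success means the command exited with status 0 in time."""
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        # Command not found, not executable, or still running after the timeout.
        return False


if __name__ == "__main__":
    # Hypothetical local targets; adjust to whatever is actually running.
    print("http:", http_get_check("http://127.0.0.1:8080/login"))
    print("tcp: ", tcp_check("127.0.0.1", 8080))
    print("exec:", exec_check([sys.executable, "-c", "raise SystemExit(0)"]))
```

In a real cluster the kubelet performs these checks itself according to the probe definition in the pod spec; the sketch only mirrors the success criteria described in the list above.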
This post uses the official Jenkins YAML as the running example:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: devops-tools
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins-server
  template:
    metadata:
      labels:
        app: jenkins-server
    spec:
      securityContext:
        fsGroup: 1000 # supplementary group 1000
        runAsUser: 1000 # all processes in the container run as UID 1000
      serviceAccountName: jenkins-admin # use the service account created earlier for this application
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          resources: # resource limits and requests
            limits:
              memory: "2Gi"
              cpu: "1000m"
            requests:
              memory: "500Mi"
              cpu: "500m"
          ports:
            - name: httpport
              containerPort: 8080
            - name: jnlpport
              containerPort: 50000
          livenessProbe: # liveness probe
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 90
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe: # readiness probe
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: jenkins-data
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-data
          persistentVolumeClaim:
            claimName: jenkins-pv-claim
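Liveness probe
A liveness probe periodically checks whether the application inside the container is still alive; if it has died, k8s tries to restart the container.
Restarts caused by a probe are handled by the kubelet on the node where the pod runs; the control plane on the master does not take part.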
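livenessProbe: # liveness probe
  httpGet: # use an HTTP GET request
    path: "/login" # path
    port: 8080 # port
  initialDelaySeconds: 90 # wait 90s after the container starts before the first probe
  periodSeconds: 10 # probe every 10 seconds
  timeoutSeconds: 5 # a probe with no response within 5s counts as failed
  failureThreshold: 5 # restart the container after 5 consecutive failed probes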
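How these fields interact is easier to see with a small simulation. The Python sketch below is an illustration under simplifying assumptions, not the kubelet's actual code: probes are assumed to fire exactly at `initialDelaySeconds` and then every `periodSeconds` (the real kubelet adds some jitter), and the helper names `probe_times` and `failing_probe_index` are made up for this example. It shows that only consecutive failures count toward `failureThreshold`, and roughly when a container that never becomes healthy would be restarted.

```python
from typing import List, Optional, Sequence


def probe_times(initial_delay: int, period: int, count: int) -> List[int]:
    """Idealized start times (seconds after container start) of the first `count` probes."""
    return [initial_delay + i * period for i in range(count)]


def failing_probe_index(results: Sequence[bool], failure_threshold: int) -> Optional[int]:
    """Index of the probe whose failure reaches the threshold, or None if never reached.

    A success resets the counter, so only consecutive failures add up.
    """
    consecutive = 0
    for index, ok in enumerate(results):
        consecutive = 0 if ok else consecutive + 1
        if consecutive >= failure_threshold:
            return index
    return None


if __name__ == "__main__":
    # Values taken from the livenessProbe definition.
    initial_delay, period, timeout, threshold = 90, 10, 5, 5

    # Case 1: the application never answers, so every probe fails.
    always_failing = [False] * 10
    idx = failing_probe_index(always_failing, threshold)
    times = probe_times(initial_delay, period, len(always_failing))
    # The fifth probe starts at 130s and may need up to `timeout` seconds to fail.
    print("restart triggered at roughly", times[idx] + timeout, "seconds")

    # Case 2: failures interrupted by a success never reach the threshold.
    flaky = [False, False, False, False, True, False, False, False, False, True]
    print("flaky result:", failing_probe_index(flaky, threshold))
```

With the values from this probe, the first case works out to about 135 seconds (90 + 4 × 10 + 5), while the flaky sequence never triggers a restart because each success resets the count.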
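Let's test this by hand by killing the Java process inside the container. Strictly speaking, killing the main process makes the container exit on its own, so the restart shown here most likely comes from the pod's restart policy rather than from a failed probe; either way, it is the kubelet that restarts the container.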
[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS   AGE
jenkins-559d8cd85c-qg9zx   1/1     Running   0          15d
[root@master ~]# kubectl exec -it -n devops-tools jenkins-559d8cd85c-qg9zx -c jenkins -- bash
jenkins@jenkins-559d8cd85c-qg9zx:/$ ls -l /proc/*/exe
lrwxrwxrwx 1 jenkins jenkins 0 Jan 13 08:51 /proc/1/exe -> /sbin/tini
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 03:06 /proc/3465/exe -> /bin/bash
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 05:08 /proc/3515/exe -> /bin/bash
lrwxrwxrwx 1 jenkins jenkins 0 Jan 13 08:51 /proc/7/exe -> /opt/java/openjdk/bin/java
lrwxrwxrwx 1 jenkins jenkins 0 Jan 29 05:08 /proc/self/exe -> /bin/ls
jenkins@jenkins-559d8cd85c-qg9zx:/$ kill -9 /opt/java/openjdk/bin/java
bash: kill: /opt/java/openjdk/bin/java: arguments must be process or job IDs
jenkins@jenkins-559d8cd85c-qg9zx:/$ kill -9 7
jenkins@jenkins-559d8cd85c-qg9zx:/$ command terminated with exit code 137
[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS      AGE
jenkins-559d8cd85c-qg9zx   0/1     Running   1 (19s ago)   15d
# wait a moment
[root@master ~]# kubectl get pods -n devops-tools
NAME                       READY   STATUS    RESTARTS       AGE
jenkins-559d8cd85c-qg9zx   1/1     Running   1 (119s ago)   15d
[root@master ~]# kubectl describe pod jenkins-559d8cd85c-qg9zx -n devops-tools
Name:         jenkins-559d8cd85c-qg9zx
Namespace:    devops-tools
Priority:     0
Node:         worker-node01/172.30.254.88
Start Time:   Fri, 13 Jan 2023 16:51:36 +0800
Labels:       app=jenkins-server
              pod-template-hash=559d8cd85c
Annotations:  <none>
Status:       Running
IP:
IPs:
  IP:
Controlled By:  ReplicaSet/jenkins-559d8cd85c
Containers:
  jenkins:
    Container ID:   docker://d7342bb100c43d330bfb84b042b4a13317d6fe5ec82c6c44c90dd90502d1dacf
    Image:          jenkins/jenkins:lts
    Image ID:       docker-pullable://jenkins/jenkins@sha256:c1d02293a08ba69483992f541935f7639fb10c6c322785bdabaf7fa94cd5e732
    Ports:          8080/TCP, 50000/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Sun, 29 Jan 2023 13:12:38 +0800
    Last State:     Terminated   # the previous container instance was terminated
      Reason:       Error
      Exit Code:    137   # 137 = 128 + x, where x is the signal number; here x is 9 (SIGKILL), meaning the process was forcibly killed
      Started:      Fri, 13 Jan 2023 16:51:37 +0800
      Finished:     Sun, 29 Jan 2023 13:12:37 +0800
    Ready:          True
    Restart Count:  1
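The arithmetic behind exit code 137 can be checked with a few lines of Python. This is a small sketch assuming a POSIX system (signal numbers differ or are missing elsewhere); the helper name `signal_from_exit_code` is made up for this example.

```python
import signal
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[str]:
    """Return the signal name for an exit code above 128, or None otherwise."""
    if exit_code <= 128:
        return None  # The process exited on its own with this status.
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None  # Not a signal number known on this platform.


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", signal_from_exit_code(code))
```

On Linux, 137 maps to SIGKILL (128 + 9), which matches the `kill -9 7` issued in the container.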
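Readiness probe
A readiness probe periodically checks the container; when the probe succeeds, the container is considered ready to accept requests.
What "ready" means for a given program is up to the programmer who designed it; k8s simply pokes the configured path inside the container and looks at the returned status.
When a container starts, you can set a wait time, and k8s runs the first probe only after that time has passed.
Unlike a liveness probe, a failed readiness probe does not restart the container; instead, a pod whose readiness probe fails stops receiving requests. This makes it a good fit for services that take a long time to become ready: if such a service received traffic right after being deployed, before it had finished preparing, clients would get a pile of errors.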
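readinessProbe: # readiness probe
  httpGet:
    path: "/login"
    port: 8080
  initialDelaySeconds: 60 # wait 60s after the container starts before the first probe
  periodSeconds: 10 # interval between probes
  timeoutSeconds: 5 # probe timeout
  failureThreshold: 3 # mark the pod as not ready after 3 consecutive failures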
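The effect of readiness on traffic can be sketched in a few lines of Python. This is a toy model with made-up names (`Pod`, `routable_pods`, and the pod names in the demo), not the real Service machinery: it only shows that a pod failing its readiness probe keeps running but is left out of the set of backends that receive requests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    name: str
    running: bool  # the container is up (liveness is a separate concern)
    ready: bool    # latest result of the readiness probe


def routable_pods(pods: List[Pod]) -> List[str]:
    """Names of the pods that should receive requests: running and ready."""
    return [pod.name for pod in pods if pod.running and pod.ready]


if __name__ == "__main__":
    pods = [
        Pod("jenkins-a", running=True, ready=True),
        Pod("jenkins-b", running=True, ready=False),  # still starting up
    ]
    print(routable_pods(pods))
```

Here the second pod is not restarted; it simply receives no traffic until its readiness probe starts succeeding.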