Friday, August 7, 2020

Checklist While Troubleshooting Workload Errors in Kubernetes

 The following checklist helps when troubleshooting workload or application errors in Kubernetes:

1- First, check how many nodes are in the cluster

2- Check which namespaces are present

3- Identify the namespace that contains the faulty application

4- Now check which deployment the faulty app belongs to

5- Now check which replicaset (if any) is part of that deployment

6- Then check which pods are part of that replicaset

7- Then check which services exist in that namespace

8- Then check which service corresponds to the deployment that runs our faulty application

9- Then make sure the label selector in the deployment matches the labels in its pod template

10- Then ensure the label selector in the service matches the labels on the pods of that deployment

11- Then check that any service name referenced in a deployment is correct. For example, if a webserver pod refers to a database host in the env section of its pod template, that host must be the service name of the database.

12- Then check that the ports are correct in ClusterIP or NodePort services

13- Check that the status of the pod is Running

14- Check the logs of the pods and their containers
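
Steps 9, 10 and 12 are the ones that are easiest to get wrong by eye, so here is a minimal Python sketch of what those checks amount to. It works on plain dictionaries shaped like the deployment and service manifests; the function names (selector_matches, check_workload) and the sample manifests are made up for illustration and are not part of any Kubernetes tool. It only handles equality-based selectors (matchLabels), and the port check is a heuristic, because a container can listen on a port that it does not declare.

```python
def selector_matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def check_workload(deployment, service):
    """Return a list of problems found between a deployment and its service."""
    problems = []
    spec = deployment["spec"]
    template = spec["template"]
    pod_labels = template["metadata"].get("labels", {})

    # Step 9: deployment selector vs. pod template labels
    dep_selector = spec["selector"].get("matchLabels", {})
    if not selector_matches(dep_selector, pod_labels):
        problems.append("deployment selector does not match pod template labels")

    # Step 10: service selector vs. pod template labels
    svc_selector = service["spec"].get("selector", {})
    if not svc_selector or not selector_matches(svc_selector, pod_labels):
        problems.append("service selector does not match pod template labels")

    # Step 12: every service targetPort should be a declared container port.
    # Heuristic only: declaring containerPort is optional in Kubernetes.
    declared = set()
    for container in template["spec"]["containers"]:
        for port in container.get("ports", []):
            declared.add(port["containerPort"])
            if "name" in port:
                declared.add(port["name"])  # targetPort may use the port name
    for port in service["spec"].get("ports", []):
        target = port.get("targetPort", port["port"])  # defaults to port
        if target not in declared:
            problems.append(f"service targetPort {target} is not a declared container port")

    return problems


# Hypothetical manifests with two deliberate mistakes in the service
deployment = {
    "spec": {
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web", "tier": "frontend"}},
            "spec": {"containers": [{"name": "web", "ports": [{"containerPort": 80}]}]},
        },
    }
}
service = {
    "spec": {
        "selector": {"app": "webserver"},           # wrong label value
        "ports": [{"port": 80, "targetPort": 8080}],  # wrong target port
    }
}

problems = check_workload(deployment, service)
for problem in problems:
    print(problem)
```

In practice the two dictionaries would come from the actual manifests, for example by loading the JSON output of the deployment and the service, instead of being typed in by hand as above.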

I hope that helps and feel free to add any step or thought in the comments. Thanks.
