
Starting to Learn Kubernetes a Step Behind - 05. workloads Part 1 -

Story

  1. Starting to Learn Kubernetes a Step Behind - 01. Choosing the Environment -
  2. Starting to Learn Kubernetes a Step Behind - 02. Docker For Mac -
  3. Starting to Learn Kubernetes a Step Behind - 03. Raspberry Pi -
  4. Starting to Learn Kubernetes a Step Behind - 04. kubectl -
  5. Starting to Learn Kubernetes a Step Behind - 05. workloads Part 1 -
  6. Starting to Learn Kubernetes a Step Behind - 06. workloads Part 2 -
  7. Starting to Learn Kubernetes a Step Behind - 07. workloads Part 3 -
  8. Starting to Learn Kubernetes a Step Behind - 08. discovery&LB Part 1 -
  9. Starting to Learn Kubernetes a Step Behind - 09. discovery&LB Part 2 -
  10. Starting to Learn Kubernetes a Step Behind - 10. config&storage Part 1 -
  11. Starting to Learn Kubernetes a Step Behind - 11. config&storage Part 2 -
  12. Starting to Learn Kubernetes a Step Behind - 12. Resource Limitations -
  13. Starting to Learn Kubernetes a Step Behind - 13. Health Checks and Container Lifecycle -
  14. Starting to Learn Kubernetes a Step Behind - 14. Scheduling -
  15. Starting to Learn Kubernetes a Step Behind - 15. Security -
  16. Starting to Learn Kubernetes a Step Behind - 16. Components -

Last Time

In Starting to Learn Kubernetes a Step Behind - 04. kubectl -, we learned about kubectl, the CLI for Kubernetes. This time, we will learn about the Workloads resources, one of the main resource categories.

Workloads

In Kubernetes, resources are divided into the following categories. This time, we will look at the Workloads resources.

  • Workloads resources: resources related to running containers
  • Discovery & LB resources: resources that provide endpoints to expose containers externally
  • Config & Storage resources: resources related to configuration, secrets, persistent volumes, and so on
  • Cluster resources: resources related to security and quotas
  • Metadata resources: resources for operating other resources

Kubernetes Workloads Resources (Part 1)

There are 8 types of Workloads.

  • Pod
  • ReplicationController
  • ReplicaSet
  • Deployment
  • DaemonSet
  • StatefulSet
  • Job
  • CronJob

We will look at Pod, ReplicationController, ReplicaSet, and Deployment.

Pod

The Pod is the smallest deployable unit and holds one or more containers. Each Pod is assigned its own IP address, which all of its containers share, and containers in the same Pod can also share volumes. The point is not to cram containers into a single Pod; for microservices, "split it out if you can" seems to be the sensible policy. Let's get one running.

alias k=kubectl

# sample-2pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: sample-2pod
spec:
  containers:
    - name: nginx-container
      image: nginx:1.12
    - name: redis-container
      image: redis:3.2
pi@raspi001:~/tmp $ k apply -f . --prune --all
pod/sample-2pod created
pi@raspi001:~/tmp $ k get pod sample-2pod
NAME          READY   STATUS    RESTARTS   AGE
sample-2pod   2/2     Running   0          101s

Both containers are running, as expected (READY 2/2). What happens if we go inside with exec?

pi@raspi001:~/tmp $ k exec -it sample-2pod /bin/sh
Defaulting container name to nginx-container.
Use 'kubectl describe pod/sample-2pod -n default' to see all of the containers in this pod.
#

I see, it seems to enter the default container (the first one in spec.containers). To enter the redis-container,

pi@raspi001:~/tmp $ k exec -it sample-2pod -c redis-container /bin/sh
# redis-cli
127.0.0.1:6379> exit
#

So you just need to specify the container with -c. There is more to say about Pods, but this is getting long, so I'll stop here.
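
One last aside before moving on (a quick check of my own, not from the book): since the containers share the Pod's network namespace, the whole Pod has exactly one IP address, which you can read straight from its status.

# Print the single IP assigned to the Pod; both containers share it
k get pod sample-2pod -o jsonpath='{.status.podIP}'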

ReplicaSet, ReplicationController

As the name suggests, these are resources that keep a specified number of replicated Pods running. ReplicaSet is the successor to ReplicationController (the name was changed for historical reasons), so ReplicaSet is the one to use.

Let's get it running.

# sample-rs.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: sample-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.12
        - name: redis-container
          image: redis:3.2
pi@raspi001:~/tmp $ k apply -f . --prune --all
replicaset.apps/sample-rs created
pod/sample-2pod unchanged
pi@raspi001:~/tmp $ k get pods
NAME              READY   STATUS              RESTARTS   AGE
sample-2pod       2/2     Running             0          20m
sample-rs-ghkcc   2/2     Running             0          103s
sample-rs-nsc5b   0/2     ContainerCreating   0          103s
sample-rs-wk7vl   0/2     ContainerCreating   0          103s

Indeed, the ReplicaSet created three Pods, each with two containers (two of them were still in ContainerCreating at this point). What I noticed while writing this is that the Pod's apiVersion is v1 while the ReplicaSet's is apps/v1. I looked it up and found an article called What to write in Kubernetes apiVersion: Pods belong to the core API group, so plain v1 is enough, while ReplicaSet belongs to the apps group, hence apps/v1.

Let's try self-healing, one of Kubernetes' key features.

pi@raspi001:~/tmp $ k get pods
NAME              READY   STATUS    RESTARTS   AGE
sample-2pod       2/2     Running   0          29m
sample-rs-ghkcc   2/2     Running   0          11m
sample-rs-nsc5b   2/2     Running   0          11m
sample-rs-wk7vl   2/2     Running   0          11m
pi@raspi001:~/tmp $ k delete pod sample-rs-wk7vl
pod "sample-rs-wk7vl" deleted
pi@raspi001:~/tmp $ k get pods
NAME              READY   STATUS              RESTARTS   AGE
sample-2pod       2/2     Running             0          30m
sample-rs-ghkcc   2/2     Running             0          11m
sample-rs-gq2hs   0/2     ContainerCreating   0          13s
sample-rs-nsc5b   2/2     Running             0          11m

Oh, a replacement Pod is already in ContainerCreating; the ReplicaSet has brought the count back to three. By the way, I was wondering what happens if the node itself fails and goes down. Let's try it.

pi@raspi001:~/tmp $ k get pods -o=wide
NAME              READY   STATUS    RESTARTS   AGE    IP            NODE       NOMINATED NODE   READINESS GATES
sample-2pod       2/2     Running   0          32m    10.244.1.25   raspi002   <none>           <none>
sample-rs-ghkcc   2/2     Running   0          13m    10.244.1.26   raspi002   <none>           <none>
sample-rs-gq2hs   2/2     Running   0          114s   10.244.1.27   raspi002   <none>           <none>
sample-rs-nsc5b   2/2     Running   0          13m    10.244.2.15   raspi003   <none>           <none>

Let's turn off the power to raspi003.

Move to worker(raspi003)

~ $ slogin pi@raspi003.local
pi@raspi003.local's password:
pi@raspi003:~ $ sudo shutdown now
sudo: unable to resolve host raspi003
Connection to raspi003.local closed by remote host.
Connection to raspi003.local closed.
~ $

Move to master(raspi001)

pi@raspi001:~/tmp $ k get nodes
NAME       STATUS     ROLES    AGE     VERSION
raspi001   Ready      master   5d16h   v1.14.1
raspi002   Ready      worker   5d16h   v1.14.1
raspi003   NotReady   worker   4d21h   v1.14.1
pi@raspi001:~/tmp $ k get pods -o=wide
NAME              READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
sample-2pod       2/2     Running   0          35m     10.244.1.25   raspi002   <none>           <none>
sample-rs-ghkcc   2/2     Running   0          17m     10.244.1.26   raspi002   <none>           <none>
sample-rs-gq2hs   2/2     Running   0          5m38s   10.244.1.27   raspi002   <none>           <none>
sample-rs-nsc5b   2/2     Running   0          17m     10.244.2.15   raspi003   <none>           <none>

Huh? The Pod on raspi003 still shows as Running even though the node is NotReady. A little while later...

pi@raspi001:~/kubernetes-perfect-guide/samples/chapter05/tmp $ k get pods -o=wide
NAME              READY   STATUS        RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
sample-2pod       2/2     Running       0          40m   10.244.1.25   raspi002   <none>           <none>
sample-rs-ghkcc   2/2     Running       0          22m   10.244.1.26   raspi002   <none>           <none>
sample-rs-gq2hs   2/2     Running       0          10m   10.244.1.27   raspi002   <none>           <none>
sample-rs-nsc5b   2/2     Terminating   0          22m   10.244.2.15   raspi003   <none>           <none>
sample-rs-p2jsc   2/2     Running       0          53s   10.244.1.28   raspi002   <none>           <none>

Oh, as expected, the Pod that was on raspi003 was replaced by a new one on raspi002. sample-rs-nsc5b stays stuck in Terminating because the node is down, so its deletion cannot complete while the node is unreachable.

It did take a while before anything happened, though.

According to the article What does Kubernetes do when there is a failure in the cluster?, the kube-controller-manager detects the node failure and the missing Pods are then created and scheduled again to restore the desired count. Was I kept waiting because of the detection interval?

Looking at the kube-controller-manager options, --attach-detach-reconcile-sync-period (default 1m0s) is about volume attach/detach reconciliation, so it is probably not the one that matters here. The settings that seem relevant are --node-monitor-grace-period (default 40s), after which the node is marked NotReady, and --pod-eviction-timeout (default 5m0s), after which Pods on the unreachable node are evicted, which would explain the wait.
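
On a kubeadm-built cluster like this one, the controller manager runs as a static Pod in kube-system, so you can check which flags were actually set (a check of my own; the Pod name below is assumed from the usual kube-controller-manager-<node> naming, so confirm it with the first command, and flags left at their defaults simply won't appear in the output):

# List the controller-manager Pod to confirm its name
k -n kube-system get pods -o name | grep kube-controller-manager
# Grep its spec for the eviction-related flags (Pod name assumed)
k -n kube-system get pod kube-controller-manager-raspi001 -o yaml | grep -E 'node-monitor|pod-eviction'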

I want to prevent Pods from running on a specific Node

Can such a request be fulfilled?

According to Assigning Pods to Nodes, you can pin Pods to particular nodes with the nodeSelector field (so you express it as "run here" rather than "do not run there"). However, according to Editing nodeSelector doesn't rearrange pods in ReplicaSet, this should really be done on a Deployment, not directly on a ReplicaSet. I'll try it with a ReplicaSet anyway, just to see what happens.

First, power raspi003, which I shut down earlier, back on. Then, move to master (raspi001).

pi@raspi001:~/tmp $ k label nodes raspi002 type=AWS
node/raspi002 labeled
pi@raspi001:~/tmp $ k label nodes raspi003 type=GCP
node/raspi003 labeled
pi@raspi001:~/tmp $ k get nodes -L type
NAME       STATUS   ROLES    AGE     VERSION   TYPE
raspi001   Ready    master   5d17h   v1.14.1
raspi002   Ready    worker   5d17h   v1.14.1   AWS
raspi003   Ready    worker   4d21h   v1.14.1   GCP
pi@raspi001:~/tmp $ k get pods -o=wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
sample-2pod       2/2     Running   0          75m   10.244.1.25   raspi002   <none>           <none>
sample-rs-ghkcc   2/2     Running   0          56m   10.244.1.26   raspi002   <none>           <none>
sample-rs-gq2hs   2/2     Running   0          44m   10.244.1.27   raspi002   <none>           <none>
sample-rs-p2jsc   2/2     Running   0          35m   10.244.1.28   raspi002   <none>           <none>

I labeled the nodes to make them easier to target with nodeSelector. Since all the sample-rs Pods are running on raspi002, I'll try the following.

  1. Configure sample-rs to run only on raspi002
  2. Shut down raspi002

My expectation: sample-rs will not be able to self-heal, because raspi002, the only node it is allowed to run on, is down.

# sample-rs.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: sample-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.12
        - name: redis-container
          image: redis:3.2
      nodeSelector:
        type: AWS
pi@raspi001:~/tmp $ k apply -f . --prune --all
replicaset.apps/sample-rs configured
pod/sample-2pod unchanged

I added a nodeSelector. For a simple condition like this it is fine, but if you need more flexible rules you would use nodeAffinity; see the sketch below.
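
For reference, here is what roughly the same constraint might look like written with nodeAffinity (a sketch based on the Assigning Pods to Nodes documentation; I did not apply this here):

# Sketch only: the "run only on type=AWS nodes" rule expressed with nodeAffinity
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: type
                    operator: In
                    values:
                      - AWS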

Move to worker(raspi002)

~ $ slogin pi@raspi002.local
pi@raspi002.local's password:
pi@raspi002:~ $ sudo shutdown now
sudo: unable to resolve host raspi002
Connection to raspi002.local closed by remote host.
Connection to raspi002.local closed.
~ $

Wait a few tens of seconds... The result is...!

Move to master(raspi001)

pi@raspi001:~/tmp $ k get nodes -L type
NAME       STATUS     ROLES    AGE     VERSION   TYPE
raspi001   Ready      master   5d17h   v1.14.1
raspi002   NotReady   worker   5d17h   v1.14.1   AWS
raspi003   Ready      worker   4d22h   v1.14.1   GCP
pi@raspi001:~/tmp $ k get pods -o=wide
NAME              READY   STATUS        RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
sample-2pod       2/2     Terminating   0          89m   10.244.1.25   raspi002   <none>           <none>
sample-rs-4srpp   0/2     Pending       0          36s   <none>        <none>     <none>           <none>
sample-rs-6mgcr   0/2     Pending       0          37s   <none>        <none>     <none>           <none>
sample-rs-ghkcc   2/2     Terminating   0          71m   10.244.1.26   raspi002   <none>           <none>
sample-rs-gq2hs   2/2     Terminating   0          59m   10.244.1.27   raspi002   <none>           <none>
sample-rs-lc225   0/2     Pending       0          36s   <none>        <none>     <none>           <none>
sample-rs-p2jsc   2/2     Terminating   0          49m   10.244.1.28   raspi002   <none>           <none>

It worked as expected. Since the sample-rs Pods are only allowed to run on raspi002, the old Pods are stuck in Terminating and their replacements are stuck in Pending with nowhere to be scheduled. The plain Pod, sample-2pod, is not managed by a ReplicaSet, so it does not self-heal at all; it just sits in Terminating. Interesting, isn't it?
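
If you are curious why the new Pods stay Pending, describing one of them should show the reason in the Events section (I have not pasted the output here):

# The Events at the bottom should include a FailedScheduling message saying
# that no available node matches the node selector (type=AWS)
k describe pod sample-rs-4srpp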

Deployment

A Deployment manages ReplicaSets. It adds features that ReplicaSet lacks, such as rolling updates and rollbacks. In practice, Deployment is the recommended resource to use, rather than bare Pods or ReplicaSets.

Before that, what actually happens if you update the container image in a ReplicaSet? Will the running Pods all be updated, or only some of them? Let's try it. (raspi002 has since been powered back on, so the sample-rs Pods are Running again.)

I updated the nginx image in sample-2pod-replica.yaml (the ReplicaSet manifest) from 1.12 to 1.13.

pi@raspi001:~/tmp $ k get all
NAME                  READY   STATUS    RESTARTS   AGE
pod/sample-rs-4srpp   2/2     Running   0          7h14m
pod/sample-rs-6mgcr   2/2     Running   0          7h14m
pod/sample-rs-lc225   2/2     Running   0          7h14m

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6d

NAME                        DESIRED   CURRENT   READY   AGE
replicaset.apps/sample-rs   3         3         3       8h
pi@raspi001:~/tmp $ k apply -f . --prune --all
replicaset.apps/sample-rs configured
pod/sample-2pod created
pi@raspi001:~/tmp $ k describe replicaset sample-rs
Name:         sample-rs
...
  Containers:
   nginx-container:
    Image:        nginx:1.13
...

The ReplicaSet's manifest has been updated; its template now says nginx:1.13.

pi@raspi001:~/tmp $ k describe pod sample-rs-4srpp
Name:               sample-rs-4srpp
...
  nginx-container:
    Container ID:   docker://9160f550ee9d9bbcd1a5c990ca95389b2b39aff6688bcd933c99fe93b1968b99
    Image:          nginx:1.12
...

The running Pod is unchanged and still uses nginx:1.12. A ReplicaSet only ensures that the desired number of Pods exists; it does not roll an updated template out to Pods that are already running.
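
A quick way to confirm this across all three Pods (a check of my own, not from the book):

# List the nginx image each sample-rs Pod is actually running; they should all
# still report nginx:1.12 even though the ReplicaSet template says nginx:1.13
k get pods -l app=sample-app -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'

Now, let's use Deployment.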

# sample-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.12
          ports:
            - containerPort: 80
pi@raspi001:~/tmp $ k apply -f . --prune --all --record
replicaset.apps/sample-rs configured
pod/sample-2pod configured
deployment.apps/sample-deployment created

Adding --record stores the command that triggered each change in the rollout history, which is useful for rollbacks.

pi@raspi001:~/tmp $ k get all
NAME                                    READY   STATUS    RESTARTS   AGE
pod/sample-2pod                         2/2     Running   0          12m
pod/sample-deployment-6cd85bd5f-4whgn   1/1     Running   0          119s
pod/sample-deployment-6cd85bd5f-js2sw   1/1     Running   0          119s
pod/sample-deployment-6cd85bd5f-mjt77   1/1     Running   0          119s
pod/sample-rs-4srpp                     2/2     Running   0          7h28m
pod/sample-rs-6mgcr                     2/2     Running   0          7h28m
pod/sample-rs-lc225                     2/2     Running   0          7h28m

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6d1h

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sample-deployment   3/3     3            3           2m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/sample-deployment-6cd85bd5f   3         3         3       2m
replicaset.apps/sample-rs                     3         3         3       8h

Applying sample-deployment.yaml created the Deployment, which created a ReplicaSet, which in turn created the Pods.

Now, let's update the nginx container of sample-deployment from 1.12 to 1.13.

# sample-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
        - name: nginx-container
          image: nginx:1.13
          ports:
            - containerPort: 80
pi@raspi001:~/tmp $ k apply -f . --prune --all --record
replicaset.apps/sample-rs unchanged
pod/sample-2pod unchanged
deployment.apps/sample-deployment configured
pi@raspi001:~/tmp $ k get pod
NAME                                 READY   STATUS              RESTARTS   AGE
sample-2pod                          2/2     Running             0          15m
sample-deployment-6cd85bd5f-js2sw    1/1     Running             0          4m53s
sample-deployment-6cd85bd5f-mjt77    1/1     Running             0          4m53s
sample-deployment-7dfb996c6b-gh2cg   0/1     ContainerCreating   0          21s
sample-deployment-7dfb996c6b-m4wrd   1/1     Running             0          38s
sample-rs-4srpp                      2/2     Running             0          7h31m
sample-rs-6mgcr                      2/2     Running             0          7h31m
sample-rs-lc225                      2/2     Running             0          7h31m

Oh, the Deployment's Pods are being replaced one by one; this is a rolling update. A rolling update is triggered whenever spec.template changes. Rollbacks can be done with the rollout command, including rolling back to a specific revision, but in practice you should usually just revert the manifest and apply it again.
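
For completeness, these are the rollout commands I mean (I did not actually roll back here):

# Show the recorded revisions; --record fills in the CHANGE-CAUSE column
k rollout history deployment sample-deployment
# Roll back to the previous revision, or to a specific one
k rollout undo deployment sample-deployment
k rollout undo deployment sample-deployment --to-revision=1
# Watch the progress of an ongoing rollout
k rollout status deployment sample-deployment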

Deployments also have an update strategy, and the default is RollingUpdate. It controls how many Pods may be missing or extra during the update, via maxUnavailable and maxSurge. The other strategy is Recreate, which deletes all Pods and then recreates them, so the service is temporarily unavailable.
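
The knobs live under spec.strategy in the Deployment manifest. A sketch of what tuning them might look like (example values only; this is not part of the sample-deployment.yaml used above):

# Example update strategy settings
spec:
  strategy:
    type: RollingUpdate      # or Recreate: delete everything, then recreate
    rollingUpdate:
      maxUnavailable: 1      # at most one Pod below the desired replica count
      maxSurge: 1            # at most one extra Pod above the desired count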

One thing I wondered about: if I update a frontend from version 1 to version 2, is it okay for a user who was served by a version 1 container to suddenly hit a version 2 container mid-session? But that problem is not specific to rolling updates, so I decided not to worry about it here; it is something the application has to be designed for anyway.

By the way, you can also create a Deployment without writing a manifest: k run sample-deployment-cli --image nginx:1.12 --replicas 3 --port 80. It's handy for quick experiments.

Cleanup

I tried deleting with prune.

pi@raspi001:~/tmp $ ls
sample-2pod-replica.yaml  sample-2pod.yaml  sample-deployment.yaml
pi@raspi001:~/tmp $ mv sample-2pod-replica.yaml sample-2pod-replica.yaml.org
pi@raspi001:~/tmp $ mv sample-deployment.yaml sample-deployment.yaml.org
pi@raspi001:~/tmp $ k apply -f . --all --prune
pod/sample-2pod configured
deployment.apps/sample-deployment pruned
replicaset.apps/sample-rs pruned
pi@raspi001:~/tmp $ k get all
NAME              READY   STATUS    RESTARTS   AGE
pod/sample-2pod   2/2     Running   0          30m

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6d1h

Hmm, I can delete resources this way, but I always have to leave at least one manifest behind: if every file is renamed to .org, k apply -f . has nothing to apply and fails.

pi@raspi001:~/tmp $ k delete pod sample-2pod
pod "sample-2pod" deleted

In the end, I just deleted it directly.
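
In hindsight, deleting by manifest would also work and avoids juggling files for --prune (an alternative I did not use this time):

# Delete whatever a specific manifest, or every manifest in the directory, defines
k delete -f sample-2pod.yaml
k delete -f .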

pi@raspi001:~/tmp $ k label node raspi002 type-
pi@raspi001:~/tmp $ k label node raspi003 type-

Conclusion

I ended up digging into ReplicaSet more than I expected. Next time, I'll cover the remaining Workloads resources. The next article is here.


If it was helpful, support me with a ☕!
