Managing Pods with Restart Policy and Probes

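Kubernetes provides two mechanisms called Restart Policy and Probes.
Used properly, they let Kubernetes continuously check whether a container is running as intended and whether it needs to be restarted. Beyond checking, Kubernetes also takes the necessary action based on the result.
Developers only need to declare the desired settings, and Kubernetes then manages the containers autonomously.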

This article explains how to configure Restart Policy and Probes, and how they behave as a result, using concrete examples.

The behavior was verified in the following environment.

Docker Desktop 4.22.1
Kubernetes 1.27.2

Restart Policy

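Restart Policy determines whether a container in a Pod is restarted after it exits. It takes one of the following three values.

Always
- Always restart the container when it exits
OnFailure
- Restart the container only when it exits abnormally (with a non-zero exit code)
Never
- Never restart the container, even when it exits

Note that Pods managed by a Deployment always use Always.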
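The three values can be read as a small decision table. The sketch below is only a model for illustration, not Kubernetes code; the function name shouldRestart is hypothetical, and "abnormal" is taken to mean a non-zero exit code.

```typescript
type RestartPolicy = "Always" | "OnFailure" | "Never";

// Hypothetical model: decides whether a container that has exited
// with the given exit code would be restarted under the policy.
function shouldRestart(policy: RestartPolicy, exitCode: number): boolean {
  switch (policy) {
    case "Always":
      return true; // restart regardless of how the container exited
    case "OnFailure":
      return exitCode !== 0; // restart only after an abnormal exit
    case "Never":
      return false; // leave the container terminated
  }
}
```

The experiments in this article walk through each cell of this table, using exit codes 0, 1, and 137.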

Let's actually stop a container and see what happens.

Here is the sample code.

import http from "http";
import { exit } from "node:process";

http
  .createServer(function ({ url }, res) {
    switch (url) {
      case "/": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Hello World\n");
        break;
      }
      case "/exit-0": {
        res.writeHead(500, { "Content-Type": "text/plain" });
        res.end("Exit by 0\n");
        exit(0);
      }
      case "/exit-1": {
        res.writeHead(500, { "Content-Type": "text/plain" });
        res.end("Exit by 1\n");
        exit(1);
      }
      case "/oom": {
        const hugeArray = [];
        for (let i = 0; ; i++) {
          hugeArray.push(i.toString().repeat(1000000));
        }
      }
      default: {
        res.writeHead(404, { "Content-Type": "text/plain" });
        res.end("Not Found\n");
        break;
      }
    }
  })
  .listen(3000);

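This sample code starts a web server. Sending a request to each of the following paths produces the result shown.

/exit-0
- Returns a response, then exits the process normally
/exit-1
- Returns a response, then exits the process abnormally
/oom
- Triggers an out-of-memory condition (OOM hereafter)

We will use this web server to see what happens to the container.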

Build a container image named sample that runs the code above.
I have written before about how to build container images with Docker.

numb86-tech.hatenablog.com

Next, write the manifest file. Start with the Restart Policy set to Never.

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    app: node-app
spec:
  containers:
  - name: my-container
    image: sample:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 3000
    resources:
      limits:
        memory: 256Mi
  restartPolicy: Never # Restart Policy
---
apiVersion: v1
kind: Service
metadata:
  name: my-ser
spec:
  type: NodePort
  ports:
  - name: my-ser-port
    port: 8099
    targetPort: 3000
    nodePort: 32660
  selector:
    app: node-app

Apply the configuration with the apply command.

$ kubectl apply -f manifestfile.yaml
pod/my-pod created
service/my-ser created

The Pod has now been created with Never.

We are about to watch how the container's state changes, so first check its current state.

The container state can be viewed with $ kubectl get pod POD_NAME -o=jsonpath='{.status.containerStatuses}'.

$ kubectl get pod my-pod -o=jsonpath='{.status.containerStatuses}'
[{"containerID":"docker://0930986742e2e40ab13f09e300745fc1abbb369ea6a0ecf92047acd4a4de9d75","image":"sample:latest","imageID":"docker://sha256:806e7254cc070f24057a0dd4135349d77a5db11b860b4e788969192bf8bf51cc","lastState":{},"name":"my-container","ready":true,"restartCount":0,"started":true,"state":{"running":{"startedAt":"2023-09-20T16:14:30Z"}}}]

That is hard to read, so format it with jq and show only the fields we care about.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-20T16:14:30Z"
    }
  },
  "lastState": {},
  "ready": true,
  "restartCount": 0
}

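state is one of the following three.

Running
- The container is running normally
Terminated
- The container has exited
Waiting
- Neither Running nor Terminated

ready indicates whether the container can handle requests, and restartCount indicates how many times the container has been restarted.

In other words, at this point the container in my-pod is running normally, can handle requests, and has never been restarted.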
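To make the shape of these fields concrete, here is a minimal sketch that models only the fields extracted with jq above and condenses them into one line. The type ContainerStatusView and the helper summarizeStatus are hypothetical names introduced for illustration; the real status object carries more fields.

```typescript
// Only the fields this article looks at; the real status object has more.
interface ContainerStatusView {
  state: {
    running?: { startedAt: string };
    terminated?: { exitCode: number; reason: string };
    waiting?: { reason?: string };
  };
  ready: boolean;
  restartCount: number;
}

// Hypothetical helper: turns one status entry into a one-line summary.
function summarizeStatus(status: ContainerStatusView): string {
  const { state, ready, restartCount } = status;
  let phase = "waiting";
  if (state.running) {
    phase = "running";
  } else if (state.terminated) {
    phase = `terminated (exit ${state.terminated.exitCode}, ${state.terminated.reason})`;
  }
  return `${phase}, ready=${ready}, restarts=${restartCount}`;
}
```

Feeding it the JSON printed by the jq command gives a compact view of the same three facts: the current state, readiness, and the restart count.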

/exit-0

First, see how the state changes when a request is sent to /exit-0.

$ curl localhost:32660/exit-0
Exit by 0

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "terminated": {
      "containerID": "docker://0930986742e2e40ab13f09e300745fc1abbb369ea6a0ecf92047acd4a4de9d75",
      "exitCode": 0,
      "finishedAt": "2023-09-20T16:39:24Z",
      "reason": "Completed",
      "startedAt": "2023-09-20T16:14:30Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

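The container is Terminated with exit code 0, and ready is now false. The reason is Completed.

Because the Restart Policy is Never, the container stays terminated and is not restarted.

To continue the experiment, delete the resources and recreate them. From here on, the Pod is recreated this way each time.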

$ kubectl delete -f manifestfile.yaml
pod "my-pod" deleted
service "my-ser" deleted

$ kubectl apply -f manifestfile.yaml
pod/my-pod created
service/my-ser created

/exit-1

Next is /exit-1.

$ curl localhost:32660/exit-1
Exit by 1

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "terminated": {
      "containerID": "docker://6ad3a4b88de09b4beea968ae31ccea0b38f018b85b5ac9c89837a79646b22283",
      "exitCode": 1,
      "finishedAt": "2023-09-20T16:44:41Z",
      "reason": "Error",
      "startedAt": "2023-09-20T16:43:16Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

The container is terminated (Terminated) as before, but this time the exit code is 1 and the reason is Error.

/oom

Last is /oom.

$ curl localhost:32660/oom
curl: (52) Empty reply from server

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "terminated": {
      "containerID": "docker://a5452537ece6d0042ddcecb62d2270c40938046f320cd47f185a4cd301a95e4a",
      "exitCode": 137,
      "finishedAt": "2023-09-20T16:46:45Z",
      "reason": "OOMKilled",
      "startedAt": "2023-09-20T16:46:38Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

This time the exit code is 137 and the reason is OOMKilled.

With both /exit-1 and /oom, the container is not restarted and stays stopped.
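The exit code 137 follows the common convention that a process killed by signal N is reported with exit code 128 + N. 137 is 128 + 9, and signal 9 is SIGKILL, which is what the OOM killer sends. A minimal sketch of that reading follows; describeExitCode is a hypothetical helper, and treating every code above 128 as a signal is a simplifying assumption.

```typescript
// Hypothetical helper: interprets a container exit code using the
// 128 + signal-number convention for processes killed by a signal.
function describeExitCode(exitCode: number): string {
  if (exitCode === 0) return "normal exit";
  if (exitCode > 128) return `killed by signal ${exitCode - 128}`;
  return `abnormal exit (code ${exitCode})`;
}
```

Applied to the three experiments above, 0 is a normal exit, 1 is an abnormal exit, and 137 is a kill by signal 9.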

Restarts with OnFailure and Always

Next, change the Restart Policy to OnFailure and repeat the same operations.

@@ -14,7 +14,7 @@ spec:
     resources:
       limits:
         memory: 256Mi
-  restartPolicy: Never # Restart Policy
+  restartPolicy: OnFailure # Restart Policy
 ---
 apiVersion: v1
 kind: Service

Now the container is restarted after /exit-1 and /oom.

$ curl localhost:32660/exit-1
Exit by 1

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-20T16:53:36Z"
    }
  },
  "lastState": {
    "terminated": {
      "containerID": "docker://b34991875c102e2903d4ae07f9abe0d45255d7db41378fefd6671f9a1be6644b",
      "exitCode": 1,
      "finishedAt": "2023-09-20T16:53:35Z",
      "reason": "Error",
      "startedAt": "2023-09-20T16:53:26Z"
    }
  },
  "ready": true,
  "restartCount": 1
}

$ curl localhost:32660
Hello World

$ curl localhost:32660/oom
curl: (52) Empty reply from server

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-20T16:55:23Z"
    }
  },
  "lastState": {
    "terminated": {
      "containerID": "docker://99106019e35dfaf4ad2f1ede6653c686de9a082f229f9d24667793b0712e35f2",
      "exitCode": 137,
      "finishedAt": "2023-09-20T16:55:22Z",
      "reason": "OOMKilled",
      "startedAt": "2023-09-20T16:55:16Z"
    }
  },
  "ready": true,
  "restartCount": 1
}

$ curl localhost:32660
Hello World

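The state at termination has moved to lastState, and state is Running again. restartCount has also been incremented.
Because ready is true, requests to localhost:32660 are handled correctly.

With /exit-0, however, the container is not restarted, because exit code 0 means a normal exit.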

$ curl localhost:32660/exit-0
Exit by 0

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "terminated": {
      "containerID": "docker://f2bf56baff1f1a5a9a3ce082612aefbc3ffdcefabfe16f1bc5162a8f213df5ae",
      "exitCode": 0,
      "finishedAt": "2023-09-20T16:56:49Z",
      "reason": "Completed",
      "startedAt": "2023-09-20T16:56:36Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

$ curl localhost:32660
curl: (52) Empty reply from server

Because the container has exited, a request to localhost:32660 naturally gets no response.

With Always, the container is restarted after a normal exit as well as after an abnormal one.

@@ -14,7 +14,7 @@ spec:
     resources:
       limits:
         memory: 256Mi
-  restartPolicy: OnFailure # Restart Policy
+  restartPolicy: Always # Restart Policy
 ---
 apiVersion: v1
 kind: Service

$ curl localhost:32660/exit-0
Exit by 0

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-20T17:06:13Z"
    }
  },
  "lastState": {
    "terminated": {
      "containerID": "docker://196f08053ddaaf4381359258e8606bc60319a90eca64ad42af463f905dd250b2",
      "exitCode": 0,
      "finishedAt": "2023-09-20T17:06:12Z",
      "reason": "Completed",
      "startedAt": "2023-09-20T17:06:10Z"
    }
  },
  "ready": true,
  "restartCount": 1
}

$ curl localhost:32660
Hello World

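If the container is expected to run continuously (that is, it is not expected to finish its job and exit), Always should be the right choice.
That way, Kubernetes restarts the container even if it exits for some reason.

In some situations, though, you also want to restart a container that has not exited. For example, a container that has stopped working properly because of a bug should probably be restarted rather than left as it is.
Also, a restarted container is not necessarily able to accept requests right away. You would not want requests routed to such a container until it is ready.
Similarly, you would not want requests routed to a container that is temporarily unable to respond (for example, because it is loading a huge file), but since it will eventually recover, you do not necessarily want to restart it.

Restart Policy alone can hardly meet these needs, but combining it with Probes solves them.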

Probe

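A Probe is a diagnostic that Kubernetes performs against a container.
Kubernetes runs the diagnostic periodically and, when there is a problem, automatically takes the necessary action.

Several diagnostic methods are available; this article uses the one that sends an HTTP GET request to the container.

There are three kinds of Probe: Liveness Probe, Startup Probe, and Readiness Probe. Each serves a different purpose.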

Liveness Probe

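A Liveness Probe diagnoses whether the container is running properly.
If the diagnosis concludes that the container has not exited but is not running properly, Kubernetes terminates the container.
Note that it only terminates the container. Whether the container is restarted is decided by the Restart Policy.
If the Restart Policy is OnFailure or Always the container is restarted, but with Never it stays terminated.

Probes are also configured by adding them to the manifest file. For the purposes of this experiment, the Restart Policy is set to Never.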

@@ -14,7 +14,13 @@ spec:
     resources:
       limits:
         memory: 256Mi
-  restartPolicy: Always # Restart Policy
+    livenessProbe:
+      httpGet:
+        path: /probe
+        port: 3000
+      periodSeconds: 5
+      failureThreshold: 3
+  restartPolicy: Never # Restart Policy
 ---
 apiVersion: v1
 kind: Service

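httpGet is a diagnostic method that sends an HTTP GET request to the container on the specified port and path, and treats the container as running properly if the response status code is 200 or greater and less than 400.
Here the request goes to the /probe path.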
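The success criterion for an httpGet Probe is just a range check on the status code. As a minimal sketch (isHttpProbeSuccess is a hypothetical name, not part of any Kubernetes API):

```typescript
// Hypothetical helper: an HTTP probe counts as successful when the
// status code is at least 200 and below 400.
function isHttpProbeSuccess(statusCode: number): boolean {
  return statusCode >= 200 && statusCode < 400;
}
```

Under this rule a redirect such as 302 still counts as success, while the 500 used later in this article to simulate a broken container counts as failure.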

periodSeconds specifies how often the Probe runs, in seconds. In this example, a request is sent to the container every 5 seconds.

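failureThreshold is the number of consecutive failures tolerated before the Probe is considered to have failed. Here it is 3, so after 3 consecutive Probe failures the container is considered not to be running properly and is terminated.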
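The key point is that failures must be consecutive: a single success resets the count. The following sketch models that counting under simplifying assumptions (one boolean per Probe run, no timeouts); firstKillIndex is a hypothetical name.

```typescript
// Hypothetical model: returns the index of the probe run at which the
// container would be terminated, or -1 if the threshold is never reached.
function firstKillIndex(results: boolean[], failureThreshold: number): number {
  let consecutiveFailures = 0;
  for (let i = 0; i < results.length; i++) {
    // A success resets the streak; a failure extends it.
    consecutiveFailures = results[i] ? 0 : consecutiveFailures + 1;
    if (consecutiveFailures >= failureThreshold) return i;
  }
  return -1;
}
```

With a threshold of 3, two failures followed by a success never terminate the container, whereas three failures in a row do.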

There are other settings as well; see the official documentation for details, including the default value and constraints of each field.
Configure Liveness, Readiness and Startup Probes | Kubernetes

The container runs the following code.

import http from "http";

const startTime = performance.now();
let isEnable = true;

http
  .createServer(function ({ url }, res) {
    switch (url) {
      case "/": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Hello World\n");
        break;
      }
      case "/probe": {
        if (isEnable) {
          res.writeHead(200, { "Content-Type": "text/plain" });
          res.end("Success\n");
          console.log(
            `Probe is success. ${Math.floor(
              (performance.now() - startTime) / 1000
            )} seconds have passed since the process started.`
          );
        } else {
          res.writeHead(500, { "Content-Type": "text/plain" });
          res.end("Failure\n");
          console.log(
            `Probe is failure. ${Math.floor(
              (performance.now() - startTime) / 1000
            )} seconds have passed since the process started.`
          );
        }
        break;
      }
      case "/enable": {
        isEnable = true;
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Enable probe path\n");
        console.log("Enabled");
        break;
      }
      case "/disable": {
        isEnable = false;
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Disable probe path\n");
        console.log("Disabled");
        break;
      }
      default: {
        res.writeHead(404, { "Content-Type": "text/plain" });
        res.end("Not Found\n");
        break;
      }
    }
  })
  .listen(3000);

A /probe path has been added to receive requests from the Probe.
Initially /probe returns 200, but after a request to /disable it returns 500. After a request to /enable, /probe returns 200 again.

Build the image for this code as sample, then apply.

$ kubectl apply -f manifestfile.yaml
pod/my-pod created
service/my-ser created

The Liveness Probe should already be running, so check the logs. Pod logs can be viewed with $ kubectl logs POD_NAME.

$ kubectl logs my-pod
yarn run v1.22.19
$ ts-node-dev index.ts
[INFO] 05:02:34 ts-node-dev ver. 2.0.0 (using ts-node ver. 10.9.1, typescript ver. 5.2.2)
Probe is success. 3 seconds have passed since the process started.
Probe is success. 8 seconds have passed since the process started.
Probe is success. 13 seconds have passed since the process started.
Probe is success. 18 seconds have passed since the process started.

Requests to /probe arrive every 5 seconds.

Send a request to /disable, then a few seconds later send one to /enable.

$ curl localhost:32660/disable
Disable probe path

$ curl localhost:32660/enable
Enable probe path

Check the logs again.

$ kubectl logs my-pod
yarn run v1.22.19
$ ts-node-dev index.ts
[INFO] 05:02:34 ts-node-dev ver. 2.0.0 (using ts-node ver. 10.9.1, typescript ver. 5.2.2)
Probe is success. 3 seconds have passed since the process started.
Probe is success. 8 seconds have passed since the process started.
Probe is success. 13 seconds have passed since the process started.
Probe is success. 18 seconds have passed since the process started.
Disabled
Probe is failure. 23 seconds have passed since the process started.
Probe is failure. 28 seconds have passed since the process started.
Enabled
Probe is success. 33 seconds have passed since the process started.
Probe is success. 38 seconds have passed since the process started.

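The Probe failed twice in a row, but the third attempt succeeded, so the container keeps running without being terminated. The Probe naturally continues to run afterwards.

Send a request to /disable once more, and this time leave it that way.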

$ curl localhost:32660/disable
Disable probe path

$ kubectl logs my-pod
yarn run v1.22.19
$ ts-node-dev index.ts
[INFO] 05:02:34 ts-node-dev ver. 2.0.0 (using ts-node ver. 10.9.1, typescript ver. 5.2.2)
Probe is success. 3 seconds have passed since the process started.
Probe is success. 8 seconds have passed since the process started.
Probe is success. 13 seconds have passed since the process started.
Probe is success. 18 seconds have passed since the process started.
Disabled
Probe is failure. 23 seconds have passed since the process started.
Probe is failure. 28 seconds have passed since the process started.
Enabled
Probe is success. 33 seconds have passed since the process started.
Probe is success. 38 seconds have passed since the process started.
Probe is success. 43 seconds have passed since the process started.
Disabled
Probe is failure. 48 seconds have passed since the process started.
Probe is failure. 53 seconds have passed since the process started.
Probe is failure. 58 seconds have passed since the process started.

The Probe failed 3 times in a row, so Kubernetes terminated the container.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "terminated": {
      "containerID": "docker://5d798fd0a665c1b2dbd008f55cfb1cd132a1f82b1f5d9c6cc7e862eb40c6a8f7",
      "exitCode": 1,
      "finishedAt": "2023-09-23T05:03:34Z",
      "reason": "Error",
      "startedAt": "2023-09-23T05:02:34Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

The exit code is 1, so as described earlier, the container would be restarted if the Restart Policy were Always or OnFailure.

Startup Probe

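A Liveness Probe lets you check whether a container is running properly.
But what about a container whose initialization takes a long time, so that it cannot respond to the Liveness Probe for a while?
In the earlier example the Liveness Probe ran every 5 seconds. If initialization takes 30 to 60 seconds, the Liveness Probe is bound to fail and the container is terminated. Even if it is restarted, the Liveness Probe again runs before the container is ready, fails, and the container is terminated once more.
The start of the Liveness Probe can be delayed by configuration, but by how many seconds? A generous 90 seconds would avoid the terminate-and-restart loop. Yet the container is sometimes ready in about 30 seconds, in which case you want the Liveness Probe to start as soon as the container is ready.

Combining a Startup Probe with the Liveness Probe solves this problem.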

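A Startup Probe diagnoses whether the container has finished starting up properly.
As with a Liveness Probe, the container is terminated when the Probe fails consecutively the number of times given by failureThreshold. In this setup the exit code is 1, so the container is restarted if the Restart Policy is Always or OnFailure.

Unlike a Liveness Probe, a Startup Probe is no longer run once it has succeeded.
More importantly, until the Startup Probe succeeds, the other Probes (the Liveness Probe and the Readiness Probe described later) are not run.
This makes it possible to diagnose the container with the Startup Probe right after it starts, and then to diagnose it continuously with the Liveness Probe once the Startup Probe has succeeded.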
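The gating rule can be written down as a tiny state check. This is only an illustrative model of the behavior described above, with hypothetical names (ProbeKind, probesToRun), not how the kubelet is implemented.

```typescript
type ProbeKind = "startup" | "liveness" | "readiness";

// Hypothetical model: which kinds of probe are eligible to run,
// given whether a startup probe is configured and has already succeeded.
function probesToRun(hasStartupProbe: boolean, startupSucceeded: boolean): ProbeKind[] {
  if (hasStartupProbe && !startupSucceeded) {
    return ["startup"]; // everything else waits for the startup probe
  }
  return ["liveness", "readiness"]; // the startup probe never runs again
}
```

Whether the liveness and readiness entries actually run still depends on whether those Probes are configured for the container.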

Add a Startup Probe to the earlier manifest file and try it.

@@ -20,6 +20,12 @@ spec:
         port: 3000
       periodSeconds: 5
       failureThreshold: 3
+    startupProbe:
+      httpGet:
+        path: /probe
+        port: 3000
+      periodSeconds: 15
+      failureThreshold: 6
   restartPolicy: Never # Restart Policy
 ---
 apiVersion: v1

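The Probe runs every 15 seconds and terminates the container after 6 consecutive failures.
That gives the container 90 seconds from the start of startup. If the Startup Probe has not succeeded by then, Kubernetes considers that the container could not finish starting up and terminates it.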
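The 90 seconds is simply periodSeconds multiplied by failureThreshold. A minimal sketch of that arithmetic follows; startupBudgetSeconds is a hypothetical helper, and the result is an approximation that ignores per-request timeouts and scheduling jitter.

```typescript
// Hypothetical helper: approximate time a container is given to start
// before the startup probe gives up. initialDelaySeconds defaults to 0.
function startupBudgetSeconds(
  periodSeconds: number,
  failureThreshold: number,
  initialDelaySeconds: number = 0
): number {
  return initialDelaySeconds + periodSeconds * failureThreshold;
}
```

For the manifest above, startupBudgetSeconds(15, 6) evaluates to 90. A shorter period with a larger threshold keeps the same budget while letting the Liveness Probe take over sooner after the container becomes ready.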

Once the Startup Probe succeeds, it is no longer run, and the Liveness Probe starts running.
The Liveness Probe settings are unchanged, so it runs continuously every 5 seconds as before.

The code run in the container is as follows.

import http from "http";

const startTime = performance.now();

let isEnable = false;
setTimeout(() => {
  isEnable = true;
}, 30 * 1000);

http
  .createServer(function ({ url }, res) {
    if (!isEnable) {
      res.writeHead(500, { "Content-Type": "text/plain" });
      res.end("Failure\n");
      console.log(
        `Probe is failure. ${Math.floor(
          (performance.now() - startTime) / 1000
        )} seconds have passed since the process started.`
      );
      return;
    }

    switch (url) {
      case "/": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Hello World\n");
        break;
      }
      case "/probe": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Success\n");
        console.log(
          `Probe is success. ${Math.floor(
            (performance.now() - startTime) / 1000
          )} seconds have passed since the process started.`
        );
        break;
      }
      default: {
        res.writeHead(404, { "Content-Type": "text/plain" });
        res.end("Not Found\n");
        break;
      }
    }
  })
  .listen(3000);

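Initially, the server returns status code 500 for every request.
After 30 seconds, it returns status code 200 for requests to / and /probe.

Build the container image for this code and apply.

Checking the container status shows that it is Running but ready is false. That is, the container is running but is treated as unable to accept requests.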

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-23T07:01:42Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

When a container is not Ready, the Pod that contains it is removed from the Service's Endpoints. Requests are therefore no longer routed to this Pod.

The details of the Endpoints can be checked with $ kubectl describe endpoints SERVICE_NAME, so take a look.

$ kubectl describe endpoints my-ser
Name:         my-ser
Namespace:    default
Labels:       <none>
Annotations:  <none>
Subsets:
  Addresses:          <none>
  NotReadyAddresses:  10.1.1.1
  Ports:
    Name         Port  Protocol
    ----         ----  --------
    my-ser-port  3000  TCP

Events:  <none>

The IP address of my-pod (10.1.1.1) is listed under NotReadyAddresses.
There are no other Pods in this example, so my-ser has no Pod to route to.

$ kubectl get endpoints my-ser
NAME     ENDPOINTS   AGE
my-ser               11s

As a result, a request from outside the cluster gets no response (not even the 500 error).

$ curl localhost:32660
curl: (52) Empty reply from server

Endpoints are covered in the following article.

numb86-tech.hatenablog.com

The Probe should succeed once 30 seconds have passed, so check again.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-23T07:01:42Z"
    }
  },
  "lastState": {},
  "ready": true,
  "restartCount": 0
}

$ kubectl describe endpoints my-ser
Name:         my-ser
Namespace:    default
Labels:       <none>
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2023-09-23T07:02:27Z
Subsets:
  Addresses:          10.1.1.1
  NotReadyAddresses:  <none>
  Ports:
    Name         Port  Protocol
    ----         ----  --------
    my-ser-port  3000  TCP

Events:  <none>

$ kubectl get endpoints my-ser
NAME     ENDPOINTS       AGE
my-ser   10.1.1.1:3000   58s

The container is Ready and has been added to the Endpoints.

The Pod can now handle requests from outside the cluster.

$ curl localhost:32660
Hello World

Check the Pod logs.

$ kubectl logs my-pod
yarn run v1.22.19
$ ts-node-dev index.ts
[INFO] 07:01:43 ts-node-dev ver. 2.0.0 (using ts-node ver. 10.9.1, typescript ver. 5.2.2)
Probe is failure. 13 seconds have passed since the process started.
Probe is failure. 28 seconds have passed since the process started.
Probe is success. 43 seconds have passed since the process started.
Probe is success. 48 seconds have passed since the process started.
Probe is success. 53 seconds have passed since the process started.
Probe is success. 58 seconds have passed since the process started.
Probe is success. 63 seconds have passed since the process started.
Probe is success. 68 seconds have passed since the process started.

The first three Probes come from the Startup Probe, which runs every 15 seconds.
The third one succeeds, so the Startup Probe stops and the Liveness Probe starts running.
The fourth and later Probes are Liveness Probes, and you can see that they run every 5 seconds.

Readiness Probe

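With the Startup Probe and the Liveness Probe, we can now diagnose whether the container has finished starting up properly and whether it is running properly.
However, even a properly running container may temporarily become unable to respond to requests, for example under heavy load.
The Readiness Probe detects this state, in which you do not want to restart the container but it cannot accept requests.

When a Readiness Probe fails consecutively the number of times given by failureThreshold, the Pod that contains the container is removed from the Service's Endpoints.
The Readiness Probe keeps running periodically after that, and once it succeeds the Pod is added back to the Endpoints and requests are routed to it again.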
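The effect on the Endpoints can be pictured as splitting the selected Pods into two lists by readiness. The sketch below is a simplified model with hypothetical names (PodView, partitionEndpoints); the field names addresses and notReadyAddresses mirror the labels shown by kubectl describe endpoints earlier in this article.

```typescript
interface PodView {
  ip: string;
  ready: boolean;
}

// Hypothetical model: ready Pods become routable addresses,
// the rest are tracked separately and receive no traffic.
function partitionEndpoints(pods: PodView[]): { addresses: string[]; notReadyAddresses: string[] } {
  return {
    addresses: pods.filter((pod) => pod.ready).map((pod) => pod.ip),
    notReadyAddresses: pods.filter((pod) => !pod.ready).map((pod) => pod.ip),
  };
}
```

A Pod moves between the two lists as its Readiness Probe fails and recovers, without the container ever being restarted.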

Below is an example Readiness Probe configuration. This time the Liveness Probe and Startup Probe are removed, the diagnostic runs every second, and a single failure stops routing.

@@ -14,18 +14,12 @@ spec:
     resources:
       limits:
         memory: 256Mi
-    livenessProbe:
+    readinessProbe:
       httpGet:
         path: /probe
         port: 3000
-      periodSeconds: 5
-      failureThreshold: 3
-    startupProbe:
-      httpGet:
-        path: /probe
-        port: 3000
-      periodSeconds: 15
-      failureThreshold: 6
+      periodSeconds: 1
+      failureThreshold: 1
   restartPolicy: Never # Restart Policy
 ---
 apiVersion: v1

The container runs the following code. A request to /heavy blocks processing for 10 seconds, during which no request can be answered.

import http from "http";

const startTime = performance.now();

function sleep(ms: number) {
  const startTime = performance.now();
  while (performance.now() - startTime < ms);
}

http
  .createServer(function ({ url }, res) {
    switch (url) {
      case "/": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Hello World\n");
        break;
      }
      case "/probe": {
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Success\n");
        console.log(
          `Probe is success. ${Math.floor(
            (performance.now() - startTime) / 1000
          )} seconds have passed since the process started.`
        );
        break;
      }
      case "/heavy": {
        console.log(
          `Start heavy process. ${Math.floor(
            (performance.now() - startTime) / 1000
          )} seconds have passed since the process started.`
        );
        sleep(10 * 1000);
        res.writeHead(200, { "Content-Type": "text/plain" });
        res.end("Heavy path\n");
        console.log(
          `Finished heavy process. ${Math.floor(
            (performance.now() - startTime) / 1000
          )} seconds have passed since the process started.`
        );
        break;
      }
      default: {
        res.writeHead(404, { "Content-Type": "text/plain" });
        res.end("Not Found\n");
        break;
      }
    }
  })
  .listen(3000);

Build the container image and apply, as before.

The status shows that the container is Running and Ready. It is working without problems.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-23T10:00:32Z"
    }
  },
  "lastState": {},
  "ready": true,
  "restartCount": 0
}

$ kubectl get endpoints my-ser
NAME     ENDPOINTS       AGE
my-ser   10.1.1.7:3000   6s

Send a request to /heavy to put the container in a state where it cannot respond to requests.

$ curl localhost:32660/heavy
^C

After interrupting it immediately with Ctrl + c, check the state of the container and the Endpoints again.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-23T10:00:32Z"
    }
  },
  "lastState": {},
  "ready": false,
  "restartCount": 0
}

$ kubectl get endpoints my-ser
NAME     ENDPOINTS   AGE
my-ser               20s

The container is Running but considered not Ready, and the Pod's IP address has been removed from the Endpoints.

After 10 seconds the /heavy processing finishes and the container can accept requests again.

$ kubectl get pod my-pod -o=jsonpath='{.status}' | jq '.containerStatuses[] | {state, lastState, ready, restartCount}'
{
  "state": {
    "running": {
      "startedAt": "2023-09-23T10:00:32Z"
    }
  },
  "lastState": {},
  "ready": true,
  "restartCount": 0
}

$ kubectl get endpoints my-ser
NAME     ENDPOINTS       AGE
my-ser   10.1.1.7:3000   28s

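The Pod logs show that the Probe runs every second.
Once the /heavy processing starts, subsequent requests (from the Probe) pile up unprocessed, and after 10 seconds the accumulated requests are processed all at once.
Those 10 seconds are the period during which the container was not Ready.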

$ kubectl logs my-pod
yarn run v1.22.19
$ ts-node-dev index.ts
[INFO] 10:00:33 ts-node-dev ver. 2.0.0 (using ts-node ver. 10.9.1, typescript ver. 5.2.2)
Probe is success. 0 seconds have passed since the process started.
Probe is success. 0 seconds have passed since the process started.
Probe is success. 1 seconds have passed since the process started.
Probe is success. 2 seconds have passed since the process started.
Probe is success. 3 seconds have passed since the process started.
Probe is success. 4 seconds have passed since the process started.
Probe is success. 5 seconds have passed since the process started.
Probe is success. 6 seconds have passed since the process started.
Probe is success. 7 seconds have passed since the process started.
Probe is success. 8 seconds have passed since the process started.
Start heavy process. 9 seconds have passed since the process started.
Finished heavy process. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 19 seconds have passed since the process started.
Probe is success. 20 seconds have passed since the process started.
Probe is success. 21 seconds have passed since the process started.
Probe is success. 22 seconds have passed since the process started.
Probe is success. 23 seconds have passed since the process started.

Configuring the Readiness Probe properly helps keep the system running stably.

As an example, consider running the same code in two Pods.

First, the case without a Readiness Probe.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-dep
spec:
  selector:
    matchLabels:
      app: node-app
  replicas: 2
  template:
    metadata:
      name: my-pod
      labels:
        app: node-app
    spec:
      containers:
      - name: my-container
        image: sample:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
        resources:
          limits:
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: my-ser
spec:
  type: NodePort
  ports:
  - name: my-ser-port
    port: 8099
    targetPort: 3000
    nodePort: 32660
  selector:
    app: node-app

While the cluster is running with this configuration, run the following shell script, named req.sh.

#!/bin/zsh

# Send a request to localhost:32660/heavy, but move on without waiting for the result
curl -s localhost:32660/heavy > /dev/null 2>&1 &

# Wait 2 seconds
sleep 2

# Send 200 requests
for i in {1..200}; do
    echo $i
done | xargs -n 1 -P 10 -I {} sh -c 'curl -s --max-time 0.1 localhost:32660 > /dev/null 2>&1 && echo success || echo failure' >> results.txt

# Count successes and failures
success_count=$(grep -c "success" results.txt)
failure_count=$(grep -c "failure" results.txt)

# Print the results (labels: successes, then failures)
echo "成功した回数: $success_count"
echo "失敗した回数: $failure_count"

# Delete the results.txt file
rm results.txt

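The script sends a request to /heavy and, 2 seconds later, sends 200 requests to /. Requests to / time out after 0.1 seconds.
Finally, it prints how many of the 200 requests succeeded (成功した回数) and how many failed (失敗した回数).

The result looks like this.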

$ ./req.sh
成功した回数: 102
失敗した回数: 98

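The numbers vary somewhat, but the result is roughly like this.
There are two Pods, but one of them cannot respond within 0.1 seconds because of the /heavy processing, so about half of the requests fail.

Configuring a Readiness Probe solves this problem.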

@@ -22,6 +22,12 @@ spec:
         resources:
           limits:
             memory: 256Mi
+        readinessProbe:
+          httpGet:
+            path: /probe
+            port: 3000
+          periodSeconds: 1
+          failureThreshold: 1
 ---
 apiVersion: v1
 kind: Service

Running the same shell script again, every request now succeeds.

$ ./req.sh
成功した回数: 200
失敗した回数: 0

The Readiness Probe fails for the Pod that accepted the /heavy request, so requests are no longer routed to it.
As a result, all requests are routed to the other Pod, which can respond immediately.
That is why every request succeeds.
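The difference between the two runs can be reproduced with a small simulation. This is a rough sketch under stated assumptions: requests are spread round-robin across backends (real Service load balancing is not strictly round-robin, which is why the measured numbers above vary), and the names Backend and countFailures are hypothetical.

```typescript
interface Backend {
  name: string;
  responsive: boolean; // can it answer within the client timeout?
  ready: boolean; // does its Readiness Probe currently pass?
}

// Hypothetical model: counts failed requests when traffic is spread
// round-robin over all backends, or only over ready ones.
function countFailures(backends: Backend[], requests: number, honorReadiness: boolean): number {
  const targets = honorReadiness ? backends.filter((b) => b.ready) : backends;
  if (targets.length === 0) return requests; // nothing to route to
  let failures = 0;
  for (let i = 0; i < requests; i++) {
    const target = targets[i % targets.length];
    if (target !== undefined && !target.responsive) failures++;
  }
  return failures;
}

// One Pod is busy with /heavy, the other is idle.
const pods: Backend[] = [
  { name: "busy", responsive: false, ready: false },
  { name: "idle", responsive: true, ready: true },
];
console.log(countFailures(pods, 200, false)); // no Readiness Probe: traffic hits both Pods
console.log(countFailures(pods, 200, true)); // with Readiness Probe: traffic hits only the idle Pod
```

In this model half of the 200 requests fail without the readiness filter and none fail with it, which matches the shape of the measured results.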