I think I’ve found a reason why the local rack is so unreliable for me. The convox api service seems to become unhealthy and get restarted a lot:
$ kubectl describe pod/api-767f6d7b9-8j84k -n convox
Name:               api-767f6d7b9-8j84k
Namespace:          convox
Priority:           0
PriorityClassName:  <none>
Node:               docker-desktop/192.168.65.3
Start Time:         Thu, 31 Jan 2019 10:22:33 +0100
Labels:             app=system
                    pod-template-hash=767f6d7b9
                    rack=convox
                    service=api
                    system=convox
Annotations:        scheduler.alpha.kubernetes.io/critical-pod:
Status:             Running
IP:                 10.1.0.6
Controlled By:      ReplicaSet/api-767f6d7b9
Containers:
  main:
    Container ID:   docker://70cc356c01b8cb6900cbdb22ab559e3831dfebc09248fbbf4083fad14e3b0751
    Image:          convox/rack:20190130162938
    Image ID:       docker-pullable://convox/rack@sha256:98ea788786614da93cd39efea2750040593513066fcff267d07a1854ca590a00
    Port:           5443/TCP
    Host Port:      0/TCP
    Args:
      rack
    State:          Running
      Started:      Thu, 31 Jan 2019 14:25:47 +0100
    Last State:     Terminated
      Reason:       Error
      Exit Code:    143
      Started:      Thu, 31 Jan 2019 14:21:24 +0100
      Finished:     Thu, 31 Jan 2019 14:25:44 +0100
    Ready:          True
    Restart Count:  15
    Liveness:       http-get https://:5443/check delay=5s timeout=3s period=5s #success=1 #failure=3
    Readiness:      http-get https://:5443/check delay=0s timeout=3s period=5s #success=1 #failure=3
    Environment Variables from:
      env-api  ConfigMap  Optional: false
    Environment:
      DATA:         /data
      DEVELOPMENT:  false
      ID:           5e63eec1-eedc-42e7-8aac-57fd68a47965
      IMAGE:        convox/rack:20190130162938
      RACK:         convox
      VERSION:      20190130162938
    Mounts:
      /data from data (rw)
      /var/run/docker.sock from docker (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from rack-token-8xgpl (ro)
Conditions:
  Type             Status
  Initialized      True
  Ready            True
  ContainersReady  True
  PodScheduled     True
Volumes:
  data:
    Type:          HostPath (bare host directory volume)
    Path:          /var/rack/convox
    HostPathType:  DirectoryOrCreate
  docker:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/docker.sock
    HostPathType:
  rack-token-8xgpl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  rack-token-8xgpl
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                     Message
  ----     ------     ----                   ----                     -------
  Normal   Pulled     7m17s (x16 over 4h6m)  kubelet, docker-desktop  Container image "convox/rack:20190130162938" already present on machine
  Normal   Killing    7m17s (x11 over 3h5m)  kubelet, docker-desktop  Killing container with id docker://main:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Created    7m16s (x15 over 4h6m)  kubelet, docker-desktop  Created container
  Normal   Started    7m15s (x15 over 4h6m)  kubelet, docker-desktop  Started container
  Warning  Unhealthy  7m7s (x27 over 4h5m)   kubelet, docker-desktop  Liveness probe failed: Get https://10.1.0.6:5443/check: dial tcp 10.1.0.6:5443: connect: connection refused
  Warning  Unhealthy  7m6s (x36 over 4h5m)   kubelet, docker-desktop  Readiness probe failed: Get https://10.1.0.6:5443/check: dial tcp 10.1.0.6:5443: connect: connection refused
  Warning  Unhealthy  3m5s (x24 over 3h38m)  kubelet, docker-desktop  Liveness probe failed: Get https://10.1.0.6:5443/check: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  3m3s (x30 over 3h38m)  kubelet, docker-desktop  Readiness probe failed: Get https://10.1.0.6:5443/check: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
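One thing that stands out above: Exit Code 143 is 128 + 15, i.e. SIGTERM, so the container isn't crashing on its own; the kubelet is killing it after three failed liveness probes. With timeout=3s and period=5s, the api only needs to be unresponsive for roughly 15 seconds before it gets killed. As an experiment I'm tempted to loosen the liveness probe with something like the patch below (the numbers are guesses, not recommendations, and the rack controller may well revert manual edits to its own deployment):

# Relax the liveness probe on the api container (merged by container name).
$ kubectl patch deployment/api -n convox --patch '
spec:
  template:
    spec:
      containers:
        - name: main
          livenessProbe:
            initialDelaySeconds: 30
            timeoutSeconds: 10
            periodSeconds: 10
            failureThreshold: 5
'

If the pod stops flapping with the relaxed probe, that would point at /check being slow under load rather than the api being truly dead.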
Next up is figuring out why the /check endpoint stops responding in the first place. The logs I get with kubectl logs deployments/api -n convox don't give much information; the output just ends abruptly and starts again without any errors, which fits the exit code 143 above: the process isn't crashing on its own, it's being killed from outside.
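One thing that might help: kubectl logs without flags only shows the current container, so the final moments of the killed instance are cut off. The --previous flag (against the pod directly, using the name from the describe output above) should show what the last instance printed before it died, and --timestamps makes it possible to line that output up against the probe-failure events:

# Logs from the previously terminated container, with timestamps.
$ kubectl logs pod/api-767f6d7b9-8j84k -n convox --previous --timestamps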
Would it help to scale the api out to multiple pods?
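If it's worth trying, I assume it's just a one-liner, though if the probe timing is the real problem each replica would presumably flap the same way, and at best a second pod keeps the service reachable while the other restarts (the rack might also reset the replica count on its own):

$ kubectl scale deployment/api -n convox --replicas=2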