Question
Wednesday, June 26, 2019 8:14 PM
I ran helm install to install an app whose images are in my Azure Container Registry. One of the pods got stuck in a Pending state, and when I describe it, I see the warning below:
Events:
Type Reason Age From Message
Warning FailedScheduling 24s (x18 over 9m39s) default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
In more detail, these are the events that took place:
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$> kubectl get pods
NAME READY STATUS RESTARTS AGE
cautious-magpie-cmdb-mariadb-0 0/1 Pending 0 6m12s
cautious-magpie-cmdb-post-install-nvxvt 1/1 Running 0 6m12s
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$>
kubectl describe pod cautious-magpie-cmdb-mariadb-0
Name: cautious-magpie-cmdb-mariadb-0
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=cautious-magpie-cmdb
cmdb-dbtype=mariadb
controller-revision-hash=cautious-magpie-cmdb-mariadb-6bc7df696d
csf-component=cmdb
csf-subcomponent=mariadb
heritage=Tiller
release=cautious-magpie
statefulset.kubernetes.io/pod-name=cautious-magpie-cmdb-mariadb-0
type=mariadb
Annotations: <none>
Status: Pending
IP:
Controlled By: StatefulSet/cautious-magpie-cmdb-mariadb
Init Containers:
mariadbinit-user-config:
Image: impactcontainerregistry.azurecr.io/cmdb/mariadb:4.8-2.1315
Port: <none>
Host Port: <none>
Command:
bash
-c
cp /import-cm/mysqld.site /import/
cp /import-users/database_users.json /import/ 2>/dev/null | true
sed -i -e '$a\ $(ls -d /import/* | grep -v db.d) | true
Limits:
cpu: 100m
memory: 64Mi
Requests:
cpu: 100m
memory: 64Mi
Environment: <none>
Mounts:
/import from import (rw)
/import-cm from import-cm (rw)
/import-users from import-users (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vvwbw (ro)
Containers:
mariadb:
Image: impactcontainerregistry.azurecr.io/cmdb/mariadb:4.8-2.1315
Ports: 3306/TCP, 4444/TCP, 4567/TCP, 4568/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Limits:
cpu: 1
memory: 768Mi
Requests:
cpu: 250m
memory: 256Mi
Liveness: exec [bash -c /usr/bin/mariadb_db --verify-access
] delay=300s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [bash -c /usr/bin/mariadb_db --verify-access
] delay=10s timeout=1s period=15s #success=1 #failure=3
Environment:
CLUSTER_TYPE: simplex
CLUSTER_NAME: cautious-magpie
REQUIRE_USERS_JSON: yes
Mounts:
/chart from cluster-cm (rw)
/import from import (rw)
/import/db.d from importdb (rw)
/mariadb from datadir (rw)
/mariadb/backup from backupdir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-vvwbw (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir-cautious-magpie-cmdb-mariadb-0
ReadOnly: false
backupdir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: backupdir-cautious-magpie-cmdb-mariadb-0
ReadOnly: false
cluster-cm:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: cautious-magpie-cmdb-mariadb-cluster
Optional: false
import:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
import-cm:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: cautious-magpie-cmdb-mariadb-config
Optional: false
import-users:
Type: Secret (a volume populated by a Secret)
SecretName: cautious-magpie-cmdb-mariadb-initialusers
Optional: true
importdb:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: cautious-magpie-cmdb-mariadb-databases
Optional: false
default-token-vvwbw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-vvwbw
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
Warning FailedScheduling 35s (x18 over 7m) default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
What does this "FailedScheduling ... 2 node(s) didn't match node selector" warning mean, and how can I get around it?
All replies (5)
Thursday, June 27, 2019 9:51 PM ✅Answered | 1 vote
Update: I got past the error. The problem was that the values.yaml file also had nodeAffinity settings, and the key/value pair it specified also had to be changed:
nodeAffinity:
  enabled: true
  key: beta.kubernetes.io/os
  value: Linux
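For reference, the chart's affinity template quoted later in this thread should render that values block into roughly the following in the StatefulSet's pod spec (a sketch; the exact output depends on the chart, and label values are case-sensitive, so the value has to match the node's actual beta.kubernetes.io/os label):
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: beta.kubernetes.io/os
          operator: In
          values:
          - "Linux"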
Thursday, June 27, 2019 4:54 AM | 1 vote
Hi Vasanth,
It looks like your nodes may not be in the Ready state.
Run "kubectl get nodes" and check whether the nodes are Ready.
If they are not Ready, describe the node and check its events. A common cause is stopped nodes.
If they are Ready, then we need to look elsewhere.
From the pod's description, you don't have a node selector.
I see a StatefulSet in the labels. Are you using a StatefulSet?
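For example, to check node state and to see which labels the nodes actually carry (and therefore what a nodeSelector or nodeAffinity rule could match), something like the following should work; the node name is a placeholder:
kubectl get nodes
kubectl describe node <node-name>
kubectl get nodes --show-labels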
Thursday, June 27, 2019 2:55 PM
The nodes are in ready state when this happens:
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$> kubectl get nodes
NAME STATUS ROLES AGE VERSION
aks-agentpool-30689406-1 Ready agent 8d v1.13.5
aks-agentpool-30689406-2 Ready agent 6d18h v1.13.5
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$>
Someone suggested I should do this:
The nodeSelector is missing "beta.kubernetes.io/os: linux". Please correct the deployment by including this information. To correct the current pod, you can run the following command:
kubectl patch pod <pod name> -p '{"spec":{"template":{"spec":{"nodeSelector":{"beta.kubernetes.io/os": "linux"}}}}}'
I first tried patching the pod that was stuck in pending state by running the above command:
kubectl patch pod exacerbated-dachshund-cmdb-mariadb-0 -p '{"spec":{"template":{"spec":{"nodeSelector":{"beta.kubernetes.io/os": "linux"}}}}}'
pod/exacerbated-dachshund-cmdb-mariadb-0 patched (no change)
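(Note: a Pod object has no spec.template field, which is presumably why the patch reported "(no change)"; spec.template exists on controllers such as Deployments and StatefulSets. If a live object is to be patched at all, the StatefulSet that owns this pod would be the target, roughly like the sketch below, which reuses the label suggested above; the StatefulSet would then recreate the pod:)
kubectl patch statefulset exacerbated-dachshund-cmdb-mariadb -p '{"spec":{"template":{"spec":{"nodeSelector":{"beta.kubernetes.io/os": "linux"}}}}}'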
I then restarted that pod with:
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$> kubectl delete pod exacerbated-dachshund-cmdb-mariadb-0
pod "exacerbated-dachshund-cmdb-mariadb-0" deleted
But the pod stays stuck in pending state:
<vasanth_venkatachalam@Azure:~/clouddrive/.cloudconsole$> kubectl get pods
NAME READY STATUS RESTARTS AGE
exacerbated-dachshund-cmdb-mariadb-0 0/1 Pending 0 10s
exacerbated-dachshund-cmdb-post-install-nnzf5 0/1 Completed 0 18h
And I see the same error in the events log:
3m51s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
58s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
58s Normal SuccessfulCreate StatefulSet create Pod exacerbated-dachshund-cmdb-mariadb-0 in StatefulSet exacerbated-dachshund-cmdb-mariadb successful
0s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
0s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
0s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
0s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
0s Warning FailedScheduling Pod 0/2 nodes are available: 2 node(s) didn't match node selector.
Thursday, June 27, 2019 3:02 PM
Also, yes, I see some stateful.yaml files under the templates/ directories.
I'm not sure where to make the permanent fix this person is mentioning here:
The nodeSelector is missing "beta.kubernetes.io/os: linux". Please correct the deployment by including this information.
Is this supposed to go in the helm charts?
In the values.yaml I see the lines:
## Specifies the type of anti-affinity for scheduling pods to nodes.
## If hard, pods cannot be scheduled together on nodes; if soft,
## best-effort to avoid sharing nodes will be done
nodeAntiAffinity: soft

## Node labels for pod assignment
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeAffinity:
  enabled: false
  key: is_worker
  value: true
And then there are StatefulSet template files under the "templates" directory containing these lines. Am I supposed to modify one of these lines for the permanent fix mentioned above?
affinity:
{{- if .Values.<nameofapp>.nodeAffinity.enabled }}
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: {{ .Values.<nameofapp>.nodeAffinity.key }}
          operator: In
          values:
          - {{ quote .Values.<nameofapp>.nodeAffinity.value }}
{{- end }}
podAffinityTerm:
  labelSelector:
    matchExpressions:
    - key: app
      operator: In
      values:
selector:
  matchLabels:
{{- include "<name of app>.labels" . | indent 6 }}
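One way to see what the chart actually rendered, instead of guessing from the templates, is to pull the StatefulSet back out of the cluster and inspect its pod template; a sketch, using the StatefulSet name from the events above (the grep windows are just illustrative):
kubectl get statefulset exacerbated-dachshund-cmdb-mariadb -o yaml | grep -A 3 nodeSelector
kubectl get statefulset exacerbated-dachshund-cmdb-mariadb -o yaml | grep -A 12 affinity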
Thursday, June 27, 2019 7:18 PM
I made this change to the values.yaml file but I still get the same error when I do a helm install:
I added this line:
nodeSelector: {beta.kubernetes.io/os:linux}
When I describe the pod (kubectl describe) I still see Node-Selectors: <none>, followed by the same error.
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
Warning FailedScheduling 25s (x18 over 6m20s) default-scheduler 0/2 nodes are available: 2 node(s) didn't match node selector.
To be clear, the helm command I ran was:
helm install cmdb -f cmdb-version0.yaml
where cmdb-version0.yaml is my values override file, which I first generated by running helm inspect values <folder> and then modifying the resulting file.
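If the chart's templates never reference a nodeSelector value, adding nodeSelector to the values override won't change the rendered manifests at all, which would explain why the pod still shows Node-Selectors: <none>. A way to confirm what the override actually produces, before installing, is a dry run; a sketch using the chart and values file from this thread (the grep windows are just illustrative):
helm install cmdb -f cmdb-version0.yaml --dry-run --debug | grep -B 2 -A 3 nodeSelector
helm install cmdb -f cmdb-version0.yaml --dry-run --debug | grep -A 12 affinity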