Connect to Azure Kubernetes Service (AKS) cluster nodes for maintenance or troubleshooting
Throughout the lifecycle of your Azure Kubernetes Service (AKS) cluster, you eventually need to directly access an AKS node. This access could be for maintenance, log collection, or troubleshooting operations.
You access a node through authentication, which methods vary depending on your Node OS and method of connection. You securely authenticate against AKS Linux and Windows nodes through two options discussed in this article. One requires that you have Kubernetes API access, and the other is through the AKS ARM API, which provides direct private IP information. For security reasons, AKS nodes aren't exposed to the internet. Instead, to connect directly to any AKS nodes, you need to use either kubectl debug
or the host's private IP address.
Access nodes using the Kubernetes API
This method requires usage of kubectl debug
command.
Before you begin
This guide shows you how to create a connection to an AKS node and update the SSH key of your AKS cluster. To follow along the steps, you need to use Azure CLI that supports version 2.0.64 or later. Run az --version
to check the version. If you need to install or upgrade, see Install Azure CLI.
Complete these steps if you don't have an SSH key. Create an SSH key depending on your Node OS Image, for macOS and Linux, or Windows. Make sure you save the key pair in the OpenSSH format, avoid unsupported formats such as .ppk
. Next, refer to Manage SSH configuration to add the key to your cluster.
Linux and macOS
Linux and macOS users can access their node using kubectl debug
or their private IP Address. Windows users should skip to the Windows Server Proxy section for a workaround to SSH via proxy.
Connect using kubectl debug
To create an interactive shell connection, use the kubectl debug
command to run a privileged container on your node.
To list your nodes, use the
kubectl get nodes
command:kubectl get nodes -o wide
Sample output:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE aks-nodepool1-37663765-vmss000000 Ready agent 166m v1.25.6 10.224.0.33 <none> Ubuntu 22.04.2 LTS aks-nodepool1-37663765-vmss000001 Ready agent 166m v1.25.6 10.224.0.4 <none> Ubuntu 22.04.2 LTS aksnpwin000000 Ready agent 160m v1.25.6 10.224.0.62 <none> Windows Server 2022 Datacenter
Use the
kubectl debug
command to start a privileged container on your node and connect to it.kubectl debug node/aks-nodepool1-37663765-vmss000000 -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
Sample output:
Creating debugging pod node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx with container debugger on node aks-nodepool1-37663765-vmss000000. If you don't see a command prompt, try pressing enter. root@aks-nodepool1-37663765-vmss000000:/#
You now have access to the node through a privileged container as a debugging pod.
Note
You can interact with the node session by running
chroot /host
from the privileged container.
Exit kubectl debug mode
When you're done with your node, enter the exit
command to end the interactive shell session. After the interactive container session closes, delete the debugging pod used with kubectl delete pod
.
kubectl delete pod node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx
Windows Server proxy connection for SSH
Follow these steps as a workaround to connect with SSH on a Windows Server node.
Create a proxy server
At this time, you can't connect to a Windows Server node directly by using kubectl debug
. Instead, you need to first connect to another node in the cluster with kubectl
, then connect to the Windows Server node from that node using SSH.
To connect to another node in the cluster, use the kubectl debug
command. For more information, follow the above steps in the kubectl section. Create an SSH connection to the Windows Server node from another node using the SSH keys provided when you created the AKS cluster and the internal IP address of the Windows Server node.
Important
The following steps for creating the SSH connection to the Windows Server node from another node can only be used if you created your AKS cluster using the Azure CLI with the --generate-ssh-keys
parameter. If you want to use your own SSH keys instead, you can use the az aks update
to manage SSH keys on an existing AKS cluster. For more information, see manage SSH node access.
Note
If your Linux proxy node is down or unresponsive, use the Azure Bastion method to connect instead.
Use the
kubectl debug
command to start a privileged container on your proxy (Linux) node and connect to it.kubectl debug node/aks-nodepool1-37663765-vmss000000 -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
Sample output:
Creating debugging pod node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx with container debugger on node aks-nodepool1-37663765-vmss000000. If you don't see a command prompt, try pressing enter. root@aks-nodepool1-37663765-vmss000000:/#
Open a new terminal window and use the
kubectl get pods
command to get the name of the pod started bykubectl debug
.kubectl get pods
Sample output:
NAME READY STATUS RESTARTS AGE node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx 1/1 Running 0 21s
In the sample output, node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx is the name of the pod started by
kubectl debug
.Use the
kubectl port-forward
command to open a connection to the deployed pod:kubectl port-forward node-debugger-aks-nodepool1-37663765-vmss000000-bkmmx 2022:22
Sample output:
Forwarding from 127.0.0.1:2022 -> 22 Forwarding from [::1]:2022 -> 22
The previous example begins forwarding network traffic from port
2022
on your development computer to port22
on the deployed pod. When usingkubectl port-forward
to open a connection and forward network traffic, the connection remains open until you stop thekubectl port-forward
command.Open a new terminal and run the command
kubectl get nodes
to show the internal IP address of the Windows Server node:kubectl get no -o custom-columns=NAME:metadata.name,'INTERNAL_IP:status.addresses[?(@.type == \"InternalIP\")].address'
Sample output:
NAME INTERNAL_IP aks-nodepool1-19409214-vmss000003 10.224.0.8
In the previous example, 10.224.0.62 is the internal IP address of the Windows Server node.
Create an SSH connection to the Windows Server node using the internal IP address, and connect to port
22
through port2022
on your development computer. The default username for AKS nodes is azureuser. Accept the prompt to continue with the connection. You're then provided with the bash prompt of your Windows Server node:ssh -o 'ProxyCommand ssh -p 2022 -W %h:%p [email protected]' [email protected]
Sample output:
The authenticity of host '10.224.0.62 (10.224.0.62)' can't be established. ECDSA key fingerprint is SHA256:1234567890abcdefghijklmnopqrstuvwxyzABCDEFG. Are you sure you want to continue connecting (yes/no)? yes
Note
If you prefer to use password authentication, include the parameter
-o PreferredAuthentications=password
. For example:ssh -o 'ProxyCommand ssh -p 2022 -W %h:%p [email protected]' -o PreferredAuthentications=password [email protected]
Use Host Process Container to access Windows node
Create
hostprocess.yaml
with the following content and replacingAKSWINDOWSNODENAME
with the AKS Windows node name.apiVersion: v1 kind: Pod metadata: labels: pod: hpc name: hpc spec: securityContext: windowsOptions: hostProcess: true runAsUserName: "NT AUTHORITY\\SYSTEM" hostNetwork: true containers: - name: hpc image: mcr.microsoft.com/windows/servercore:ltsc2022 # Use servercore:1809 for WS2019 command: - powershell.exe - -Command - "Start-Sleep 2147483" imagePullPolicy: IfNotPresent nodeSelector: kubernetes.io/os: windows kubernetes.io/hostname: AKSWINDOWSNODENAME tolerations: - effect: NoSchedule key: node.kubernetes.io/unschedulable operator: Exists - effect: NoSchedule key: node.kubernetes.io/network-unavailable operator: Exists - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists
Run
kubectl apply -f hostprocess.yaml
to deploy the Windows host process container (HPC) in the specified Windows node.Use
kubectl exec -it [HPC-POD-NAME] -- powershell
.You can run any PowerShell commands inside the HPC container to access the Windows node.
Note
You need to switch the root folder to C:\
inside the HPC container to access the files in the Windows node.
SSH using Azure Bastion for Windows
If your Linux proxy node isn't reachable, using Azure Bastion as a proxy is an alternative. This method requires that you set up an Azure Bastion host for the virtual network in which the cluster resides. See Connect with Azure Bastion for more details.
SSH using private IPs from the AKS API (preview)
If you don't have access to the Kubernetes API, you can get access to properties such as Node IP
and Node Name
through the AKS agent pool API (preview), (available on preview versions 07-02-2023
or above) to connect to AKS nodes.
Important
AKS preview features are available on a self-service, opt-in basis. Previews are provided "as is" and "as available," and they're excluded from the service-level agreements and limited warranty. AKS previews are partially covered by customer support on a best-effort basis. As such, these features aren't meant for production use. For more information, see the following support articles:
Create an interactive shell connection to a node using the IP address
For convenience, AKS nodes are exposed on the cluster's virtual network through private IP addresses. However, you need to be in the cluster's virtual network to SSH into the node. If you don't already have an environment configured, you can use Azure Bastion to establish a proxy from which you can SSH to cluster nodes. Make sure the Azure Bastion is deployed in the same virtual network as the cluster.
Obtain private IPs using the
az aks machine list
command, targeting all the VMs in a specific node pool with the--nodepool-name
flag.az aks machine list --resource-group myResourceGroup --cluster-name myAKSCluster --nodepool-name nodepool1 -o table
The following example output shows the internal IP addresses of all the nodes in the node pool:
Name Ip Family --------------------------------- ----------- ----------- aks-nodepool1-33555069-vmss000000 10.224.0.5 IPv4 aks-nodepool1-33555069-vmss000001 10.224.0.6 IPv4 aks-nodepool1-33555069-vmss000002 10.224.0.4 IPv4
To target a specific node inside the node pool, use the
--machine-name
flag:az aks machine show --cluster-name myAKScluster --nodepool-name nodepool1 -g myResourceGroup --machine-name aks-nodepool1-33555069-vmss000000 -o table
The following example output shows the internal IP address of all the specified node:
Name Ip Family --------------------------------- ----------- ----------- aks-nodepool1-33555069-vmss000000 10.224.0.5 IPv4
SSH to the node using the private IP address you obtained in the previous step. This step is applicable for Linux machines only. For Windows machines, see Connect with Azure Bastion.
ssh -i /path/to/private_key.pem [email protected]
Next steps
If you need more troubleshooting data, you can view the kubelet logs or view the Kubernetes control plane logs.
To learn about managing your SSH keys, see Manage SSH configuration.
Azure Kubernetes Service