Appendix 1: HPC Cluster Networking
Windows HPC Server 2008 supports five cluster topologies that address a wide range of user needs as well as performance, scalability, manageability, and access requirements. These topologies are distinguished by how the compute nodes in the cluster are connected to each other and to the enterprise network. Depending on the network topology that you choose for your cluster, the head node can provide certain network services, such as Dynamic Host Configuration Protocol (DHCP) and network address translation (NAT), to the compute nodes.
Choose the network topology for your cluster well before you set up the HPC cluster.
This section includes the following topics:
- HPC cluster networks
- Supported HPC cluster network topologies
- HPC network services
- Windows Firewall configuration
HPC cluster networks
The following table lists and describes the networks to which an HPC cluster can be connected.
Network Name | Description |
---|---|
Enterprise network | An organizational network to which the head node, and optionally the compute nodes, are connected. The enterprise network is often the network that most users in an organization log on to when doing their work. All intra-cluster management and deployment traffic is carried on the enterprise network unless a private network (and optionally, an application network) also connects the cluster nodes. |
Private network | A dedicated network that carries intra-cluster communication between nodes. This network carries management, deployment, and application traffic if no application network exists. |
Application network | A dedicated network, preferably with high bandwidth and low latency. These characteristics matter because this network carries latency-sensitive traffic, such as parallel Message Passing Interface (MPI) application communication between compute nodes. |
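The rules in the table above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `carrying_network` and the traffic labels are hypothetical, and the case where application traffic falls back to the enterprise network is an assumption inferred from the table rather than something it states outright.

```python
from typing import Literal

# Hypothetical labels for the three kinds of intra-cluster traffic named in the table.
Traffic = Literal["management", "deployment", "application"]


def carrying_network(traffic: Traffic, has_private: bool, has_application: bool) -> str:
    """Return the network expected to carry the given intra-cluster traffic."""
    # An application network, when present, carries application (for example, MPI) traffic.
    if traffic == "application" and has_application:
        return "application"
    # A private network carries management and deployment traffic, and also
    # application traffic when no application network exists.
    if has_private:
        return "private"
    # Otherwise everything stays on the enterprise network (assumed for
    # application traffic; the table states it for management and deployment).
    return "enterprise"


print(carrying_network("application", has_private=True, has_application=False))
```

With a private network and no application network, the call above reports that application traffic is carried on the private network.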
Supported HPC cluster network topologies
There are five cluster topologies supported by Windows HPC Server 2008:
- Topology 1: Compute Nodes Isolated on a Private Network
- Topology 2: All Nodes on Enterprise and Private Networks
- Topology 3: Compute Nodes Isolated on Private and Application Networks
- Topology 4: All Nodes on Enterprise, Private, and Application Networks
- Topology 5: All Nodes on an Enterprise Network
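As a planning aid, the five topologies can be written down as data. The sketch below is an assumption-laden illustration: the names `COMPUTE_NODE_NETWORKS` and `isolated_topologies` are hypothetical, and the network membership of the compute nodes is read directly from the topology names in the list above.

```python
E, P, A = "enterprise", "private", "application"

# Networks that the compute nodes are connected to in each topology,
# taken from the topology names. The head node is always on the enterprise network.
COMPUTE_NODE_NETWORKS = {
    1: {P},
    2: {E, P},
    3: {P, A},
    4: {E, P, A},
    5: {E},
}


def isolated_topologies() -> list:
    """Return the topologies whose compute nodes are not on the enterprise network."""
    return sorted(n for n, nets in COMPUTE_NODE_NETWORKS.items() if E not in nets)


print(isolated_topologies())
```

The topologies returned by `isolated_topologies` are the ones in which compute nodes can reach enterprise resources only through a NAT or routing service.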
Topology 1: Compute nodes isolated on a private network
The following image illustrates how the head node and the compute nodes are connected to the cluster networks in this topology:
The following table lists and describes details about the different components in this topology:
Component | Description |
---|---|
Network adapters | |
Traffic | |
Network services | |
Security | |
Considerations when selecting this topology | |
Topology 2: All nodes on enterprise and private networks
The following image illustrates how the head node and the compute nodes are connected to the cluster networks in this topology:
The following table lists and describes details about the different components in this topology:
Component | Description |
---|---|
Network adapters | |
Traffic | |
Network services | |
Security | |
Considerations when selecting this topology | |
Topology 3: Compute nodes isolated on private and application networks
The following image illustrates how the head node and the compute nodes are connected to the cluster networks in this topology:
The following table lists and describes details about the different components in this topology:
Component | Description |
---|---|
Network adapters | |
Traffic | |
Network services | |
Security | |
Considerations when selecting this topology | |
Topology 4: All nodes on enterprise, private, and application networks
The following image illustrates how the head node and the compute nodes are connected to the cluster networks in this topology:
The following table lists and describes details about the different components in this topology:
Component | Description |
---|---|
Network adapters | |
Traffic | |
Network services | |
Security | |
Considerations when selecting this topology | |
Topology 5: All nodes on an enterprise network
The following image illustrates how the head node and the compute nodes are connected to the cluster networks in this topology:
The following table lists and describes details about the different components in this topology:
Component | Description |
---|---|
Network adapters | |
Traffic | |
Network services | |
Security | |
Considerations when selecting this topology | |
HPC network services
Depending on the network topology that you have chosen for your HPC cluster, the following network services can be provided by the head node to the compute nodes connected to the different cluster networks:
- Network Address Translation (NAT)
- Dynamic Host Configuration Protocol (DHCP) server
This section describes these HPC network services.
Network address translation (NAT)
Network address translation (NAT) provides a method for translating Internet Protocol version 4 (IPv4) addresses of computers on one network into IPv4 addresses of computers on a different network.
Enabling NAT on the head node lets compute nodes on the private or application networks access resources on the enterprise network. You do not need to enable NAT on the head node if another server provides NAT or routing services on the private or application networks. You also do not need NAT if all nodes are connected to the enterprise network.
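The conditions in the preceding paragraph can be expressed as a short decision function. This is a minimal sketch, not part of the product: the function name `head_node_nat_needed` and its parameters are hypothetical, and it assumes the compute nodes need to reach enterprise resources at all.

```python
def head_node_nat_needed(compute_nodes_on_enterprise: bool,
                         other_nat_or_router_present: bool) -> bool:
    """Decide whether NAT should be enabled on the head node."""
    # Nodes that are directly on the enterprise network need no translation.
    if compute_nodes_on_enterprise:
        return False
    # Another server already provides NAT or routing for the isolated networks.
    if other_nat_or_router_present:
        return False
    # Isolated compute nodes with no other path to the enterprise network.
    return True


print(head_node_nat_needed(compute_nodes_on_enterprise=False,
                           other_nat_or_router_present=False))
```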
DHCP server
A DHCP server assigns IP addresses to network clients. Depending on the detected configuration of your HPC cluster and the network topology that you choose, the compute nodes receive IP addresses from one of three sources: the head node running DHCP, a dedicated DHCP server on the private network, or a DHCP server on the enterprise network.
Windows Firewall configuration
Windows HPC Server 2008 opens firewall ports on the head node and compute nodes to enable internal services to run. By default, Windows Firewall is enabled only on the enterprise network, and disabled on the private and application networks to provide the best performance and manageability experience.
Important
If you have applications that require access to the head node or to the cluster nodes on specific ports, you will have to manually open those ports in Windows Firewall.
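One way to open a port manually is the `netsh advfirewall firewall add rule` command, which is available on Windows Server 2008. The following Python sketch only builds that command line; the helper name `allow_inbound_tcp_rule`, the rule name, and port 8080 are hypothetical placeholders for your own application.

```python
from typing import List


def allow_inbound_tcp_rule(rule_name: str, port: int) -> List[str]:
    """Build a netsh command that allows inbound TCP traffic on one port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={rule_name}",   # display name of the rule (hypothetical)
        "dir=in",              # inbound traffic
        "action=allow",
        "protocol=TCP",
        f"localport={port}",
    ]


# Example with a hypothetical application listening on TCP port 8080.
print(" ".join(allow_inbound_tcp_rule("MyApp-TCP-8080", 8080)))
```

The printed command would need to be run from an elevated command prompt on each node that hosts the application.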
Firewall ports required by Windows HPC Server 2008
The following table lists all the ports that are opened by Windows HPC Server 2008 for communication between cluster services on the head node and the compute nodes.
Port Number (TCP) | Required By |
---|---|
5969 | Required by the client tools on the enterprise network to connect to the HPC Job Scheduler Service on the head node. |
9892, 9893 | Used by the HPC Management Service on the compute nodes to communicate with the HPC System Definition Model (SDM) Service on the head node. |
5970 | Used for communication between the HPC Management Service on the compute nodes and the HPC Job Scheduler Service on the head node. |
9794 | Used for communication between ExecutionClient.exe on the compute nodes and the HPC Management Service on the head node. ExecutionClient.exe is used during the deployment process of a compute node. It performs tasks such as imaging the computer, installing all the necessary HPC components, and joining the computer to the domain. |
9087, 9088, 9089 | Used for communication between the client application on the enterprise network and the services provided by the Windows Communication Foundation (WCF) broker node. |
1856 | Used by the HPC Job Scheduler Service on the head node to communicate with the HPC Node Manager Service on the compute nodes. |
8677 | Used for communication between the HPC MPI Service on the head node and the HPC MPI Service on the compute nodes. |
6729 | Used for management services traffic coming from the compute nodes to the head node or WCF broker node. |
5800 | Used for communication between the HPC command-line tools on the enterprise network and the HPC Job Scheduler Service on the head node. |
5801 | Used by the remote node service on the enterprise network to enumerate nodes in a node group, or to bring a node online or take it offline. |
5999 | Used by HPC Cluster Manager on the enterprise network to communicate with the HPC Job Scheduler Service on the head node. |
443 | Used by the clients on the enterprise network to connect to the HPC Basic Profile Web Service on the head node. |