05.02.2022

Cluster of two Windows 10 computers. Desktop cluster. Network settings requirements


I built my first “cluster” of single-board computers almost immediately after the Orange Pi PC microcomputer began to gain popularity. It could hardly be called a “cluster”, because from a formal point of view it was just a local network of four boards that “saw” each other and could access the Internet.

The device participated in the SETI@home project and even managed to crunch some numbers. But, unfortunately, no one came to pick me up from this planet.
Still, all that time spent fiddling with wires, connectors and microSD cards taught me a lot. For example, I found out that you should not trust the declared rating of a power supply, that it pays to distribute the load across the rails, and that wire cross-section matters.

And yes, I had to jury-rig the power management system, because the simultaneous start of five single-boarders can require an inrush current of around 8-10 A (5 boards × 2 A)! That is a lot, especially for power supplies made in basement workshops, the same places we so love to order all sorts of ... interesting gadgets from.

I'll probably start with that power management system. The task boiled down to something relatively simple: after a given delay, switch on four channels in sequence, each supplying 5 volts. The easiest way to do that is with an Arduino (which every self-respecting geek has lying around in abundance) and one of those 4-relay boards from AliExpress.

And you know, it even worked.

However, the refrigerator-like relay clicks at start-up were off-putting. Firstly, every click sent a spike of noise through the power supply, so capacitors had to be added; secondly, the whole contraption was rather bulky.

So one day I simply replaced the relay board with IRL520-based MOSFET switches.

This solved the interference problem, but since the MOSFET switches the low side ("ground"), I had to give up the brass standoffs in the rack so as not to accidentally tie the boards' grounds together.

The solution now replicates nicely: two clusters are already running stably, with no surprises. Just as planned.

But back to replicability. Why buy dedicated power supplies for serious money when affordable ATX units are literally lying underfoot?
Moreover, they provide all the needed voltages (5 V, 12 V, 3.3 V), rudimentary self-diagnostics and the possibility of software control.

I won't go into detail here - there is a separate article about controlling an ATX power supply with an Arduino.

So, all the pills swallowed and the stamps stuck on? Time to put it all together.

There will be one head node that connects to the outside world via WiFi and gives "internets" to the cluster. It will be powered by ATX standby voltage.

Internet distribution itself is handled by TBNG, so if desired the cluster nodes can be hidden behind Tor.
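For reference, if you were not using TBNG, a bare-bones NAT setup on the head node might look like the sketch below. This is only an illustration: the interface names (wlan0 upstream, eth0 towards the cluster) are assumptions, not the author's actual configuration.

#!/usr/bin/env sh
# Minimal NAT sketch: forward cluster traffic (eth0, assumed) out through Wi-Fi (wlan0, assumed)
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT
iptables -A FORWARD -i wlan0 -o eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT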

There is also a custom board connected to this head node via I2C. It can switch each of the 10 worker nodes on and off, and it can drive three 12 V fans to cool the whole system.

The scenario is as follows: when the ATX supply gets 220 V mains power, the head node starts. Once the system is ready, it sequentially powers on all 10 nodes and the fans.
When the power-on sequence is complete, the head node goes around each worker node and asks how it is doing and what its temperature is. If one of the racks runs hot, the airflow is increased.
On the shutdown command, each node is gracefully shut down and then de-energized.
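The author's tools are written in Python; purely as an illustration, a shell sketch of such a polling loop might look like this. The node names, SSH access and the sysfs temperature path are assumptions; the fan tool and its flags are the ones shown later in this note.

#!/usr/bin/env sh
# Hypothetical polling sketch: read each worker's SoC temperature over SSH
# and raise the fans if any rack runs hot. Node names and paths are assumed.
MAX=0
for node in node1 node2 node3; do
    t=$(ssh "$node" cat /sys/class/thermal/thermal_zone0/temp)  # millidegrees C on many ARM boards
    t=$((t / 1000))
    [ "$t" -gt "$MAX" ] && MAX=$t
done
if [ "$MAX" -gt 60 ]; then
    /home/zno/i2creobus/i2creobus_tool.py --fan 0 --set 100
else
    /home/zno/i2creobus/i2creobus_tool.py --fan 0 --set 60
fi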

I drew the schematic myself, so it looks scary. Fortunately, a properly trained person took over the PCB layout and manufacturing, for which many thanks to him.

Here it is in the process of being assembled.

Here is one of the first sketches of the cluster component layout, made on a sheet of squared paper and immortalized with Office Lens on a phone.

The whole structure is placed on a sheet of textolite, bought for the occasion.

This is what the arrangement of the nodes inside looks like. Two racks of five boards.

Here you can see the control Arduino. It is connected to the head Orange Pi PC via I2C through a level converter.

And here is the final (current) version.

All that remains is to write a few Python utilities to conduct this orchestra: power the nodes on and off and adjust the fan speed.

I won't bore you with the technical details - it looks something like this:

#!/usr/bin/env sh

echo "Starting ATX board..."
/home/zno/i2creobus/i2catx_tool.py --start
echo "Setting initial fan values..."
/home/zno/i2creobus/i2creobus_tool.py --fan 0 --set 60
/home/zno/i2creobus/i2creobus_tool.py --fan 1 --set 60
/home/zno/i2creobus/i2creobus_tool.py --fan 2 --set 60

Since we already have as many as 10 nodes, we adopt Ansible, which helps, for example, to shut down all the nodes correctly, or to run a temperature monitor on each of them (a sample invocation is shown after the playbook below).

---
- hosts: workers
  roles:
    - webmon_stop
    - webmon_remove
    - webmon_install
    - webmon_start
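Running such a playbook against the worker group might look like this; the inventory and playbook file names below are placeholders, not the author's actual files.

ansible-playbook -i hosts.ini webmon.yml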

I am often accused of being dismissive, told that this is just a local network of single-boarders (as I mentioned at the very beginning). Frankly, I could not care less about other people's opinions, but let's add some glamour anyway and set up a Docker Swarm cluster.
The task is very simple and takes less than 10 minutes; a rough command sequence is sketched below. Then we run a Portainer instance on the head node, and voila!
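A minimal sketch of that setup, assuming the usual Portainer CE image; the addresses and the join token are placeholders you have to substitute from your own environment.

# on the head node: initialise the swarm (the address is a placeholder)
docker swarm init --advertise-addr <head-node-ip>
# on each worker node: join with the token printed by the previous command
docker swarm join --token <worker-token> <head-node-ip>:2377
# back on the head node: run Portainer with a web UI on port 9000
docker run -d -p 9000:9000 --name portainer \
    -v /var/run/docker.sock:/var/run/docker.sock \
    portainer/portainer-ce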

Now you can really scale tasks. At the moment, a Verium Reserve cryptocurrency miner is running in the cluster, and quite successfully. I hope what it mines will at least pay for the electricity consumed ;) Or I will reduce the number of nodes involved and mine something else, like Turtle Coin.

If you want a real payload, you can throw Hadoop at the cluster or set up load-balanced web servers (a sketch of the latter follows below). There are plenty of ready-made images on the Internet and enough training material, and if a suitable Docker image is missing, you can always build your own.
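For example, a replicated web service spread across the workers could be started roughly like this; the service name is arbitrary and the image must match your boards' architecture and OS (the official nginx image ships multi-arch variants).

docker service create --name web --replicas 5 --publish 80:80 nginx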

What did it teach me? The technology "stack" turned out to be pretty wide. Judge for yourself: Docker, Ansible, Python, a bit of Arduino hacking (forgive me for even bringing it up), and of course the shell. Plus KiCad and working with a contractor :).

What could be done better? A lot. On the software side, it would be nice to rewrite the control utilities in Go. On the hardware side, it could be made more steampunk: the lead image at the top sets the bar nicely. So there is plenty to work on.

Roles played:

  • Head node - an Orange Pi PC with a USB Wi-Fi adapter.
  • Worker nodes - Orange Pi PC2 x 10.
  • Network - a 100 Mbps Ethernet switch.
  • Brain - an Arduino clone based on the ATmega8, plus a level converter.
  • Heart - an ATX power controller together with the power supply itself.
  • Software (the soul) - Docker, Ansible, Python 3, a little shell and a little bit of laziness.
  • The time spent - priceless.

During the experiments, a couple of Orange Pi PC2 boards suffered from a mixed-up power connection (they burn very prettily); another PC2 lost its Ethernet (that is a separate story, and I still do not understand the physics of it).

That seems to be the whole story, at least the top-level view. If you find it interesting, ask questions in the comments and upvote the questions you like (each comment has a button for that). The most interesting ones will be covered in future notes.
Thank you for reading to the end.

Introduction

A server cluster is a group of independent servers managed by the cluster service that work together as a single system. Server clusters are created by bringing multiple Windows® 2000 Advanced Server and Windows 2000 Datacenter Server-based servers together to provide high availability, scalability, and manageability for resources and applications.

The task of a server cluster is to provide continuous user access to applications and resources in cases of hardware or software failures or planned equipment shutdowns. If one of the cluster servers becomes unavailable due to a failure or shutdown for maintenance, information resources and applications are redistributed among the remaining available cluster nodes.

For cluster systems, the term "high availability" is preferred over "fault tolerance", because fault-tolerance technologies require a higher level of hardware resilience and dedicated recovery mechanisms. As a rule, fault-tolerant servers use a high degree of hardware redundancy plus specialized software that allows almost immediate recovery from any single software or hardware failure. These solutions are significantly more expensive than cluster technologies, because organizations are forced to pay for additional hardware that sits idle most of the time and is used only in case of failures. Fault-tolerant servers are used for transaction-intensive, high-value applications such as payment processors, ATMs or stock exchanges.

Although the Cluster service does not guarantee non-stop operation, it provides a level of availability sufficient for most mission-critical applications. The Cluster service can monitor applications and resources, automatically detecting failure conditions and recovering the system once they are resolved. This allows more flexible workload management within the cluster and improves overall system availability.

Key benefits of using the Cluster service:

  • High availability. In the event of a node failure, the cluster service transfers control of resources, such as hard disks and network addresses, to the active cluster node. When a software or hardware failure occurs, the cluster software restarts the failed application on the live node, or shifts the entire load of the failed node to the remaining live nodes. In this case, users may notice only a short delay in service.
  • Failback. The Cluster service automatically redistributes the workload within the cluster when a failed node becomes available again.
  • Manageability. Cluster Administrator is a snap-in that you can use to manage the cluster as a single system, as well as to manage applications. It provides a transparent view of applications as if they were running on a single server. You can move applications to different servers within the cluster by dragging and dropping cluster objects, and move data in the same way. This can be used to manually distribute the server workload, or to offload a server and then stop it for scheduled maintenance. In addition, Cluster Administrator lets you remotely monitor the state of the cluster and of all its nodes and resources.
  • Scalability. To ensure that cluster performance can always keep up with growing demands, the Cluster service is designed to scale. If the overall performance of the cluster becomes insufficient to handle the load generated by the clustered applications, additional nodes can be added to the cluster.

This document provides instructions for installing the Cluster service on servers running Windows 2000 Advanced Server and Windows 2000 Datacenter Server. It does not cover installing and configuring clustered applications; it only walks you through the installation of a simple two-node cluster.

System requirements for creating a server cluster

The following checklists will help you prepare for the installation. Step-by-step installation instructions follow after them.

Software Requirements

  • Microsoft Windows 2000 Advanced Server or Windows 2000 Datacenter Server operating system installed on all servers in the cluster.
  • An installed name resolution service such as Domain Name System (DNS), Windows Internet Naming System (WINS), HOSTS, etc.
  • Terminal Services for remote cluster administration. This is not mandatory; it is recommended only for convenience of cluster management.

Hardware Requirements

  • The hardware requirements for a cluster node are the same as those for installing the Windows 2000 Advanced Server or Windows 2000 Datacenter Server operating systems. These requirements can be found on the Microsoft directory search page.
  • The cluster hardware must be certified and listed on the Microsoft Cluster Service Hardware Compatibility List (HCL). The latest version of this list can be found on the Windows 2000 Hardware Compatibility List search page of the Microsoft directory by selecting the "Cluster" search category.

Two HCL-qualified computers, each with:

  • A hard drive with a bootable system partition and Windows 2000 Advanced Server or Windows 2000 Datacenter Server installed. This drive must not be connected to the shared storage bus discussed below.
  • A separate PCI Fibre Channel or SCSI controller for connecting the external shared storage device. This controller must be present in addition to the boot disk controller.
  • Two PCI network adapters installed on each computer in the cluster.
  • The external disk storage device listed in the HCL that is attached to all nodes in the cluster. It will act as a cluster disk. A configuration using hardware RAID arrays is recommended.
  • Cables for connecting a shared storage device to all computers. Refer to the manufacturer's documentation for instructions on configuring storage devices. If you are connecting to a SCSI bus, you can refer to Appendix A for more information.
  • All hardware on the cluster computers must be completely identical. This will simplify the configuration process and save you from potential compatibility issues.

Network Configuration Requirements

  • Unique NetBIOS name for the cluster.
  • Five unique static IP addresses: two for private network adapters, two for public network adapters, and one for the cluster.
  • Domain account for the cluster service (all cluster nodes must be members of the same domain)
  • Each node must have two network adapters - one for connecting to the public network, one for intra-cluster communication of nodes. A configuration using a single network adapter to connect to a public and private network at the same time is not supported. A separate network adapter for the private network is required to comply with HCL requirements.

Requirements for shared storage drives

  • All shared storage drives, including the quorum drive, must be physically connected to the shared bus.
  • All disks connected to the shared bus must be available to each node. This can be verified during the installation and configuration phase of the host adapter. Refer to the adapter manufacturer's documentation for detailed instructions.
  • SCSI devices must be assigned unique SCSI target ID numbers, and the SCSI bus must be properly terminated, according to the manufacturer's instructions.
  • All shared storage disks must be configured as basic disks (not dynamic).
  • All partitions on shared storage drives must be formatted with the NTFS file system.

It is highly recommended that all shared storage drives be configured into hardware RAID arrays. Although not required, creating fault-tolerant RAID configurations is key to protecting against disk failures.

Installing a cluster

General overview of the installation

During the installation process, some nodes will be shut down and some will be rebooted. This is necessary in order to ensure the integrity of data located on disks connected to the common bus of an external storage device. Data corruption can occur when multiple nodes simultaneously attempt to write to the same drive that is not protected by the cluster software.

Table 1 will help you determine which nodes and storage devices must be enabled for each step of the installation.

This guide describes how to create a two-node cluster. However, if you are setting up a cluster with more than two nodes, you can use the column value "Node 2" to determine the state of the remaining nodes.

Table 1. Sequence of enabling devices during cluster installation

Step | Node 1 | Node 2 | Storage device | Comment
Setting network parameters | On | On | Off | Make sure all storage devices connected to the shared bus are turned off. Turn on all nodes.
Setting up shared drives | On | Off | On | Turn off all nodes. Power on the shared storage device, then power on the first node.
Checking the configuration of shared drives | Off | On | On | Turn off the first node, turn on the second node. Repeat for nodes 3 and 4 if necessary.
Configuring the first node | On | Off | On | Turn off all nodes; turn on the first node.
Configuring the second node | On | On | On | After successfully configuring the first node, power on the second node. Repeat for nodes 3 and 4 if necessary.
Completing the installation | On | On | On | At this point, all nodes should be turned on.

Before installing the cluster software, you must complete the following steps:

  • Install Windows 2000 Advanced Server or Windows 2000 Datacenter Server on each computer in the cluster.
  • Configure network settings.
  • Set up shared storage drives.

Complete these steps on each node of the cluster before installing the Cluster service on the first node.

To configure the Cluster service on a server running Windows 2000, your account must have administrative rights on each node. All cluster nodes must be either member servers of the same domain or controllers of the same domain; mixing member servers and domain controllers in one cluster is not allowed.

Installing the Windows 2000 operating system

To install Windows 2000 on each cluster node, refer to the documentation that came with your operating system.

This document uses the naming structure from the manual "Step-by-Step Guide to a Common Infrastructure for Windows 2000 Server Deployment". However, you can use any names.

You must be logged in with an administrator account before starting the installation of the cluster service.

Configuring network settings

Note: At this point in the installation, turn off all shared storage devices, and then turn on all nodes. You must prevent multiple nodes from accessing shared storage at the same time until the Cluster service is installed on at least one of the nodes and that node is powered on.

Each node must have at least two network adapters installed - one to connect to the public network and one to connect to the private network of the cluster nodes.

The private network network adapter provides communication between nodes, communication of the current state of the cluster, and management of the cluster. Each node's public network adapter connects the cluster to the public network of client computers.

Make sure all network adapters are physically connected correctly: private network adapters are only connected to other private network adapters, and public network adapters are connected to public network switches. The connection diagram is shown in Figure 1. Perform this check on each node of the cluster before proceeding to configure the shared storage drives.

Figure 1: An example of a two-node cluster

Configuring a private network adapter

Perform these steps on the first node of your cluster.

  1. Right-click the My Network Places icon and select Properties.
  2. Right-click the Local Area Connection 2 icon.

Note: Which network adapter will serve the private network and which one the public network depends on the physical connection of the network cables. In this document, we will assume that the first adapter (Local Area Connection) is connected to the public network and the second adapter (Local Area Connection 2) is connected to the cluster's private network. In your case, this may not be the case.

  1. Select Status. The Local Area Connection 2 Status window shows the connection state and speed. If the connection is disconnected, check the cables and connections and fix the problem before continuing. Click Close.
  2. Right-click the Local Area Connection 2 icon again, select Properties, and click Configure.
  3. Select the Advanced tab. The window shown in Figure 2 will appear.
  4. For private network adapters, the speed must be set manually instead of left at the default value. Specify your network speed in the drop-down list. Do not use the "Auto Sense" or "Auto Select" values, because some network adapters drop packets while negotiating the connection speed. To set the network adapter speed, specify the actual value for the Connection Type or Speed parameter.

Figure 2: Network adapter advanced settings

All network adapters in the cluster that are connected to the same network must be configured identically and use the same values for parameters such as Duplex Mode, Flow Control, Connection Type, and so on. Even if different nodes use different network hardware, the values of these parameters must be the same.

  1. Select Internet Protocol (TCP/IP) in the list of components used by the connection.
  2. Click Properties.
  3. Select Use the following IP address and enter the address 10.1.1.1. (For the second node, use the address 10.1.1.2.)
  4. Set the subnet mask to 255.0.0.0.
  5. Click Advanced and select the WINS tab. Select Disable NetBIOS over TCP/IP. Click OK to return to the previous menu. Perform this step only for the private network adapter.

Your dialog box should look like Figure 3.

Figure 3: Private network connection IP address

Configuring a public network adapter

Note: If a DHCP server is running on a public network, an IP address for the public network adapter may be assigned automatically. However, this method is not recommended for cluster node adapters. We strongly recommend that you assign permanent IP addresses to all public and private host NICs. Otherwise, if the DHCP server fails, access to the cluster nodes may not be possible. If you are forced to use DHCP for public network adapters, use long address leases to ensure that the dynamically assigned address remains valid even if the DHCP server becomes temporarily unavailable. Always assign permanent IP addresses to private network adapters. Keep in mind that the Cluster service can only recognize one network interface per subnet. If you need help with assigning network addresses in Windows 2000, see the operating system's built-in help.

Renaming network connections

For clarity, we recommend renaming your network connections. For example, you can rename Local Area Connection 2 to Private Cluster Connection. This will help you identify networks more easily and assign their roles correctly.

  1. Right-click the Local Area Connection 2 icon.
  2. In the context menu, select Rename.
  3. Type Private Cluster Connection in the text field and press ENTER.
  4. Repeat steps 1-3 for Local Area Connection and rename it to Public Cluster Connection.

Figure 4: Renamed network connections

  1. The renamed network connections should look like Figure 4. Close the Network and Dial-Up Connections window. The new network connection names are automatically replicated to the other cluster nodes when they are powered on.

Checking Network Connections and Name Resolutions

To verify that the configured network hardware is working, complete the following steps for all network adapters in each host. To do this, you must know the IP addresses of all network adapters in the cluster. You can get this information by running the command ipconfig on each node:

  1. Click Start, select Run, and type cmd in the text box. Click OK.
  2. Type the command ipconfig /all and press ENTER. You will see the IP configuration of each network adapter on the local machine.
  3. If you do not have a command prompt window open yet, repeat step 1.
  4. Type the command ping ipaddress, where ipaddress is the IP address of the corresponding network adapter on the other node. Assume, for example, that the network adapters have the following IP addresses:
Node number | Network connection name | Network adapter IP address
1 | Public Cluster Connection | 172.16.12.12
1 | Private Cluster Connection | 10.1.1.1
2 | Public Cluster Connection | 172.16.12.14
2 | Private Cluster Connection | 10.1.1.2

In this example, you need to run the commands ping 172.16.12.14 and ping 10.1.1.2 from node 1, and the commands ping 172.16.12.12 and ping 10.1.1.1 from node 2.

To check name resolution, run the command ping, using the computer name as the argument instead of its IP address. For example, to check name resolution for the first cluster node named hq-res-dc01, run the command ping hq-res-dc01 from any client computer.

Domain membership check

All nodes in the cluster must be members of the same domain and must be able to network with the domain controller and DNS server. The nodes can be configured as domain member servers or as controllers of the same domain. If you decide to make one of the nodes a domain controller, then all other nodes in the cluster must also be configured as domain controllers of the same domain. This guide assumes that all nodes are domain controllers.

Note: For links to additional documentation on configuring domains, DNS, and DHCP services in Windows 2000, see Related Resources at the end of this document.

  1. Right-click My Computer and select Properties.
  2. Select the Network Identification tab. In the System Properties dialog box you will see the full computer name and domain name. In our example, the domain is called reskit.com.
  3. If you configured the node as a member server, you can join it to the domain at this point. Click Properties and follow the instructions to join the computer to the domain.
  4. Close the System Properties and My Computer windows.

Create a Cluster Service Account

For the Cluster service, you must create a separate domain account under which it will run. The installer will ask you for the Cluster service credentials, so the account must be created before the service is installed. The account must not belong to any individual domain user and must be used exclusively for running the Cluster service.

  1. Click Start, select Programs / Administrative Tools, and start the Active Directory Users and Computers snap-in.
  2. Expand the reskit.com domain if it is not already expanded.
  3. Select Users in the tree.
  4. Right-click Users, select New from the context menu, then select User.
  5. Enter a name for the cluster service account, as shown in Figure 5, and click Next.

Figure 5: Adding a Cluster User

  1. Select the User Cannot Change Password and Password Never Expires check boxes. Click Next, then click Finish to create the user.

Note: If your administrative security policy does not allow the use of passwords that never expire, you will need to update the password and configure the Cluster service on each node before it expires.

  1. Right-click the Cluster user in the right pane of Active Directory Users and Computers.
  2. In the context menu, select Add members to a group.
  3. Select the Administrators group and click OK. The new account now has administrator privileges on the local computer.
  4. Close the Active Directory Users and Computers snap-in.

Configuring Shared Storage Drives

Warning: Ensure that at least one of the cluster nodes is running Windows 2000 Advanced Server or Windows 2000 Datacenter Server and that the cluster service is configured and running. Only then can you boot the Windows 2000 operating system on the remaining nodes. If these conditions are not met, the cluster disks may be damaged.

To start configuring shared storage drives, turn off all nodes. After that, turn on the shared storage device, then turn on node 1.

Quorum Disk

The quorum disk stores cluster database checkpoints and recovery log files and thus supports cluster management. We make the following recommendations for creating a quorum disk:

  • Create a small partition (at least 50MB in size) to use as the quorum disk. We generally recommend creating a 500 MB quorum disk.
  • Allocate a separate disk for the quorum resource. Since the entire cluster will fail if the quorum disk fails, we strongly recommend using a hardware RAID array.

During the installation of the Cluster service, you will need to assign a drive letter to the quorum. In our example, we will use the letter Q.

Configuring Shared Storage Drives

  1. Right-click My Computer and select Manage. In the window that opens, expand the Storage category.
  2. Select Disk Management.
  3. Make sure all shared storage drives are formatted as NTFS and have the status Basic. If you connect a new drive, the disk signing and upgrade wizard starts automatically. When the wizard starts, click Refresh to let it finish its work; after that the drive will be marked as Dynamic. To convert the disk back to basic, right-click Disk # (where # is the number of the disk you are working with) and select Revert to Basic Disk.

Right-click the unallocated area next to the corresponding disk.

  1. Select Create Partition.
  2. The Create Partition Wizard starts. Click Next twice.
  3. Enter the desired partition size in megabytes and click Next.
  4. Click Next, accepting the default drive letter.
  5. Click Next to format and create the partition.

Assign drive letters

After the data bus, disks, and shared storage partitions are configured, you must assign drive letters to all partitions on all disks in the cluster.

Note: Mount points are a file system feature that allows you to mount a file system using existing directories without assigning a drive letter. Mount points are not supported by clusters. Any external drive used as a cluster resource must be partitioned into NTFS partitions, and these partitions must be assigned drive letters.

  1. Right-click the desired partition and select Change Drive Letter and Path.
  2. Choose a new drive letter.
  3. Repeat steps 1 and 2 for all shared storage drives.

Figure 6: Drive partitions with assigned letters

  1. At the end of the procedure, the Computer Management snap-in window should look like Figure 6. Close the Computer Management snap-in.
Checking the operation and sharing of disks

  1. Click Start, select Programs / Accessories, and run Notepad.
  2. Type a few words and save the file under the name test.txt using the Save As command from the File menu. Close Notepad.
  3. Double-click the My Documents icon.
  4. Right-click the test.txt file and select Copy from the context menu.
  5. Close the window.
  6. Open My Computer.
  7. Double-click a disk partition of the shared storage device.
  8. Right-click and select Paste.
  9. A copy of the test.txt file should appear on the shared storage drive.
  10. Double-click the test.txt file to open it from the shared storage drive. Close the file.
  11. Select the file and press Delete to remove it from the cluster disk.

Repeat the procedure for all disks in the cluster to ensure they are accessible from the first node.

Now turn off the first node, turn on the second node and repeat the steps of the section Checking the operation and sharing of disks. Perform the same steps on all additional nodes. Once you have verified that all nodes can read and write information to the shared storage disks, turn off all but the first node and proceed to the next section.


Creating a cluster based on Windows 2000/2003. Step by step

A cluster is a group of two or more servers that work together to provide uptime for a set of applications or services and are perceived by the client as a single entity. Cluster nodes are interconnected using network hardware, shared resources, and server software.

Microsoft Windows 2000/2003 supports two clustering technologies: Network Load Balancing clusters and server clusters.

In the first case (load-balancing clusters), Network Load Balancing makes services and applications highly reliable and scalable by combining up to 32 servers into a single cluster. Requests from clients in this case are distributed among the cluster nodes in a transparent manner. When a node fails, the cluster automatically changes its configuration and switches the client to any of the available nodes. This cluster configuration mode is also called active-active mode, where a single application runs on multiple nodes.

The server cluster distributes its load among the servers in the cluster, with each server carrying its own load. If a node in the cluster fails, applications and services configured to run in the cluster are transparently restarted on any of the free nodes. Server clusters use shared disks to communicate within the cluster and to provide transparent access to the cluster's applications and services. They require special hardware, but this technology provides a very high level of reliability because the cluster itself does not have any single point of failure. This cluster configuration mode is also called active-passive mode. An application in a cluster runs on a single node with shared data located on external storage.

The cluster approach to organizing an internal network provides the following advantages:

  • High availability. If a service or application fails on one of the cluster nodes configured to work together in the cluster, the cluster software restarts that application on another node. Users will experience only a short delay during some operation, or will not notice the server failure at all.
  • Scalability. For applications running in a cluster, adding servers to the cluster means increasing capabilities: fault tolerance, load balancing, and so on.
  • Manageability. Using a single interface, administrators can manage applications and services, set how the cluster responds to a node failure, distribute load among the cluster nodes and remove load from nodes for preventive maintenance.

In this article, I will try to collect my experience in creating Windows-based cluster systems and give a short step-by-step guide to creating a two-node server cluster with shared data storage.

Software Requirements

  • Microsoft Windows 2000 Advanced (Datacenter) Server or Microsoft Windows 2003 Server Enterprise Edition installed on all servers in the cluster.
  • Installed DNS service. I'll explain a little. If you are building a cluster based on two domain controllers, then it is much more convenient to use the DNS service, which you set up when you created the Active Directory anyway. If you're creating a cluster based on two servers that are members of a Windows NT domain, then you'll either need to use WINS or match machine names and addresses in the hosts file.
  • Terminal Services for remote server management. Not necessarily, but if you have Terminal Services, it is convenient to manage servers from your workplace.

Hardware Requirements

  • The best choice of hardware for a cluster node is based on the Cluster Service Hardware Compatibility List (HCL). Microsoft recommends that hardware be tested for compatibility with Cluster Services.
  • Accordingly, you will need two servers, each with two network adapters and a SCSI adapter with an external interface for connecting the external data array.
  • An external array that has two external interfaces. Each of the cluster nodes is connected to one of the interfaces.

Comment: to create a two-node cluster, it is not necessary to have two absolutely identical servers. After a failure on the first server, you will have some time to analyze and restore the operation of the main node. The second node will work for the reliability of the system as a whole. However, this does not mean that the second server will be idle. Both nodes of the cluster can calmly go about their business, solve different problems. But we can set up a certain critical resource to work in a cluster, increasing its (this resource) fault tolerance.

Network settings requirements

  • Unique NetBIOS name for the cluster.
  • Five unique static IP addresses: two for the private cluster network adapters, two for the public network adapters, and one for the cluster itself.
  • Domain account for the Cluster service.
  • All cluster nodes must be either member servers of the domain or domain controllers.
  • Each server must have two network adapters. One for connecting to a common network (Public Network), the second for data exchange between cluster nodes (Private Network).

Comment: According to Microsoft recommendations, your server should have two network adapters, one for the general network, the second for data exchange within the cluster. Is it possible to build a cluster on one interface - probably yes, but I have not tried it.

Installing a cluster

When designing a cluster, you must understand that by using the same physical network for both cluster communication and LAN, you increase the failure rate of the entire system. Therefore, it is highly desirable for cluster data exchange to use one subnet allocated as a separate physical network element. And for the local network, you should use a different subnet. Thus, you increase the reliability of the entire system as a whole.

When building a two-node cluster, a single switch is used for the public network, and the two cluster servers can be connected directly with a crossover cable for the private network, as shown in the figure.

Installing a two-node cluster can be divided into 5 steps:

  • Installing and configuring nodes in a cluster.
  • Installing and configuring a shared resource.
  • Check disk configuration.
  • Configuring the first cluster node.
  • Configuring the second node in the cluster.

This step-by-step guide will help you avoid mistakes during installation and save a lot of time. So, let's begin.

Installing and configuring nodes

We will simplify the task a little. Since all cluster nodes must be either domain members or domain controllers, we will make the first cluster node the root domain controller holding the AD (Active Directory) directory, and the DNS service will run on it. The second cluster node will be a full domain controller.

I will skip the installation of the operating system, assuming you will have no problems with it, but I would like to explain the configuration of the network devices.

Network settings

Before starting the installation of the cluster and Active Directory, you must complete the network settings. I would like to divide all network settings into 4 stages. To resolve names on the network, it is desirable to have a DNS server with pre-existing records about the servers in the cluster.

Each server has two network cards. One network card will serve to exchange data between cluster nodes, the second will work for clients in our network. Accordingly, the first one will be called Private Cluster Connection, the second one will be called Public Cluster Connection.

The network adapter settings are identical on both servers. Accordingly, I will show how to configure one network adapter and provide a table with the network settings of all 4 network adapters on both cluster nodes. To configure a network adapter, follow these steps:

  • My Network Places → Properties
  • Private Cluster Connection → Properties → Configure → Advanced

    This step needs some explanation. According to Microsoft's strong recommendations, the speed of all network adapters on cluster nodes should be set manually to the actual value rather than left on auto-negotiation, as shown in the following figure.

  • Internet Protocol (TCP/IP) → Properties → Use the following IP: 192.168.30.1

    (For the second host, use 192.168.30.2). Enter the subnet mask 255.255.255.252 . Use 192.168.100.1 as the DNS server address for both hosts.

  • Additionally, on the Advanced → WINS tab, select Disable NetBIOS over TCP/IP. When configuring the network adapters for the public network, skip this item.
  • Do the same with the NIC for the Public Cluster Connection LAN, using the addresses given in the table. The only difference in the configuration of the two NICs is that the Public Cluster Connection does not require NetBIOS over TCP/IP to be disabled.

Use the following table to configure all network adapters on cluster nodes:

Node | Network name | IP address | Mask | DNS server
1 | Public Cluster Connection | 192.168.100.1 | 255.255.255.0 | 192.168.100.1
1 | Private Cluster Connection | 192.168.30.1 | 255.255.255.252 | 192.168.100.1
2 | Public Cluster Connection | 192.168.100.2 | 255.255.255.0 | 192.168.100.1
2 | Private Cluster Connection | 192.168.30.2 | 255.255.255.252 | 192.168.100.1

Installing Active Directory

Since my article does not aim to cover installing Active Directory, I will omit this point; quite a lot of recommendations and books have been written about it. Choose a domain name like mycompany.ru, install Active Directory on the first node, and add the second node to the domain as a domain controller. When you are done, check the configuration of the servers and Active Directory.

Installing a Cluster User Account

  • Start → Programs → Administrative Tools → Active Directory Users and Computers
  • Add a new user, for example ClusterService.
  • Check the boxes for: User Cannot Change Password and Password Never Expires .
  • Also add this user to the administrators group and give him the Log on as a service rights (the rights are assigned in the Local Security Policy and Domain Controller Security Policy).

Setting up an external data array

To set up an external data array in a cluster, remember that before installing the Cluster Service on the nodes you must first configure the disks on the external array, and only then install the cluster service, first on the first node and only then on the second. If you violate this order, you will fail to reach the goal. Can it be fixed? Probably yes - when an error occurs you will have time to correct the settings. But Microsoft is a mysterious enough thing that you never know exactly which pitfall you will hit, so it is easier to keep step-by-step instructions in front of you and not forget which buttons to press. Step by step, configuring the external array looks like this:

  1. Both servers must be turned off, external array turned on, connected to both servers.
  2. Turn on the first server. We get access to the disk array.
  3. Check that the external disk array was created as Basic. If not, convert the disk using the Revert to Basic Disk option.
  4. Create a small partition on the external drive through Computer Management → Disk Management. According to Microsoft's recommendations it should be at least 50 MB; I recommend creating a partition of 500 MB or a little more. This is quite enough to hold the cluster data. The partition must be formatted as NTFS.
  5. On both cluster nodes this partition will be assigned the same letter, for example Q. Accordingly, when creating the partition on the first server, select Assign the following drive letter - Q.
  6. You can mark the rest of the disk as you wish. Of course, it is highly desirable to use the NTFS file system. For example, when configuring DNS, WINS services, the main service databases will be transferred to a shared disk (not the Q system volume, but the second one you created). And for security reasons, it will be more convenient for you to use NTFS volumes.
  7. Close Disk Management and check access to the newly created partition. For example, you can create a text file test.txt on it, write it down and delete it. If everything went well, then we are done with the configuration of the external array on the first node.
  8. Now turn off the first server. The external array must remain powered on. Turn on the second server and check access to the created partition. Also check that the letter assigned to the first partition is identical to the one chosen earlier, i.e. Q.

This completes the external array configuration.

Installing Cluster Service Software

Configuration of the first cluster node

Before starting the installation of the Cluster Service software, all cluster nodes must be turned off and all external arrays must be turned on. Let's move on to the configuration of the first node: the external array is on, the first server is on. The entire installation process is carried out using the Cluster Service Configuration Wizard.


Configuration of the second node of the cluster

To install and configure the second cluster node, the first node must be enabled and all network drives must be enabled. The procedure for setting up the second node is very similar to the one I described above. However, there are some minor changes. To do this, use the following instruction:

  1. In the Create or Join a Cluster dialog, select The second or next node in the cluster and click Next.
  2. Enter the cluster name we set earlier (in the example it is MyCluster) and click Next.
  3. After connecting the second node to the cluster, the Cluster Service Configuration Wizard will automatically pick up all settings from the first node. To start the Cluster service, use the account name we created earlier.
  4. Enter the account password and click Next.
  5. In the next dialog box, click Finish to complete the installation.
  6. Cluster service will be started on the second node.
  7. Close the Add/Remove Programs window.

To install additional cluster nodes, use the same instructions.

Postscript, thanks

In order not to get confused with all the stages of installing a cluster, I will give a small table that reflects all the main stages.

Step | Node 1 | Node 2 | External array
Installing and configuring the nodes | On | On | Off
Installing and configuring the shared resource | On | Off | On
Checking the disk configuration | Off | On | On
Configuring the first node | On | Off | On
Configuring the second node | On | On | On

First of all, decide what components and resources will be required. You will need one master node, at least a dozen identical compute nodes, an Ethernet switch, a power distribution unit, and a rack. Determine the amount of wiring and cooling, as well as the amount of space you need. Also decide what IP addresses you want to use for the nodes, what software you will install and what technologies will be required to create parallel computing power (more on this below).

  • Although the hardware is expensive, all of the software in this article is free, and most of it is open source.
  • If you want to know how fast your supercomputer could theoretically be, use this tool:
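In case that tool is not at hand, the theoretical peak can also be estimated by hand: multiply the number of nodes, cores per node, clock speed and floating-point operations per cycle. The numbers below are purely illustrative:

# rough peak estimate: nodes * cores per node * clock (GHz) * FLOPs per cycle
awk 'BEGIN { print 10 * 4 * 1.2 * 4 " GFLOPS (theoretical peak)" }'
# prints: 192 GFLOPS (theoretical peak)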

Assemble the nodes. You will need to either build the hosts yourself or purchase pre-built servers.

  • Choose server chassis that make the most efficient use of space and power and provide efficient cooling.
  • Or you can "recycle" a dozen or so used servers - a few somewhat outdated machines together can be worth more than the sum of their parts and will save you a decent amount of money. All processors, network adapters, and motherboards should be identical so that the computers work well together. Of course, don't forget RAM and hard drives for each node, and at least one optical drive for the master node.
  • Install the servers in the rack. Start at the bottom so the rack is not top-heavy. You will need a friend's help - assembled servers can be very heavy, and it is quite difficult to lift them onto the rails that support them in the rack.

    Install an Ethernet switch next to the rack. It's worth configuring the switch right away: set the jumbo frame size to 9000 bytes, set the static IP address you chose in step 1, and turn off unnecessary protocols such as SMTP.

    Install a power distribution unit (PDU, or Power Distribution Unit). Depending on how much current the nodes draw at maximum load, you may need a 220-volt supply for a high-performance computer.

  • When everything is set, proceed to the configuration. Linux is in fact the go-to system for high-performance (HPC) clusters - not only is it ideal for scientific computing, but you also don't have to pay to install a system on hundreds or even thousands of nodes. Imagine how much it would cost to install Windows on all nodes!

    • Start by installing the latest motherboard BIOS and vendor software, which should be the same for all servers.
    • Install your preferred Linux distribution on all nodes and the GUI distribution on the master node. Popular systems: CentOS, OpenSuse, Scientific Linux, RedHat and SLES.
    • The author highly recommends using Rocks Cluster Distribution. In addition to installing all the necessary software and tools for the cluster, Rocks provides an excellent method for quickly "porting" multiple copies of the system to similar servers using PXE boot and Red Hat's "Kick Start" procedure.
  • Install the message passing interface, resource manager, and other required libraries. If you didn't install Rocks in the previous step, you'll have to manually install the required software to set up the parallel computing logic.

    • To get started, you'll need a portable batch system (PBS), such as the Torque Resource Manager, which allows you to split and distribute tasks across multiple machines.
    • Add Maui Cluster Scheduler to Torque to complete the installation.
    • Next, you need to set up a message passing interface, which is necessary for the individual processes on each individual node to share data. Open MPI is the simplest option.
    • Don't forget the multi-threaded math libraries and compilers that will build your programs for distributed computing. Did I already mention that you should just install Rocks?
  • Connect the computers to the network. The master node sends computation tasks to the slave nodes, which in turn must return the results and also send messages to each other. And the faster this happens, the better.

    • Use a private Ethernet network to connect all nodes in a cluster.
    • The master node can also act as an NFS, PXE, DHCP, TFTP and NTP server when connected to Ethernet.
    • You must separate this network from the public network to ensure that cluster traffic is not interfered with by other packets on the LAN.
  • Test the cluster. The last thing to do before giving users access to the computing power is a performance test. The HPL (High Performance Linpack) benchmark is a popular option for measuring the computing speed of a cluster. You need to compile it from source with the highest degree of optimization your compiler allows for the architecture you have chosen (a small job-submission sketch follows after this list).

    • You must, of course, compile with all possible optimization settings available for the platform you have chosen. For example, if using AMD CPUs, compile with Open64 and the highest -O optimization level available.
    • Check your results against TOP500.org to see how your cluster stacks up against the 500 fastest supercomputers in the world!
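Once the scheduler and MPI stack are in place, running a job across the nodes looks roughly like the sketch below. It assumes Torque's qsub and Open MPI's mpirun are installed and that my_benchmark is your compiled HPL (or other MPI) binary; adjust node and core counts to your hardware.

# submit an MPI job to 4 nodes with 4 cores each via the Torque batch system
echo "mpirun -np 16 /home/user/my_benchmark" | qsub -l nodes=4:ppn=4 -N hpl_test
# watch the queue until the job finishes
qstat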
