If the primary hub cannot start and you cannot connect to it at all, even though the services appear to be up and running, the cause may be missing probe definitions in the controller.cfg file in the $NIMROOT/robot directory.
This can happen if, for example, the server crashed while controller.cfg was still locked by the Nimsoft Robot Watcher process, leaving the file corrupted or truncated. On reboot, the Robot Watcher service recreates the file, but it then contains only the controller probe entry. As a result, none of the other core probes start, and the hub shows as not started.
Symptoms include (but are not limited to):
On examining the controller.cfg file, you will find that one or more probes are not listed as expected. The file should contain one entry for each probe installed on the robot, but in this case it may contain only 1-3 probes, and the entry for <hub> will be missing along with most other probes.
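On Linux, a quick way to see which entries survived is to list the section tags in the file. The following is a minimal sketch, assuming each probe entry opens with a line such as <hub> and closes with </hub>; the function names list_cfg_sections and has_cfg_section are hypothetical helpers, not part of the product. Nested sections, if any, are listed as well.

```shell
#!/bin/bash
# Print the name of every opening section tag in a controller.cfg-style file.
# Assumption: a section opens with a line that contains only "<name>".
list_cfg_sections() {
    local cfg="$1"
    # Keep lines that are only an opening tag, then strip the angle brackets.
    grep -E '^[[:space:]]*<[^/][^>]*>[[:space:]]*$' "$cfg" |
        sed -E 's/^[[:space:]]*<([^>]*)>[[:space:]]*$/\1/'
}

# Return success if the named section (for example "hub") is present.
has_cfg_section() {
    local cfg="$1" name="$2"
    list_cfg_sections "$cfg" | grep -qx -- "$name"
}
```

For example, has_cfg_section /opt/nimsoft/robot/controller.cfg hub || echo "hub entry missing" reports the condition described above (adjust the path if your installation differs).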
Replace the corrupted controller.cfg with a copy from any other hub. This should be sufficient to get the hub started even if the probes are not 100% identical.
(In the zip file attached to this document you will find a sample_controller.cfg which may be used as the bare minimum to get the hub started - you may use this if you do not have a backup copy. The file contains a sample for a Primary Hub and non-Primary hub for both Linux and Windows.)
If you have used a backup copy, go through it and remove every line that starts with 'magic_key' (remove the entire line, not just the value). Most probes have a magic_key entry associated with them, and all of these lines must be removed from the controller.cfg file.
After you remove the magic_key entries, save the file. Then move (do not copy) controller.cfg to the $NIMROOT\robot\changes folder.
On Linux, an easy way to accomplish this entire step with a single command is as follows:
grep -v magic_key /opt/nimsoft/robot/controller.cfg > /opt/nimsoft/robot/changes/controller.cfg
After you execute this command, delete the controller.cfg from /opt/nimsoft/robot.
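The two Linux commands above can be wrapped in one function that also keeps a backup and refuses to delete the original if the filtered file comes out empty. This is only a sketch under stated assumptions: the function name stage_controller_cfg, the backup directory argument (default /tmp), and the backup file naming are hypothetical and not part of the product.

```shell
#!/bin/bash
# Stage a controller.cfg without magic_key lines into the robot's changes folder.
# $1: robot directory (for example /opt/nimsoft/robot)
# $2: backup directory (hypothetical convenience, defaults to /tmp)
stage_controller_cfg() {
    local robot_dir="$1"
    local backup_dir="${2:-/tmp}"
    local src="$robot_dir/controller.cfg"
    local dest="$robot_dir/changes/controller.cfg"

    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    [ -d "$robot_dir/changes" ] || { echo "missing $robot_dir/changes" >&2; return 1; }

    # Keep a timestamped copy of the file before touching it.
    cp -p "$src" "$backup_dir/controller.cfg.$(date +%Y%m%d%H%M%S)" || return 1

    # Write every line except those containing magic_key.
    grep -v magic_key "$src" > "$dest"

    # If nothing survived the filter, discard the staged file and keep the original.
    if [ ! -s "$dest" ]; then
        rm -f "$dest"
        echo "staged file would be empty, original left in place" >&2
        return 1
    fi

    # Complete the "move" by removing the original.
    rm -f "$src"
}
```

A typical call would be stage_controller_cfg /opt/nimsoft/robot /root, after which the robot watcher service is restarted as described in the next step.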
On Windows, you could use a text editor like Notepad++ with advanced search-and-replace features to delete all the magic_key entries/lines.
Restart the robot watcher service.
Log in to the secondary hub using Infrastructure Manager.
On the primary hub, only the controller will be up, and because it is running as a robot only, it will automatically attach to the nearest/secondary hub. If it does not show up and attach to the secondary hub, use the Connect Robot tool in Infrastructure Manager to attach the primary hub robot to the secondary hub.
Once the primary hub robot is attached to the secondary hub, validate the hub probe by right-clicking it and selecting Security->Validate. After the hub probe becomes active, it will detach from the secondary hub and take over the hub role. Validate the other probes in the same way as you did the hub probe.
Launch a command prompt and navigate to the $NIMROOT\hub folder and execute this command: hub.exe -d3 -lstdout
Leave this running in the command window and do not close the command prompt (this launches the hub so that you can log in).
Launch Infrastructure Manager and log in to the hub.
Right click on the hub probe, choose "Security" then "Validate".
Log in to Infrastructure Manager.
Validate any probes on the hub that show a red lock icon.
Go into the folder $NIMROOT\probes, where you will see several subfolders (e.g. application, system, slm, etc).
Go into each subfolder and, one by one, look at the folder names, which represent the probes that are already installed on the system.
For each probe which you find under each subfolder, make sure the corresponding probe is displayed in Infrastructure Manager.
If a probe is missing, locate that probe in the Archive, and re-deploy it to the hub. The existing configuration will be preserved.
If there are any probes which show up in Infrastructure Manager but do NOT have a corresponding folder, you can right-click and delete them in IM.
The following PowerShell script can be used on Windows to list all the probes which are already installed under the probes folder. Compare its output with the probes displayed in Infrastructure Manager to identify the ones which need to be re-deployed.
# Change the line below if your installation path is different
$baseDir = "C:\Program Files (x86)\Nimsoft\probes"
# Collect the names of the probe folders one level below each category folder
$subDirs = Get-ChildItem -Path $baseDir -Directory | ForEach-Object {
    Get-ChildItem -Path $_.FullName -Directory | ForEach-Object {
        $_.Name
    }
}
# Print one probe name per line
$subDirs | ForEach-Object { Write-Output $_ }
The following is a Linux/bash script which will accomplish the same thing for a Linux hub installation:
#!/bin/bash
# Change the line below if your installation path is different
base_dir="/opt/nimsoft/probes/"
# Find the probe directories, which sit two levels below base_dir
find "$base_dir" -mindepth 2 -maxdepth 2 -type d | while IFS= read -r dir; do
    # Print only the last path component (the probe folder name)
    basename "$dir"
done
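Building on the Linux script above, the folder list can also be checked against the entries in controller.cfg, as a rough first pass before comparing with Infrastructure Manager. This sketch assumes that a probe's folder name is the same as its section name in controller.cfg, which may not hold for every probe; the function name probes_missing_from_cfg is hypothetical.

```shell
#!/bin/bash
# Print probe folder names that have no matching "<name>" line in controller.cfg.
# Assumption: the folder name equals the probe's section name in controller.cfg.
# $1: probes directory (for example /opt/nimsoft/probes)
# $2: path to controller.cfg
probes_missing_from_cfg() {
    local probes_dir="$1" cfg="$2"
    local dir name
    find "$probes_dir" -mindepth 2 -maxdepth 2 -type d | while IFS= read -r dir; do
        name=$(basename "$dir")
        # Strip whitespace from each line, then look for an exact "<name>" line.
        if ! tr -d ' \t\r' < "$cfg" | grep -Fxq -- "<$name>"; then
            printf '%s\n' "$name"
        fi
    done | sort
}
```

Any name it prints is a candidate for re-deployment from the Archive, subject to confirming in Infrastructure Manager that the probe is really missing.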