Run hardware diagnostic tests
Most servers ship with a hardware diagnostics CD, although some hardware vendors instead install a hidden utility partition on the hard drive.
Note: If you are not experienced with computers or have any concerns, please contact your hardware vendor.
You can diagnose hardware-related problems on your server by booting from the diagnostic CD or by choosing Diagnostics from the boot device list.
These diagnostic tools allow you to:
- Check the hardware configuration and verify that it is functioning correctly.
- Test individual hardware components.
- Diagnose hardware-related problems.
- Obtain a complete hardware configuration.
If a component failure is detected during testing, note any error codes and contact the hardware vendor.
Note: The diagnostics can detect a hardware fault only if the fault occurs while the test is running, so run them for an extended period.
Check your memory
Note: This process requires downtime on your ESX/ESXi host for up to 48 hours. In most cases, the vendor diagnostic utility described above is sufficient for testing your hardware. Broadcom does not endorse or recommend any particular third-party utility. However, third-party options are available to test your memory.
To test your memory:
- Download the memtest86+ ISO image archive (.gz or .zip).
- Extract the ISO image from the .gz or .zip archive.
- Burn the image to CD.
- Boot your ESX/ESXi host from the CD.
- The memtest goes through each memory bank and checks for errors. Run the tool for several hours, at least until it starts pass 2, to ensure the full suite of tests has been executed.
Note: If memtest86+ does not run on your hardware, contact your vendor for their memory test utility.
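If no archive tool is at hand on the workstation used to prepare the CD, the extraction step above can be sketched in a few lines of Python. This is only an illustration for the .gz case: the function name extract_gz and the file names in the usage note are hypothetical, and the archive is assumed to hold a single compressed ISO file.

```python
import gzip
import shutil
from pathlib import Path


def extract_gz(archive: Path, target: Path) -> Path:
    """Decompress a single-file .gz archive into `target` and return `target`.

    Assumption: the archive contains exactly one compressed file (the ISO).
    """
    # Stream the data in chunks so a large image is never held fully in memory.
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For example, extract_gz(Path("memtest.iso.gz"), Path("memtest.iso")) would write the decompressed image next to the archive; both file names are placeholders. A .zip archive would need the standard zipfile module instead.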
Ensure your server configuration conforms to Non-Uniform Memory Access (NUMA) specifications
Notes:
- If you are not experienced with computers or have any concerns, please contact your hardware vendor.
- Problems related to NUMA usually occur following a RAM upgrade or after an ESX/ESXi Server host installation.
You might see the following error:
The BIOS reports that NUMA node 1 has no memory. This problem is either caused by a bad BIOS or a very unbalanced distribution of memory modules.
NUMA is an architecture in which each processor has its own local memory. Keeping memory local to each processor helps avoid the performance hit that occurs when several processors attempt to address the same memory.
The main requirement is that a similar amount of memory is installed beside each processor. If the amount of memory installed beside each processor is not similar, it is unbalanced and you might experience performance problems.
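The balance requirement can be expressed as a simple check over the amount of memory installed per NUMA node. The sketch below is illustrative only: the function name check_numa_balance, the input format (node number mapped to megabytes) and the 25% tolerance are assumptions made for this example, not values defined by ESX/ESXi or by any hardware vendor.

```python
from __future__ import annotations


def check_numa_balance(node_memory_mb: dict[int, int],
                       tolerance: float = 0.25) -> list[str]:
    """Return warnings for a per-node memory layout; an empty list means balanced.

    node_memory_mb maps a NUMA node number to the memory installed for it (MB).
    tolerance is a hypothetical threshold: the largest allowed relative gap
    between the biggest and the smallest node.
    """
    if not node_memory_mb:
        return ["no NUMA nodes reported"]

    warnings = []
    # A node with no memory at all matches the BIOS error quoted above.
    for node, mb in sorted(node_memory_mb.items()):
        if mb == 0:
            warnings.append(f"NUMA node {node} has no memory")

    largest = max(node_memory_mb.values())
    smallest = min(node_memory_mb.values())
    # Compare the gap between nodes relative to the largest node.
    if largest > 0 and (largest - smallest) / largest > tolerance:
        warnings.append(
            f"memory is unbalanced: smallest node has {smallest} MB, "
            f"largest has {largest} MB"
        )
    return warnings
```

With two nodes of 32768 MB each, the function returns an empty list. With {0: 65536, 1: 0} it reports both that NUMA node 1 has no memory and that the layout is unbalanced.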