Best Practices for Data Repository database storage configuration

Products

CA Performance Management Network Observability

Issue/Introduction

Our implementation of DX NetOps Performance Management has a Data Repository with a three-node cluster. The storage team moved the storage platform used for the Data Repository data and catalog directories from Symmetrix and Clariion to PURE, using a Pure FlashArray. The model in this situation is FA-X70R2.

Reviewing the I/O numbers from the vioperf script before and after the move, we can see that I/O performance after the move is well below what it was before.

The storage team, when engaged, has questions about how to configure the PURE Storage Platform to meet the Vertica I/O requirements.

Is there any documentation or best practices in setting up storage for Vertica?

I’m not familiar with the I/O configuration for Vertica, so if anyone knows the block size and maybe the maximum I/O size it uses, that may help us too. According to the PURE best practice documentation there are a few things that could be changed. Would these changes help?

- The I/O scheduler is currently set to “deadline”, which means the system splits the I/O into read and write queues. PURE recommends “noop”, which effectively turns off the scheduler.
- The max_sectors_kb variable is set to 512 on all disks. PURE can handle up to 4 MB, and their suggestion is to set this to 4096 on all disks.
- Several things within the MPIO configuration, including setting the path_selector variable to "queue-length 0" and the path_grouping_policy variable to “group_by_prio”, among others.

Environment

All supported Performance Management releases

Cause

Changing the storage engine behind the Data Repository database data and catalog directories.

Resolution

The following documents the various changes and recommendations to improve I/O in this storage change scenario using PURE storage. One or more of these changes might be needed to resolve similar I/O problems when using the PURE storage solution.

Is there a way to compare numbers before and after the storage change?

One option to see what changes are needed, or what differences there are before and after the storage change, is the dr_validate script and its logs.
- Run the script after the storage change.
- Compare its logs for the current configuration with the logs from the latest upgrade/install cycle. What is different?
- Do things improve if the new systems are configured to match the old?
- The logs are found in (default path) /opt/CA/IMDataRepository_vertica<version>/logs.
The new installation documentation pages for the Data Repository give a good overview of what dr_validate checks and changes. It is found here:
- https://techdocs.broadcom.com/content/broadcom/techdocs/us/en/ca-enterprise-software/it-operations-management/performance-management/3-7/installing/install-the-data-repository.html
If vioperf data is available from before the storage change, run it again after the change. Compare the before and after results. Is it better or worse?

What other configurations might be made or recommended to help improve I/O?

Compared with fast NVMe drives, the FlashArray configuration is slower.
Ensure the scheduler settings and other configs like readahead and max_sectors_kb are correct on all nodes.
- Vertica suggests that noop is a better choice than deadline in this situation.
- The Vertica documentation states that both are acceptable:
  - "This topic details how to change I/O Scheduling to a supported scheduler. Vertica requires that I/O Scheduling be set to deadline or noop."
  - https://www.vertica.com/docs/9.1.x/HTML/index.htm#Authoring/InstallationGuide/BeforeYouInstall/IOScheduling.htm
- In this scenario, where the scheduler is set to deadline and performance is poor, changing the scheduler to noop improves I/O.
- Additional information regarding the noop setting:
  - "The NOOP scheduler uses a simple FIFO approach, placing all input and output requests into a single queue. This scheduler is best used on solid state drives (SSDs). Because SSDs do not have a physical read head, no performance penalty exists when accessing non-adjacent sectors."
- The following command, run as root, should set the disk(s) involved to use noop.
  - echo noop > /sys/block/<DEVICENAME>/queue/scheduler
- It may require a reboot to implement the change.
- Further information for RedHat Linux can be found here:
  - https://access.redhat.com/solutions/109223
- It is also worth checking /etc/rc.local for lines that reference deadline for the same <DEVICENAME> used in the echo command.
  - Update any found to noop for the same <DEVICENAME>.
- It is very important that all cluster nodes are configured with the same CPU core count and the same amount of RAM.
- In this scenario one node was configured with 32 CPU cores via hyperthreading (HT).
  - The other two nodes had HT disabled and were using 16 CPU cores.
  - The 32-core node provided better I/O than the 16-core nodes.
- If HT is to be enabled on all nodes, Engineering recommends first confirming that vioperf shows good numbers when running with 32 cores versus 16.
- Whether running at 32 cores via HT works well depends on the overall system memory and on the vioperf results with 32 threads writing and reading at the same time.
  - Memory is key: each query needs a good initial allotment of memory to run.
  - The higher the CPU count, the smaller that allotment is when plannedconcurrency is set to AUTO.
  - For example, if RIB is 10% with plannedconcurrency of AUTO and 64 GB RAM, 6.4 GB is reserved for RIB. Divide that by plannedconcurrency:
    - 6.4 GB / 32 CPUs = 200 MB/query
    - 6.4 GB / 16 CPUs = 400 MB/query
  - Vertica may use a number lower than 32 depending on overall memory, so this example is the worst-case scenario.
- Ensure the boot device is properly configured for multipath. Below are sample parameters for the multipath configuration.

defaults {
    user_friendly_names yes
    polling_interval 10
    find_multipaths yes
}

devices {
    device {
        vendor "PURE"
        path_selector "queue-length 0"
        path_grouping_policy group_by_prio
        path_checker tur
        fast_io_fail_tmo 10
        dev_loss_tmo 60
        no_path_retry 0
        hardware_handler "1 alua"
        prio alua
        failback immediate
    }
}
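
The per-query memory arithmetic from the plannedconcurrency example above can be reproduced with a small shell function. This is only a sketch: the function name per_query_mb and its parameters are hypothetical, it assumes the pool memory is divided evenly by the concurrency value, and it uses 1 GB = 1000 MB to match the rounded figures in the example.

```shell
# Hypothetical helper: approximate memory per query, in MB, when a pool's
# memory is split evenly across the planned concurrency value.
per_query_mb() {
    total_ram_gb="$1"   # total system RAM in GB
    pool_pct="$2"       # pool size as a percentage of RAM
    concurrency="$3"    # planned concurrency (worst case: the CPU core count)
    # Integer arithmetic; 1 GB is treated as 1000 MB to match the text.
    echo $(( total_ram_gb * 1000 * pool_pct / 100 / concurrency ))
}
```

Calling `per_query_mb 64 10 32` and `per_query_mb 64 10 16` gives the 200 MB and 400 MB figures from the example.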

What I/O size does vioperf use to perform its tests?

Vertica asks for a readahead of 2048 and a block size of 4096 to operate properly. vioperf would be using 4096 as the value for its performance checks.
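
To confirm that the scheduler, max_sectors_kb and readahead settings match on every node, the current values can be read from sysfs. The function below is a minimal sketch, not a supported tool: the name show_io_settings is hypothetical, and the optional second argument exists only so the function can be pointed at a different sysfs root. Note that read_ahead_kb is reported in KB, while the readahead of 2048 is normally expressed in 512-byte sectors (as with blockdev --getra), so a readahead of 2048 should appear here as 1024.

```shell
# Hypothetical helper: print the I/O settings of one block device.
# $1 = device name (for example the <DEVICENAME> used in the echo command)
# $2 = sysfs root, defaults to /sys
show_io_settings() {
    dev="$1"
    root="${2:-/sys}"
    q="$root/block/$dev/queue"
    if [ ! -d "$q" ]; then
        echo "no such device: $dev" >&2
        return 1
    fi
    # The active scheduler is shown in [brackets], e.g. "noop [deadline] cfq".
    sched=$(sed 's/.*\[\(.*\)\].*/\1/' "$q/scheduler")
    echo "device=$dev scheduler=$sched max_sectors_kb=$(cat "$q/max_sectors_kb") read_ahead_kb=$(cat "$q/read_ahead_kb")"
}
```

Running the function for the same <DEVICENAME> on each node and comparing the output lines should show quickly whether one node is configured differently from the others.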

Additional Information

If changes are needed, best practice is to:

- Bring down one node at a time, make the changes, and restart that one node.
- Repeat the process for each node, stopping them one at a time.

This allows the changes to be implemented without a DB outage in properly configured cluster installations.

Vertica Hardware Guide: https://www.vertica.com/kb/GenericHWGuide/Content/Hardware/GenericHWGuide.htm