Reading the CB Qualifier Storage Drive Utility Results
Article ID: 377473
Products
Carbon Black EDR (formerly Cb Response)
Issue/Introduction
How to read the results from the CB Qualifier tool output.
Environment
Carbon Black EDR Server: All Versions
CB Qualifier
Resolution
Overview of the Tests
Ingress
"Test for level ingress load"
Ingress is the basic test: can the storage drive handle the incoming event load and searching?
If this test fails in any of these areas, the storage drives will need to be upgraded. This is considered an absolute failure.
Optimize_Seperate
"Test to see if optimize can run on a disk without ingress load"
Can the storage drive handle the re-indexing of data when optimizing Solr cores into fewer segments?
Cores are optimized at midnight after each rollover. Optimization re-indexes the data into fewer segments. Think of this like disk defragmentation: reducing the number of segments improves search performance.
If this test fails, the storage drive should be upgraded.
Slow optimization will result in performance issues, impacting ingestion, searching and alerting.
Optimize_Same
"Test to see if optimize can run on a disk with ingress load"
This is a combination of both ingress and optimization at the same time. Can the storage drive handle ingesting, reading and optimizing at the same time?
If this test fails, the environment needs to be taken into account.
If the majority of sensors are online 24/7, the storage drives will need to be upgraded.
If the majority of sensors and users are online only during working hours, this may be safe to ignore as it is less likely ingress, searching and optimization will be at full capacity at the same time. Consider this acceptable if the threshold is not far off.
Reading Results
Focus on the following sections:

| Test | Meaning |
| --- | --- |
| read_bandwidth | Total number of bytes transferred / total time between the first request for service and completion of the last transfer. |
| read_iops | Input/output operations per second. An SSD has much higher IOPS than a spinning disk. IOPS = 1000 ms / (average seek time + average latency). Average latency is how long the disk takes to spin halfway around: ((60 / <RPM>) / 2) * 1000 ms for a spinning disk, about 0.1 ms for an SSD. Average seek is how long the head takes to reach the area of the disk where the data is stored, averaged over read seeks for read tests or write seeks for write tests. |
| write_bandwidth | Total number of bytes transferred / total time between the first request for service and completion of the last transfer. |
| write_iops | Input/output operations per second. An SSD has much higher IOPS than a spinning disk. IOPS = 1000 ms / (average seek time + average latency). Average latency is how long the disk takes to spin halfway around: ((60 / <RPM>) / 2) * 1000 ms for a spinning disk, about 0.1 ms for an SSD. Average seek is how long the head takes to reach the area of the disk where the data is stored, averaged over read seeks for read tests or write seeks for write tests. |
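
The bandwidth and IOPS definitions above can be worked through numerically. The following is a minimal sketch, not part of the CB Qualifier tool. It assumes the placeholder in the latency expression is the drive's rotational speed in RPM, and the function names and the sample 7200 RPM / 9 ms seek values are illustrative only.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Time for the platter to spin halfway around, in milliseconds.

    Assumes the latency expression is ((60 / rpm) / 2) * 1000.
    """
    return (60.0 / rpm) / 2.0 * 1000.0


def estimated_iops(avg_seek_ms: float, avg_latency_ms: float) -> float:
    """IOPS = 1000 ms / (average seek time + average latency)."""
    return 1000.0 / (avg_seek_ms + avg_latency_ms)


def bandwidth_bytes_per_sec(total_bytes: int, elapsed_seconds: float) -> float:
    """Total bytes transferred / total time from first request to last transfer."""
    return total_bytes / elapsed_seconds


if __name__ == "__main__":
    # Illustrative spinning disk: 7200 RPM with an assumed 9 ms average seek.
    latency_ms = average_rotational_latency_ms(7200)
    print(f"average latency: {latency_ms:.2f} ms")
    print(f"estimated IOPS:  {estimated_iops(9.0, latency_ms):.0f}")
```

With a latency of about 0.1 ms and a negligible seek time, the same formula gives an SSD thousands of IOPS, which is why SSD results are so much higher than those of a spinning disk.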
Each section contains the following results:

| Field | Meaning |
| --- | --- |
| Pass | True or False: whether this particular test passed. |
| Result | The actual performance of the disk on this specific test. |
| Threshold | The expected passing threshold based on the configurations set by EDR engineering. |
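
To judge how close a result is to its threshold, the two values can be compared as a percentage. This is a minimal sketch under two assumptions: that a higher result is better for both IOPS and bandwidth, and that a metric can be represented as a dictionary with pass, result and threshold keys. The dictionary layout and the numbers are hypothetical and are not taken from actual tool output.

```python
def headroom_percent(result: float, threshold: float) -> float:
    """Percentage by which the result exceeds (positive) or misses
    (negative) the threshold, assuming higher results are better."""
    return (result - threshold) * 100.0 / threshold


if __name__ == "__main__":
    # Hypothetical metric entry with made-up values.
    entry = {"pass": True, "result": 5500.0, "threshold": 5000.0}
    margin = headroom_percent(entry["result"], entry["threshold"])
    print(f"pass={entry['pass']} headroom={margin:.1f}%")
```

A small positive or negative percentage indicates a borderline result that deserves a closer look before it is accepted or rejected.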
Evaluating the Results
If read/write iops and read/write bandwidth pass on each test, the storage drive will be able to handle the load.
Pre-data tests are based on a best estimate of normal event noise from sensors. In environments where endpoints are noisier, a passing result close to the threshold may not hold once live data comes in.
If only tests on optimize_same fail, the storage device can still be acceptable if sensors and users are not online 24/7. Optimization runs at midnight on each night that a new core is rolled over.
A second volume can be mounted as /var/cb/data/solr/cbevents2 so that the newly rolled-over core optimizes on a separate partition from the new writer core. See "Supporting Multiple Volumes for Event Data".
If any tests from Ingress or Optimize fail, the storage device is not fast enough to handle the load.
If there is a single failed test, yet the result is very close to the threshold, the storage may be adequate to handle the load but will need to be upgraded if performance issues occur.
Read the "run_config: fail_message" for more detail.
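
The evaluation rules above can be summarized as a small decision function. This is a sketch only: the nested dictionary layout, the lowercase test keys, the sensors_online_24x7 flag and the verdict strings are assumptions made for illustration, and the tool's real output format may differ.

```python
from typing import Dict, List

UPGRADE = "upgrade storage"
CONDITIONAL = "acceptable if close to threshold; consider a second volume"
OK = "storage can handle the load"

# Hypothetical shape: {test_name: {metric_name: {"pass": bool}}}
Results = Dict[str, Dict[str, Dict[str, bool]]]


def failed_metrics(test: Dict[str, Dict[str, bool]]) -> List[str]:
    """Names of the metrics in one test whose pass field is False."""
    return [name for name, metric in test.items() if not metric["pass"]]


def verdict(results: Results, sensors_online_24x7: bool) -> str:
    """Apply the evaluation rules to a set of test results."""
    # A failure in ingress or in optimization without ingress load
    # means the drive is not fast enough.
    if failed_metrics(results["ingress"]) or failed_metrics(results["optimize_seperate"]):
        return UPGRADE
    # optimize_same failures depend on whether load is constant.
    if failed_metrics(results["optimize_same"]):
        return UPGRADE if sensors_online_24x7 else CONDITIONAL
    return OK


if __name__ == "__main__":
    passing = {m: {"pass": True} for m in
               ("read_bandwidth", "read_iops", "write_bandwidth", "write_iops")}
    sample = {
        "ingress": passing,
        "optimize_seperate": passing,
        "optimize_same": {**passing, "write_iops": {"pass": False}},
    }
    print(verdict(sample, sensors_online_24x7=False))
```

The function reads only the pass flags. Whether a failed result is close enough to its threshold to accept remains a judgment call for the administrator.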
Additional Information
The run_config section contains both fail_message and pass_message. This section only lists the configuration settings; to determine whether a test passed, check the pass: field in each test.