About choosing the number of query engines to install


Article ID: 179571


Updated On:

Products

Control Compliance Suite Windows

Issue/Introduction

 

Resolution


The number of query engines that you install for use with bv-Control for Windows depends on the amount of information that you collect. The amount of information in turn depends on the number of targets, the frequency of the queries, and the scope of the queries.

No single deployment strategy can apply to every situation and budget. You can follow some general guidelines for how many query engines should be installed and where they should be located. In specific scenarios, the administrators should consider customizing the deployment of query engines and agents.

You must consider certain factors while determining the placement, quantity, and configuration of query engines.

The most important factors to be considered before you deploy query engines in a particular environment are as follows:

  • Type and quantity of queries

  • Geographic locations

  • Performance expectations

Directory-based queries do not need to take advantage of the distributed architecture because the Master Query Engine handles these queries.

The following describes the load class of a typical machine query:

Light

OS version and configuration information, local user and group information, service information, specific registry keys or values with appropriate scopes, narrowly scoped file information, and volume information.

Moderate

Registry searches, file searches within a moderate scope, log file searches through a small log file or small span of event log time.

Heavy

Full file system searches for specific files, file ownership, disk space analysis by user or group. Log file searches through large files or large amounts of time, and file system DACL searches.

Specialized, potentially extra heavy

Patch assessment and Effective permission.


Geographic locations refer to the relationship between the query engine and the target computer.

The geographic locations are defined as follows:

Local

Target and agent on the same campus with a 10 MB/s or faster network connection between them.

Regional

High-speed connection between remote sites that may be heavily loaded or that has moderate to high latency.

Remote

Low-speed connection, high latency, or both between remote locations, such as satellite links.


When the load class is light and more than twenty targets sit across a distant link, a query engine should be placed at each remote location. If the load class is moderate or heavier, a remote query engine is recommended. This strategy lets the remote location perform as if it were local.

In regional installations, conditions may dictate at least one query engine in the regional location.

You may need a query engine in the regional location if a large number of targets are in the regional location. A large number of targets causes an increase in the Data Collection Agent (DCA) count on a corporate-based query engine. In turn, the large count stresses the network link. The large number of targets can degrade query performance and impact other remote communications.

You may also need a query engine in the regional location even if the location has a small number of targets. If each target returns large volumes of information from heavy load class queries, a dedicated query engine is needed. By placing a query engine at the remote location, the majority of the communication is local between the query engine and the target computers.
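The remote-location guidance above can be sketched as a simple rule. This is only an illustration, not part of the product: the `LoadClass` enum and the `needs_remote_query_engine` function are hypothetical names. The sketch assumes that a moderate or heavier load class calls for a remote query engine regardless of the target count, which is one reading of the guidance. Regional locations are left out because the article gives no numeric threshold for them.

```python
from enum import IntEnum


class LoadClass(IntEnum):
    """Load classes from the article, ordered from lightest to heaviest."""
    LIGHT = 1
    MODERATE = 2
    HEAVY = 3
    SPECIALIZED = 4


def needs_remote_query_engine(load_class: LoadClass, targets_across_link: int) -> bool:
    """Return True when the guidance suggests a query engine at a remote site.

    Assumption: moderate or heavier load classes always warrant a remote
    query engine; light queries warrant one only above twenty targets.
    """
    if targets_across_link <= 0:
        return False  # nothing to query at this site
    if load_class >= LoadClass.MODERATE:
        return True
    return targets_across_link > 20
```

For example, `needs_remote_query_engine(LoadClass.LIGHT, 25)` returns `True`, while the same call with 15 targets returns `False`.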

Based on the placement guidelines, the next factor to consider is the ratio of targets to agents. For these scenarios, an agent is a single DCA.

The default query engine is set to 6 concurrent agents. With that default, the following target-to-agent ratios apply to each load class:

Light Load Class Queries

The ratio of targets to agents can be high, 100-plus. This ratio translates to 600-plus targets for one query engine in a default installation.

Moderate Load Class Queries

The ratio should be restricted to between 20 and 60. This ratio translates to 120 to 360 targets per query engine.

Heavy Load Class Queries

The ratio should be less than 5. The lower, the better. For a default installation, this ratio translates to no more than 30 targets per query engine. This ratio may not provide adequate performance on all platforms. If performance is not adequate, adjust downward accordingly.

Specialized Load Class Queries

Patch Assessment queries are multithreaded with 16 threads per agent. The default agent count of 6 times 16 threads translates into 96 target computers assessed concurrently. A rough estimate for complete patch assessment with a default query engine is 5 minutes per round of 96 target computers. For adequate performance, this translates into a ratio of 100 targets per agent or 600 targets per query engine.
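The arithmetic behind these ratios can be sketched as follows. The function and constant names are hypothetical and serve only as a planning aid. The sketch uses the upper end of each stated range (60 for moderate, 5 for heavy), so treat the results as ceilings rather than recommendations. It also assumes that the 5-minute round estimate, which the article gives for a default query engine, still holds when the agent count changes.

```python
import math

DEFAULT_AGENTS = 6  # default number of agents (DCAs) per query engine

# Planning ratios of targets per agent, taken from the guidance above.
# Moderate uses the top of the 20 to 60 range; heavy uses 5 as a ceiling.
PLANNING_TARGETS_PER_AGENT = {
    "light": 100,
    "moderate": 60,
    "heavy": 5,
    "specialized": 100,
}

PATCH_THREADS_PER_AGENT = 16  # Patch Assessment threads per agent
MINUTES_PER_PATCH_ROUND = 5   # rough estimate per round of concurrent targets


def targets_per_engine(load_class: str, agents: int = DEFAULT_AGENTS) -> int:
    """Targets one query engine can serve for the given load class."""
    return PLANNING_TARGETS_PER_AGENT[load_class] * agents


def engines_needed(total_targets: int, load_class: str,
                   agents: int = DEFAULT_AGENTS) -> int:
    """Minimum number of query engines for the given number of targets."""
    return math.ceil(total_targets / targets_per_engine(load_class, agents))


def patch_assessment_minutes(total_targets: int,
                             agents: int = DEFAULT_AGENTS) -> int:
    """Rough duration of a full patch assessment on one query engine."""
    concurrent_targets = agents * PATCH_THREADS_PER_AGENT
    rounds = math.ceil(total_targets / concurrent_targets)
    return rounds * MINUTES_PER_PATCH_ROUND
```

With the defaults, `targets_per_engine("heavy")` gives 30 and `engines_needed(1000, "moderate")` gives 3. Actual capacity depends on the platform, so adjust downward if performance is not adequate.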


The default configuration of 6 agents per query engine balances the impact between the host computer and performance.

On dedicated query engines, this number can be raised to increase performance, with the following considerations:

  • If there are no distribution rules in place on the Master Query Engine, all query engines in a domain are given equal work. A higher agent count on one query engine may allow that query engine to complete its work faster. The overall performance of the query remains constant. Use the View Distribution Rules Results option in bv-Config to determine the number of targets that are assigned to each query engine. You can then adjust the agent count accordingly.

  • For all load class queries except effective permissions, the query engine is memory bound. The CPU and network performance should not be compromised. If the agent count is increased to the point that memory swaps occur, a performance decrease is observed instead of a performance increase. Use a rough estimate of 20 MB of RAM for each configured DCA except for the Specialized load class of queries. Suppose a query engine handles Light load class queries and the agent count is increased to 60. In this case, the system should have at least 1.5 GB of RAM.

  • For Specialized load class queries, the Patch Assessment queries consume more memory than other load classes. Estimate 30 MB of RAM for each agent for these queries.

  • For Effective Permissions reporting, the load that is placed on the agent is both CPU and memory intensive. If these reports are run in environments with tens of thousands of users, allow an additional 10 MB of RAM per agent per 10,000 users. For CPU load, these queries take advantage of multiple CPUs. Do not try to burden a query engine with more than 4 to 6 agents or even fewer, depending on the Analysis options.

  • For Password Analysis queries, the load that is placed on the agent is primarily CPU intensive. Password Analysis queries that use a domain as the scope are run on only a single processor. The number of processors in the Master Query Engine has no effect on the time the query requires to complete.
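The memory estimates in the considerations above can be combined into a small calculation. This is a sketch under stated assumptions: `agent_memory_mb` is a hypothetical helper, the Effective Permissions allowance is scaled proportionally to the user count, and the result covers only the agents themselves. The article's figure of at least 1.5 GB for 60 agents is higher than the 1,200 MB this calculation yields, presumably to leave headroom for the rest of the system.

```python
MB_PER_AGENT = 20                # rough estimate for most load classes
MB_PER_PATCH_AGENT = 30          # estimate for Patch Assessment queries
EXTRA_MB_PER_10K_USERS = 10      # Effective Permissions allowance per agent


def agent_memory_mb(agents: int,
                    base_mb_per_agent: float = MB_PER_AGENT,
                    users: int = 0) -> float:
    """Estimate RAM in MB consumed by the agents on one query engine.

    `users` applies only to Effective Permissions reporting. Scaling the
    allowance proportionally to the user count is an assumption.
    """
    extra_per_agent = EXTRA_MB_PER_10K_USERS * users / 10_000
    return agents * (base_mb_per_agent + extra_per_agent)
```

For instance, `agent_memory_mb(60)` estimates the agent memory for the 60-agent example above, and `agent_memory_mb(6, base_mb_per_agent=MB_PER_PATCH_AGENT)` estimates a default query engine that runs Patch Assessment queries.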

Administrators can reconfigure the number of agents a query engine uses from a minimum of one to a maximum of sixty.

The agent count can be adjusted to accommodate specific environmental needs or preferences, including the following:

  • Preference for lower number of query engine installations

  • Availability of dedicated computers or high-powered computers

  • Use of low-powered computers

Higher numbers of agents on a query engine increase its resource usage in terms of memory, CPU cycles, hard disk space, and network traffic. Administrators who have the option of using dedicated servers for query engine deployment can increase the number of agents per query engine. Administrators who have high-powered servers that can host the query engines can also increase the number of agents per query engine. By increasing the number of agents per query engine, administrators can reduce the number of SQEs that they must install and maintain. However, a larger number of agents per query engine is not always a solution. Some special scenarios require you to deploy additional query engines.

If administrators must use less powerful computers to host SQEs, they can reduce the number of agents per SQE and install more SQEs. Conversely, running fewer SQEs may affect the fault tolerance of the system.

Active Directory and Domain queries are handled exclusively by agents from the MQE. Queries on local users and groups are treated as machine queries, as are machine and IP queries. User and group caches are not enabled by default. For domains with more than 5,000 users, administrators can turn on user caching to improve the performance of user queries. Use of user and group caches lets the MQE maintain a cache of some user and group information. This information is updated periodically at the intervals that the administrator defines. When the cache option is enabled, all queries for information that is found in the cache are processed from the cache.

Windows computers that are not part of a domain can be queried by installing an MQE. The MQE should have its SQE configured for a single agent on each computer that is not part of a domain. Queries against these computers must use the local MQE. The Local System account is used for stand-alone and workgroup installations, and a service account is not required. These computers can be grouped in a query by using a scope file with the computers listed.

The ports that are used for default communications between bv-Control for Windows components are typically closed in firewall installations. To assist deployment in networks that firewalls protect, the components can be configured to communicate through firewalls by using ports that are specified during installation or post-installation. The ECS, MQE, and SQE can be configured to use a specified port number. The use of specific port numbers lets the Information Server component be configured to communicate with the ECS and MQE using the specified ports. MQEs can be configured to communicate with the ECS using the specific port. Also, bv-Config can be configured to communicate with the ECS using the specific port. Some communications cannot operate through a firewall: Console component to Information Server component, MQE to support service, and agent to target computer.

Query engines are relatively easy to add to or remove from your deployment. You should feel free to experiment to determine the number of query engines that your deployment requires.