Adding indexed pairs or triplets to a database in Reporter
search cancel

Adding indexed pairs or triplets to a database in Reporter

book

Article ID: 165803

calendar_today

Updated On:

Products

Reporter

Issue/Introduction

  • I want to add a few of my own favorite combinations of pairs and triplets.
  • I found two or three reports that run dramatically slower with certain filters; how can I speed up these reports?
  • I never run out of memory or disk space, but some reports run extremely slow even though they eventually complete.

Resolution

The Reporter database is a proprietary, streamlined database designed with pre-defined indexed data created in pairs and triplet formations. Because the database can be quite large depending on your site's requirements and traffic, these indexes were created to optimize report generation for common combinations of data. These pair and triplet indexes are chosen in advance to ensure efficient report generation time for all pre-defined reports. In some cases, you can create your own report that does not match an index and experience a drop in report generation performance. This is a side effect of not using a pre-defined report.

Note: The Reporter database is not designed to create indexes beyond triplets. When requesting a triplet of data in a report, Blue Coat recommends adding a filter to speed up report generation time. But, as is the case with any database, report generation time can vary widely depending on the the size and diversity of your database. In most cases, as long as the hardware is sufficient, the diversity of data is key to report generation time. For example, in a report that asks for user and site information, the question is how many different users have been to how many different sites. This relationship between site and user of course tends to go up with database size.

The default list of Triplets and Pairs:

Note: This provides an example report for each pair and triplet.

Pairs:
  • cs_username  cs_host (Web browsing per site )
  • cs_username  sc_filter_category (Web Browsing per User and category)
  • cs_host sc_filter_category (Top Sites by Page views)
  • c_ip  cs_host (Top sites by page views, filtered on Client IP mask)
  • c_ip  sc_filter_category (Blocked Web sites)
  • cs_username  sc_filter_result (Web Browsing per User, filtered on a Verdict of  eg "Content filter Denied" )
  • cs_host   sc_filter_result (Site: Calculated by Requests filtered on a Verdict of  eg "Content filter Denied")
  • c_ip   sc_filter_result (Client IP: Calculated by Requests filtered on a Verdict of  eg "Content filter Denied" )
  • sc_filter_category cs_uri_scheme (Category: Calculated by Requests, Filtered on Protocol - HTTP, FTP, TCP )
  • sc_filter_category sc_filter_result (Category: Calculated by Requests, filtered on a Verdict of  eg "Content filter Denied" )
  • cs_auth_group  sc_filter_result (Group: Calculated by Requests , filtered on a Verdict of  eg "Content filter Denied" )
  • sc_filter_result cs_uri_scheme (Filtering Verdict by Trend by Day: Filtered on Protocol - HTTP, FTP, TCP)
  • x_virus_id  cs_username (ProxyAV malware Detected: Names, filtered on USER IS <username> )
  • x_virus_id  c_ip (Potential Malware Infected Clients )
  • sc_filter_category cs_auth_group (Category: Calculated by Page Views, filtered on Group is  <groupname>  )
  • cs_username cs_auth_group (Web Browsing per User: Calculated by Page Views, Filtered on Group IS <groupname>)
  • c_ip cs_auth_group (Client Ip: Calculated by Requests,  filtered on Group is  <groupname>  )
  • cs_host cs_uri_scheme (Top sites by page views, filtered on Filtered on Protocol - HTTP, FTP, TCP)
 
Triplets:
  • cs_username cs_host  sc_filter_category (Web Browsing per Category , filtered on Username is <username> and Category is <category name> )
  • c_ip  cs_hosts sc_filter_category (Site: Calculated by Requests, filtered on Client IP Matches Mask <CIDR mask> and Category is <category name> )
  • cs_auth_group cs_host  sc_filter_category (Site: Calculated by Requests, filtered on Site  is <domain name> and Category is <category name> )
  • cs_host  sc_filter_category  sc_filter_result (Top sites: Calculated b y Page Views filtered on Category is <category name> and verdict IS  <verdict name>
  • cs_uri_scheme cs_host  sc_filter_category (Site: Calculated by Requests, filtered on site is <domain name > , and category is <category name> )
  • cs_username cs_host  sc_filter_result (Web Browsing per user,  Calculated by request , filtered  on Site IS <Domain name> and username is <username> )
 Note: Where the report examples above mention a filter, any expression of IS, IS NOT, Matches mask and so on can be used.
 
If a custom report or a pre-existing report with an applied filter runs extremely slow, you might have selected a dataset combination that does not exist in the list above. To determine if this is true, identify the first three fields in your report and attempt to match them with the lists of pairs or triplets above. Assuming they don't match, you must follow the steps below to add the pair or triplet to the database indexes. 
 
Example: If you run the pre-defined report View browsing per user and Blocked web sites with a client-ip filter, they will run extremely slow because the pair and the triplet do not exist. 
They both  are defined as:
  •  username and client_ip
  •  site/client-ip/verdict
The following steps extend the database so they do exist.  Following these steps serve as an example for any pair and/or triplet you want to add to the Reporter database.

NOTE: Reporter provides the ability to customize database files. Follow this procedure in conjunction with this article to add pairs and triplets.

PROCEDURE
1: Create a new database, but make sure no logs are processed.  You can create the database without linking a log source to it.
 
2: Login and navigate to Administration > System Diagnostics. Record the number that represents the database that you created in step 1. You might have to use the side slide bar to move the graphic down so that you can match the database name with the number.
 
3: Stop the Reporter service. 
 
4: Navigate down through the file system to the folder where Reporter is installed.
  • 9.x: For Windows, the most likely location is:
    • installed Drive\Program Files\Bluecoat Reporter 9\
  • 9.x: For Linux, the most likely location is:
    • /opt/bc/reporter/
5: Find the settings/database folder and, using the name you record in Step 2, open the corresponding .cfg file.
 
6: In that file, find the section that similar to the following and scroll down through it to the number 23.
 
    multi_datasets = {
      0 = {
        field_one = "cs_username"
        field_two = "cs_host"
      } # 0
      1 = {
        field_one = "cs_username"
        field_two = "sc_filter_category"
      } # 1
      2 = {
        field_one = "cs_host"
        field_two = "sc_filter_category"
      } # 2
      3 = {
        field_one = "cs_username"
        field_two = "cs_host"
        field_three = "sc_filter_category"
      } # 3
      4 = {
        field_one = "c_ip"
        field_two = "cs_host"
      } # 4
      5 = {
        field_one = "c_ip"
        field_two = "sc_filter_category"
      } # 5
      6 = {
        field_one = "cs_username"
        field_two = "sc_filter_result"
      } # 6
      7 = {
        field_one = "c_ip"
        field_two = "cs_host"
        field_three = "sc_filter_category"
      } # 7
      8 = {
        field_one = "cs_host"
        field_two = "sc_filter_result"
      } # 8
      9 = {
        field_one = "c_ip"
        field_two = "sc_filter_result"
      } # 9
      10 = {
        field_one = "cs_auth_group"
        field_two = "cs_host"
        field_three = "sc_filter_category"
      } # 10
      11 = {
        field_one = "sc_filter_category"
        field_two = "cs_uri_scheme"
      } # 11
      12 = {
        field_one = "cs_host"
        field_two = "sc_filter_category"
        field_three = "sc_filter_result"
      } # 12
      13 = {
        field_one = "sc_filter_category"
        field_two = "sc_filter_result"
      } # 13
      14 = {
        field_one = "cs_uri_scheme"
        field_two = "cs_host"
        field_three = "sc_filter_category"
      } # 14
      15 = {
        field_one = "cs_auth_group"
        field_two = "sc_filter_result"
      } # 15
      16 = {
        field_one = "sc_filter_result"
        field_two = "cs_uri_scheme"
      } # 16
      17 = {
        field_one = "x_virus_id"
        field_two = "cs_username"
      } # 17
      18 = {
        field_one = "x_virus_id"
        field_two = "c_ip"
      } # 18
      19 = {
        field_one = "sc_filter_category"
        field_two = "cs_auth_group"
      } # 19
      20 = {
        field_one = "cs_username"
        field_two = "cs_auth_group"
      } # 20
      21 = {
        field_one = "c_ip"
        field_two = "cs_auth_group"
      } # 21
      22 = {
        field_one = "cs_host"
        field_two = "cs_uri_scheme"
      } # 22
      23 = {
        field_one = "cs_username"
        field_two = "cs_host"
        field_three = "sc_filter_result"
      } # 23
        } # multi_datasets
7: Edit the end of this section as follows.
Note: This example includes the last pair--#23--and appends the required pair and triplet.
 
      23 = {
        field_one = "cs_username"
        field_two = "cs_host"
        field_three = "sc_filter_result"
      } # 23
      24 = {
        field_one = "cs_username"
        field_two = "c_ip"
      } # 24
      25 = {
        field_one = "c_ip"
        field_two = "cs_host"
        field_three = "sc_filter_result"
      } # 25
    } # multi_datasets

8: Restart the Reporter service and process logs. If you did not add a log reader to the database in Step 1, add one now and allow time to process the logs into the database.
 
Note: If this is an existing database, you must re-import your database for these changes to occur throughout the database. Refer to the How do I fix database corruption? article.
 
Note: The following is the default list of fields that can be used to create pairs or triplets.
c_ip
sc_status
s_action
cs_method
cs_uri_scheme
cs_host
cs_username
cs_auth_group
rs_content_type
cs_user_agent
sc_filter_result
x_virus_id
sc_filter_category
proxy_ip
cs_uri_port
x_rs_certificate_observed_errors
x_rs_certificate_hostname
x_rs_certificate_hostname_category
x_rs_connection_negotiated_cipher_strength