The system was running so slowly that it was timing out when generating any reports or dashboards. Groups & Devices were also not appearing.
DX NetOps CAPM all currently supported releases
Analysis showed that even though the MySQL netqosportal DB & tables were of moderate size, various SELECT queries against them were taking excessively long to complete. For example, to see which queries were executing, the following query was run every few minutes:
select * from information_schema.processlist where command <> 'Sleep' and INFO is not null;
This showed the following queries, which had been running non-stop for hours:
+----+--------+-------------+--------------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+
| ID | USER   | HOST        | DB           | COMMAND | TIME | STATE     | INFO                                                                                                                                   |
+----+--------+-------------+--------------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+
| 32 | netqos | portal.host | netqosportal | Query   | 1600 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
| 33 | netqos | portal.host | netqosportal | Query   | 1593 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
| 26 | netqos | portal.host | netqosportal | Query   | 1601 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
| 35 | netqos | portal.host | netqosportal | Query   | 1425 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
| 29 | netqos | portal.host | netqosportal | Query   | 1593 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
| 21 | netqos | portal.host | netqosportal | Query   | 1425 | executing | SELECT ii.ItemID, tmp.LocalID FROM item_identifiers ii JOIN item_id_map_work tmp WHERE ii.SourceID = 3 AND ii.LocalID = tmp.LocalID   |
+----+--------+-------------+--------------+---------+------+-----------+--------------------------------------------------------------------------------------------------------------------------------------+
So the issue was that the item_identifiers table (netqosportal DB), which the above queries were running against and spending all their time on, was heavily fragmented.
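Before changing anything, the level of fragmentation can be sanity-checked (this check is illustrative and was not part of the original diagnosis) by looking at how much unused space InnoDB is holding for the table. The DATA_FREE figure in information_schema is only an estimate, but a large value relative to the table size points to fragmentation:

-- Approximate size and reclaimable (fragmented) space for the table; values are estimates
SELECT TABLE_NAME,
       ROUND(DATA_LENGTH/1024/1024, 1)  AS data_mb,
       ROUND(INDEX_LENGTH/1024/1024, 1) AS index_mb,
       ROUND(DATA_FREE/1024/1024, 1)    AS free_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'netqosportal'
  AND TABLE_NAME = 'item_identifiers';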
To resolve this, we first stopped the PC services:
systemctl stop caperfcenter_console
systemctl stop caperfcenter_devicemanager
systemctl stop caperfcenter_eventmanager
systemctl stop caperfcenter_sso
systemctl stop mysql
Then we restarted mysql, logged in, and truncated the item_identifiers table (which empties it completely):
mysql> truncate table netqosportal.item_identifiers;
Query OK, 0 rows affected (0.46 sec)
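As a quick sanity check (not a step from the original procedure), you can confirm the table is now empty before moving on:

mysql> select count(*) from netqosportal.item_identifiers;

This should return 0 immediately after the truncate.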
NOTE: This should only be done under guidance from Broadcom Support, as it will empty the table of all data. If the issue is with another table (as opposed to item_identifiers), the data may be more critical.
Then, we optimised it (which defragments the physical storage space the table uses so that it is no longer disjointed, allowing I/O read/write operations to complete much more quickly):
mysql> optimize table netqosportal.item_identifiers;
+--------------------------------+----------+----------+--------------------------------------------------------------------+
| Table                          | Op       | Msg_type | Msg_text                                                           |
+--------------------------------+----------+----------+--------------------------------------------------------------------+
| netqosportal.item_identifiers  | optimize | note     | Table does not support optimize, doing recreate + analyze instead  |
| netqosportal.item_identifiers  | optimize | status   | OK                                                                 |
+--------------------------------+----------+----------+--------------------------------------------------------------------+
2 rows in set (32 min 6.84 sec)
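As an optional follow-up check (again illustrative rather than part of the documented steps), re-querying information_schema for the same table should now show DATA_FREE at or near zero, confirming that the recreate reclaimed the fragmented space:

-- After the recreate + analyze, free_mb should be at or near 0
SELECT ROUND(DATA_FREE/1024/1024, 1) AS free_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'netqosportal'
  AND TABLE_NAME = 'item_identifiers';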
After this, we restarted the services.
The Global Sync process completed fairly quickly and initiated a full re-sync of the Spectrum data source, after which it became available. Reports & dashboards generated in a timely manner, with devices & groups also appearing when their views were selected.