The alarminfo table in our SRM implementation has not updated for one of our 14 active/enabled landscapes in over a year. All other 13 landscapes are updating as expected. The following mysql output shows this, the last set_time in the alarminfo table for 0x3100000 is 2024-01-17 22:13:03.
For info 0x3100000 is an active landscape with 2,549 devices monitored.
mysql> select max(set_time) max_time, hex(landscape_h) from alarminfo where set_time > '2023-01-01' group by landscape_h order by max_time;
+---------------------+------------------+
| max_time | hex(landscape_h) |
+---------------------+------------------+
| 2024-01-17 22:13:03 | 3100000 |
| 2025-02-28 22:00:47 | 6000000 |
| 2025-03-05 20:59:13 | 1300000 |
| 2025-03-05 21:25:34 | 900000 |
| 2025-03-06 09:50:04 | 3200000 |
| 2025-03-06 10:15:14 | 400000 |
| 2025-03-06 10:25:08 | 3800000 |
| 2025-03-06 10:34:02 | 3600000 |
| 2025-03-06 10:35:47 | 3300000 |
| 2025-03-06 10:45:56 | 3400000 |
| 2025-03-06 10:55:32 | 5000000 |
| 2025-03-06 11:09:07 | 2200000 |
| 2025-03-06 11:12:51 | D00000 |
| 2025-03-06 11:14:32 | 3500000 |
+---------------------+------------------+
14 rows in set (1 min 56.26 sec)
The event table contains up to date information for all 14 landscapes.
mysql> select max(time) max_time, hex(landscape_h) from event where time > '2023-01-01' group by landscape_h order by max_time;
+---------------------+------------------+
| max_time | hex(landscape_h) |
+---------------------+------------------+
| 2025-03-01 08:06:44 | 6000000 |
| 2025-03-06 10:40:47 | 3800000 |
| 2025-03-06 10:45:11 | 1300000 |
| 2025-03-06 10:50:05 | 3400000 |
| 2025-03-06 11:08:23 | 3300000 |
| 2025-03-06 11:11:23 | 5000000 |
| 2025-03-06 11:13:13 | D00000 |
| 2025-03-06 11:14:17 | 3100000 |
| 2025-03-06 11:15:32 | 3500000 |
| 2025-03-06 11:20:32 | 2200000 |
| 2025-03-06 11:25:32 | 3200000 |
| 2025-03-06 11:28:21 | 400000 |
| 2025-03-06 11:30:01 | 900000 |
| 2025-03-06 11:37:53 | 3600000 |
+---------------------+------------------+
14 rows in set (0.67 sec)
Something that may be relevant is that I have noticed what appears to be a backlog of alarmbucket entries in the bucketactivitylog table for the affected landscape, that correspond to the period of no entries in the alarminfo table.
mysql> select min(creation_time), max(creation_time),count(*) from bucketactivitylog where landscape_h = 0x3100000 and bucket_table_name like "alarmbucket%" and destroy_time is null;
+---------------------+---------------------+----------+
| min(creation_time) | max(creation_time) | count(*) |
+---------------------+---------------------+----------+
| 2024-01-17 22:17:10 | 2025-03-06 11:40:46 | 10139 |
+---------------------+---------------------+----------+
1 row in set (0.65 sec)
I have tried disabling and reenabling the landscape on the SRM Spectrum Status Admin page as well as restarting tomcat and rebooting the SRM OneClick server, but the alarminfo table is still not updating. The SpectroServer for that landscape has been restarted many times in the last year.
any supported Spectrum release
catalina.out
2025-05-06 12:50:19,193 [SRM/AlarmHandler51380224/bucketReader] WARN com.aprisma.errorlog - (SRM_Alarms) - (ERROR) - Unknown exception encountered while processing name events: processing halted for all servers
Caused by: org.springframework.jdbc.UncategorizedSQLException: PreparedStatementCallback; uncategorized SQLException for SQL [SELECT event_ID, time, event_seq, model_key, vardata, type, model_h, domain, event_key FROM alarmbucket51380224_1705529830284 WHERE event_seq BETWEEN ? AND ? ORDER BY event_seq]; SQL state [HY000]; error code [130]; Incorrect file format 'alarmbucket51380224_1705529830284'; nested exception is java.sql.SQLException: Incorrect file format 'alarmbucket51380224_1705529830284'
at org.springframework.jdbc.core.JdbcTemplate.translateException(JdbcTemplate.java:1578)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:669)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:715)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:746)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:752)
at com.aprisma.spectrum.app.repmgr.dc.db.dao.jdbc.JdbcEventBucketDAO.findEvents(JdbcEventBucketDAO.java:101)
at com.aprisma.spectrum.app.repmgr.dc.event.handler.AbstractBucketReader.readBucket(AbstractBucketReader.java:120)
at com.aprisma.spectrum.app.repmgr.dc.event.handler.BucketReaderScheduler$ScheduledReader.run(BucketReaderScheduler.java:118)
at com.aprisma.util.thread.JobQueue.runJobThread(JobQueue.java:221)
at com.aprisma.util.thread.JobQueue$JobRunnable.run(JobQueue.java:24)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.sql.SQLException: Incorrect file format 'alarmbucket51380224_1705529830284'
check the following:
alarmbucket51380224_1705529830284 exists in the location the database is looking for. alarmbucket51380224_1705529830284; 3. Examine Schema: If the table exists, compare its schema to what your application code expects. Are the column names and data types the same? Any mismatches here could cause this error. You can check this with a simple query: DESC alarmbucket51380224_1705529830284;
4. Examine MySQL Logs and Errors:
Try to compare the table file permissions and its contents with other bucket files of other landscapes.
THIS SPECIFIC CASE CAUSE:
Those tables do look to be corrupt as suggested.
Here's what I see when looking at this ...
The files exist and are of zero length. They are in the right directory an permissions look fine to me. They are the same as other similar files at that time.
[specadmin@<HOST> reporting]$ pwd
/opt/spectrum/mysql/data/reporting
[specadmin@<HOST> reporting]$ ls -l alarmbucket51380224_1705529830284.MY*
-rw-r----- 1 specadmin specadmin 0 Jan 17 2024 alarmbucket51380224_1705529830284.MYD
-rw-r----- 1 specadmin specadmin 0 Jan 17 2024 alarmbucket51380224_1705529830284.MYI
[specadmin@<HOST> reporting]$ ls -l genericbucket51380224_1705529836875.MY*
-rw-r----- 1 specadmin specadmin 0 Jan 17 2024 genericbucket51380224_1705529836875.MYD
-rw-r----- 1 specadmin specadmin 0 Jan 17 2024 genericbucket51380224_1705529836875.MYI
[specadmin@<HOST> reporting]$
Looking at the tables in mysql they look fine, but there is an error on the select statement.
mysql> describe alarmbucket51380224_1705529830284;
+-----------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-----------------+------+-----+---------+----------------+
| event_seq | bigint unsigned | NO | PRI | NULL | auto_increment |
| event_key | bigint unsigned | NO | | NULL | |
| event_ID | binary(26) | NO | UNI | NULL | |
| domain | int unsigned | NO | MUL | NULL | |
| model_key | int unsigned | NO | | NULL | |
| model_h | int unsigned | NO | | NULL | |
| time | datetime | NO | MUL | NULL | |
| type | int unsigned | NO | MUL | NULL | |
| vardata | blob | YES | | NULL | |
+-----------+-----------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)
mysql> select count(*) from alarmbucket51380224_1705529830284;
ERROR 130 (HY000): Incorrect file format 'alarmbucket51380224_1705529830284'
mysql> describe genericbucket51380224_1705529836875;
+-----------+-----------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-----------------+------+-----+---------+----------------+
| event_seq | bigint unsigned | NO | PRI | NULL | auto_increment |
| event_key | bigint unsigned | NO | | NULL | |
| event_ID | binary(26) | NO | UNI | NULL | |
| domain | int unsigned | NO | MUL | NULL | |
| model_key | int unsigned | NO | | NULL | |
| model_h | int unsigned | NO | | NULL | |
| time | datetime | NO | MUL | NULL | |
| type | int unsigned | NO | MUL | NULL | |
| vardata | blob | YES | | NULL | |
+-----------+-----------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)
mysql> select count(*) from genericbucket51380224_1705529836875;
ERROR 130 (HY000): Incorrect file format 'genericbucket51380224_1705529836875'
mysql>
RESOLUTION:
To be safe take a backup of db or at least the bucketactivitylog table.
ref:
Useful commands for MySQL dumps
and then run a delete query on the bucketactivitylog table for those corrupted file names:
Delete from bucketactivitylog where <column-name> = <corrupted file name>;
recycle tomcat and validate with SRM Alarm Handler Debug enabled.