The following versions of TAS include a version of usage-service that may lose some data: 4.0.9, 3.0.17, 2.13.27, and 2.11.45. This affects the upgrade path only; fresh installs are unaffected.
The data at risk is service usage data, and only if the customer has deployed a new service after deploying one of the above TAS versions. App and Task usage data are unaffected. Affected customers need to either apply the workaround or upgrade to a TAS version with an upgraded usage service within 30 days; otherwise, a purge and reseed will be the only way forward.
Patches are now available for TAS. push-usage-service-release release version 674.0.69 contains the necessary mitigations for this issue.
TAS versions 4.0.10, 3.0.18, 2.13.28, and 2.11.46 are the fixed patch versions.
For customers unable to upgrade immediately, the workaround below can be applied to prevent loss of usage service data.
How to tell if you're affected
Use the cf CLI to get logs from the app-usage-worker app in the system space of the system org. If you are affected by this issue, you should find errors about Mysql2::Error: Duplicate entry roughly every 5 minutes. They will look something like this:
2023-10-06T11:10:05.77+0100 [APP/PROC/WEB/0] OUT I, [2023-10-06T10:10:05.776183 #7] INFO -- : ActiveRecord::RecordNotUnique: Mysql2::Error: Duplicate entry '4b0c79bd-9528-412d-a94f-3526bd827a4f' for key 'service_events.index_service_events_on_guid'
2023-10-06T11:10:05.77+0100 [APP/PROC/WEB/0] OUT I, [2023-10-06T10:10:05.776250 #7] INFO -- : /home/vcap/deps/0/vendor_bundle/ruby/3.2.0/gems/mysql2-0.5.5/lib/mysql2/client.rb:151:in `_query'
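The log check above can be sketched as a small shell helper. This is a minimal sketch, not an official tool: the function name count_duplicate_entry_errors is made up for illustration, and the pattern is taken from the sample log line shown above. It assumes you are already logged in with the cf CLI and that the recent-log buffer still holds at least one of the errors.

```shell
#!/usr/bin/env bash
# Count log lines that report the duplicate-entry error on the
# service_events GUID index. Reads log text from standard input.
# (Hypothetical helper name; the pattern comes from the sample log above.)
count_duplicate_entry_errors() {
  # grep -c prints the number of matching lines. It exits non-zero when
  # nothing matches, so "|| true" keeps the function from failing.
  grep -c "Duplicate entry .* for key 'service_events.index_service_events_on_guid'" || true
}

# Typical use, assuming an authenticated cf CLI session:
#   cf target -o system -s system
#   cf logs app-usage-worker --recent | count_duplicate_entry_errors
```

A result of 0 does not prove that you are unaffected, because cf logs --recent only returns a limited window of log lines. A non-zero count matches the symptom described above.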
If you connect to the SQL database that backs usage service (often this is the app_usage_service database on the CF internal MySQL instance that's included in the CF deployment), you can also inspect a table called service_events_fetcher_job_run_logs. If you are affected, you should see that all the entries in that table have a created_at timestamp that is earlier than the upgrade which subjected you to this bug.
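The table check above can be reduced to comparing the newest created_at value with the time of your upgrade. The following is a minimal sketch under stated assumptions: the function name run_logs_stopped_before is hypothetical, both timestamps are assumed to be in 'YYYY-MM-DD HH:MM:SS' form and in the same time zone, and the mysql invocation in the comment reuses the connection options from the workaround in this article.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when the newest job run log is older than the upgrade,
# which matches the symptom described above. Returns 1 when the newest
# entry is at or after the upgrade, and 2 when there is nothing to compare.
# (Hypothetical helper; both arguments must use the same format and zone.)
run_logs_stopped_before() {
  local latest_run="$1" upgrade_time="$2"
  # MAX() over an empty table yields NULL, so treat that as "unknown".
  if [[ -z "$latest_run" || "$latest_run" == "NULL" ]]; then
    return 2
  fi
  # Fixed-width 'YYYY-MM-DD HH:MM:SS' strings sort chronologically
  # under plain string comparison.
  [[ "$latest_run" < "$upgrade_time" ]]
}

# On the MySQL VM, the newest timestamp can be read with:
#   sudo mysql --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf \
#     app_usage_service -N -e \
#     'SELECT MAX(created_at) FROM service_events_fetcher_job_run_logs;'
```

For example, run_logs_stopped_before '2023-10-01 09:00:00' '2023-10-05 12:00:00' succeeds, which would indicate that no job run has been logged since an upgrade at the second timestamp. Both timestamps here are invented for illustration.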
Workaround:
1. Identify your cf deployment name via the bosh CLI, e.g. by running:
bosh ds --json | jq '.Tables[].Rows[] | select( .name | startswith("cf-")).name' -r
2. SSH onto one of the MySQL instances:
bosh -d CF_NAME ssh mysql/0
3. Connect to the db with the mysql cli
sudo mysql --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf app_usage_service
4. Execute the below SQL statement
INSERT INTO latest_service_usage_event_guids(service_event_guid,event_created_at,created_at,updated_at) SELECT guid, occurred_at, created_at, updated_at FROM service_events order by id desc LIMIT 1;
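The statement in step 4 copies the GUID and timestamps of the most recent row in service_events into latest_service_usage_event_guids. If you prefer to run steps 2 to 4 as a single command from your workstation, the sketch below prints such a command for review rather than executing it. It is an illustration only: the function name print_workaround_command is hypothetical, it relies on the -c (command) option of bosh ssh, and it assumes the deployment name contains no quotes or spaces.

```shell
#!/usr/bin/env bash
# Print a one-shot form of the workaround for a given deployment name.
# Nothing is executed; the command is only written to standard output
# so that it can be reviewed before use. (Hypothetical helper name.)
print_workaround_command() {
  local cf_name="$1"
  # SQL statement copied unchanged from step 4 of the workaround.
  local sql='INSERT INTO latest_service_usage_event_guids(service_event_guid,event_created_at,created_at,updated_at) SELECT guid, occurred_at, created_at, updated_at FROM service_events order by id desc LIMIT 1;'
  # Command to run on the MySQL VM, as in step 3, with the SQL passed via -e.
  local remote="sudo mysql --defaults-file=/var/vcap/jobs/pxc-mysql/config/mylogin.cnf app_usage_service -e \"${sql}\""
  # Wrap the remote command in single quotes for bosh ssh -c.
  printf "bosh -d %s ssh mysql/0 -c '%s'\n" "$cf_name" "$remote"
}

# Example (cf-example is a placeholder deployment name):
#   print_workaround_command cf-example
```

Printing first keeps the database change a deliberate, separate step. The interactive steps above remain the documented procedure.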
How to tell if your intervention fixed the issue
Wait 10 to 15 minutes after applying the workaround. Then use the cf CLI to get logs from the app-usage-worker app in the system space of the system org. You should find that the Mysql2::Error: Duplicate entry errors are no longer occurring.
Since you connected to the SQL backing database to execute the fix anyway, you can also inspect the service_events_fetcher_job_run_logs table. You should find that new entries have started being created in this table since you applied the fix. Those new entries should have a timestamp in their created_at field that is more recent than the time at which you applied the fix.