We deployed the M9 version of Authentication Hub in our dev region, then began experiencing "invalid x-flow-state header" errors. The logs returned messages such as:
The errors occurred at the following endpoints:
The authentication flow succeeded after multiple submissions. The failed and successful submissions shared the same flow state id. Failures and successes were split almost evenly (roughly 50/50), and we were running two instances, so we believe this was a replication issue.
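The reasoning behind the replication theory can be checked with a small simulation. This is only a sketch under stated assumptions, not how Authentication Hub works internally: it assumes each instance holds flow state only in its own local store (no replication) and that the load balancer picks an instance uniformly at random. The function name `simulate_failure_rate` and its parameters are hypothetical.

```python
import random


def simulate_failure_rate(num_instances, flows=10_000, seed=0):
    """Return the fraction of follow-up requests that cannot find their flow state.

    Assumptions (illustrative only): flow state is NOT replicated between
    instances, and each request is routed to a uniformly random instance.
    """
    rng = random.Random(seed)
    # One private flow-state store per instance.
    stores = [set() for _ in range(num_instances)]
    failures = 0
    for flow_id in range(flows):
        # The first request creates the flow state on one instance.
        creator = rng.randrange(num_instances)
        stores[creator].add(flow_id)
        # The follow-up request is routed independently of the first.
        handler = rng.randrange(num_instances)
        if flow_id not in stores[handler]:
            failures += 1
    return failures / flows


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "instance(s):", simulate_failure_rate(n))
```

Under these assumptions the expected failure rate is 1 - 1/N for N instances: zero with one instance and about one half with two, which matches both the observed split and the fact that the errors stopped while only one instance was running.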
Remediation Attempts
Firstly, we restarted our distributed cache. This did not fix our issue.
Secondly, we scaled all our instances down to one and then restored to the usual number of instances. No more errors have occurred since then, but we cannot scale up, and this needs to be addressed.
Release: October..01 release (M9)
To address the issue, we changed the tenant settings parameter persistentStoreForTransactionalData from 'cache' to 'db'. We left the deployment running with 3 Hazelcast pods and the issue was not observed. With this configuration in place, the underlying "flowstate" data is persisted in the DB rather than in Hazelcast, and this setting has addressed the issue.
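Why moving flow state to a shared store helps can be illustrated with a toy model. It is a hedged sketch, not the product's implementation: plain dictionaries stand in for the cache and the database, and the `Instance` class and `failure_rate` function are hypothetical names introduced only for this example.

```python
import random


class Instance:
    """Toy application instance that keeps flow state in the store it is given."""

    def __init__(self, store):
        self.store = store  # dict standing in for a cache or a database

    def start_flow(self, flow_id):
        self.store[flow_id] = "started"

    def continue_flow(self, flow_id):
        # Succeeds only if this instance can see the flow state.
        return flow_id in self.store


def failure_rate(instances, flows=1_000, seed=0):
    """Start each flow on one random instance and continue it on another."""
    rng = random.Random(seed)
    failures = 0
    for flow_id in range(flows):
        rng.choice(instances).start_flow(flow_id)
        if not rng.choice(instances).continue_flow(flow_id):
            failures += 1
    return failures / flows


# Each instance has its own store: models state that is not replicated.
isolated = [Instance({}) for _ in range(3)]

# All instances share one store: models persisting flow state in a common DB.
shared_db = {}
shared = [Instance(shared_db) for _ in range(3)]

if __name__ == "__main__":
    print("isolated stores:", failure_rate(isolated))
    print("shared store:   ", failure_rate(shared))
```

In this model the shared store never loses a flow regardless of which instance handles the follow-up request, which is consistent with the issue not being observed after the setting was changed to 'db'.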