Applications bound to a Redis/Valkey service instance are intermittently failing to perform write operations, which results in an InvalidDataAccessApiUsageException.
Observed error:
org.springframework.dao.InvalidDataAccessApiUsageException: READONLY You can't write against a read only replica.
Concurrently, infrastructure monitoring shows that the service-metrics process on the Redis master node is failing. Logs for the metrics process indicate:
ERR max number of clients reached
Elastic Application Runtime
Tanzu Platform for Cloud Foundry
Valkey on Cloud Foundry
The default Max Client limit in the Valkey tile is 1000. Once this limit is reached, the node rejects new connections with the error shown above.
This can be validated using the redis CLI after SSHing into the Redis nodes. If you have multiple Redis nodes (i.e., in an HA setup), you will need to log in to all three to identify the master node. The steps to do this are as follows:
1) SSH into the Redis nodes.
bosh -d service-instance_uuid ssh redis/<index>
2) Run `cat /var/vcap/jobs/redis/config/redis.conf` to view the configuration file, and note the values of `masterauth` and `replica-announce-ip`.
3) Connect to the Redis instance using `redis-cli`.
/var/vcap/packages/redis/bin/redis-cli -h <hostname-######>.redis-instance.infra.service-instance-uuid.bosh -p 16379 --cacert /var/vcap/jobs/redis/etc/ca/redis_ca.crt --tls
Note: If TLS is not enabled, you can remove the `--tls` option.
4) Once connected to the instance, you can authenticate using the command below.
AUTH <value retrieved from masterauth>
5) Run the commands below to determine whether the Redis node is a master or a replica and to view its client connection information.
info replication
info clients
You might see output similar to the snippet below:
connected_clients:998
cluster_connections:0
maxclients:1000
client_recent_max_input_buffer:20480
client_recent_max_output_buffer:20504
blocked_clients:0
tracking_clients:0
pubsub_clients:3
watching_clients:0
clients_in_timeout_table:0
total_watched_keys:0
total_blocking_keys:0
total_blocking_keys_on_nokey:0
paused_reason:none
paused_actions:none
paused_timeout_milliseconds:0
If `connected_clients` is close to the value of `maxclients`, the node is at risk of hitting the issue described at the beginning of this article.
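The comparison above can be scripted against saved output of `info replication` and `info clients`. The following is a minimal sketch, not part of the product tooling: the function names `info_field` and `client_usage_pct` are hypothetical, and the input is assumed to be the plain `key:value` text that `redis-cli` prints.

```shell
#!/usr/bin/env bash
# Sketch: interpret saved INFO output read from stdin.
# Function names are hypothetical helpers, not part of redis-cli.

# Print the value of the first "key:value" line whose key matches $1.
info_field() {
  # tr strips the carriage returns that INFO output may contain.
  tr -d '\r' | awk -F: -v key="$1" '!seen && $1 == key { print $2; seen = 1 }'
}

# Print connected_clients as a whole-number percentage of maxclients
# (the fraction is truncated, not rounded).
client_usage_pct() {
  tr -d '\r' | awk -F: '
    $1 == "connected_clients" { used = $2 }
    $1 == "maxclients"        { max  = $2 }
    END { if (max > 0) printf "%d\n", (used * 100) / max }'
}

# Example with the values from the sample output above.
printf 'connected_clients:998\nmaxclients:1000\n' | client_usage_pct   # prints 99
printf 'role:master\n' | info_field role                               # prints master
```

A node that reports `role:master` together with a usage percentage in the high nineties matches the symptoms described in this article.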
There are two ways to resolve this:
1) Check the Jedis pool parameters and tune them.
You can check for any specific application that is exhausting this limit by identifying the most frequent source IP connected to the Redis master node using the netstat command:
netstat -anp | grep :6379 | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr
Once the Diego cell is identified, trace it back to the corresponding application GUID, locate the application, and check its Jedis pool configuration. A sample snippet follows:
spring:
  # Redis configuration
  redis:
    timeout: 3600
    ssl: false # Enable SSL support.
    jedis:
      pool:
        max-active: 1000
        max-idle: 500
        max-wait: 5000
        min-idle: 250
You can reduce the max-active, max-idle, and min-idle parameters accordingly.
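The per-source connection count from the netstat step in option 1 can also be wrapped in a small function that takes the port as an argument, which helps when clients connect on the TLS port (16379 in step 3) rather than 6379. This is a sketch only: `count_clients_by_ip` is a hypothetical name, and it assumes the Linux `netstat -an` column layout (local address in column 4, remote address in column 5, state in column 6).

```shell
#!/usr/bin/env bash
# Sketch: count ESTABLISHED connections per remote IP for one local port.
# Reads netstat-style lines on stdin; count_clients_by_ip is a hypothetical helper.
count_clients_by_ip() {
  local port="$1"
  awk -v port="$port" '
    # Keep only established connections whose LOCAL address ends in :<port>.
    $6 == "ESTABLISHED" && $4 ~ (":" port "$") {
      sub(/:[0-9]+$/, "", $5)   # drop the remote port, keep the IP
      print $5
    }' | sort | uniq -c | sort -nr
}

# Usage on a Redis node (port is an assumption; use the one your clients connect to):
#   netstat -an | count_clients_by_ip 16379
```

Matching on the local address column avoids counting outbound connections from the node that merely happen to involve the same port number.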
2) Increase the max_clients parameter in the Valkey tile under the On-demand plan settings section. The default is 1000, but this can be increased to 10000 if necessary.
As of the latest versions of the Valkey tile (10.1.1 and 10.2.2), the maximum value that can be set for the max clients parameter is 10,000. However, future releases will include an option to increase this limit if necessary.
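Both options come down to keeping the total number of pooled connections below `maxclients`. As a rough worked example, assuming each application instance holds its own Jedis pool and may open up to `max-active` connections, the per-instance ceiling can be estimated as shown below. The helper name `pool_budget`, the instance count, and the reserved headroom are all illustrative assumptions, not recommended values.

```shell
#!/usr/bin/env bash
# Sketch: largest per-instance max-active that keeps all pools under maxclients.
# pool_budget is a hypothetical helper. "reserved" is headroom left for
# replicas, the metrics process, and manual redis-cli sessions (assumed figure).
pool_budget() {
  local maxclients="$1" instances="$2" reserved="$3"
  # Integer division: the result is rounded down.
  echo $(( (maxclients - reserved) / instances ))
}

pool_budget 1000 8 50     # prints 118
pool_budget 10000 8 50    # prints 1243
```

With the sample configuration above (`max-active: 1000`), a single application instance could by itself consume the entire default limit of 1000.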