"Datastore cluster is not stable. Please resolve any issues with datastore cluster and retry. Group type : DATASTORE, status : DEGRADED."
# get cluster status
NSX-FQDN/Hostname> get cluster status
Wed Jan 21 2025 UTC 07:06:27.343
Cluster Id: #######-########-#######
Overall Status: DEGRADED

Group Type: DATASTORE
Group Status: DEGRADED

Members:
    UUID                        FQDN                    IP         STATUS
    #######-########-#######    <NSX-FQDN1/Hostname>    x.x.x.x    DOWN
    #######-########-#######    <NSX-FQDN2/Hostname>    x.x.x.x    UP
    #######-########-#######    <NSX-FQDN3/Hostname>    x.x.x.x    UP
# cat /config/corfu/LAYOUT_CURRENT.ds
root@<NSX-FQDN1/Hostname>:~# cat /config/corfu/LAYOUT_CURRENT.ds
{
  "layoutServers": ["#.#.#.1:9000", "#.#.#.2:9000", "#.#.#.3:9000"],
  "sequencers": ["#.#.#.1:9000", "#.#.#.2:9000", "#.#.#.3:9000"],
  "segments": [
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 0,
      "end": -1,
      "stripes": [{"logServers": ["#.#.#.1:9000", "#.#.#.2:9000"]}]
    }
  ],
  "unresponsiveServers": ["#.#.#.3:9000"],
  "epoch": 2365,
  "clusterId": "#######-#######-############"
}
VMware NSX-T Data Center
VMware NSX
Data corruption on a Manager node can cause datastore issues to be reported in NSX. The corruption can have various underlying causes, such as storage issues or file system errors on that Manager VM.
Check the /config partition utilization using the below command on each Manager node:
df -h
Note any node where /config usage is in the low or single digits.
Check the LAYOUT_CURRENT.ds file with the following on all managers:
cat /config/corfu/LAYOUT_CURRENT.ds
root@<NSX-FQDN1/Hostname>:~# cat /config/corfu/LAYOUT_CURRENT.ds
{
  "layoutServers": [
    "#.#.#.1:9000",
    "#.#.#.2:9000",
    "#.#.#.3:9000"
  ],
  "sequencers": [
    "#.#.#.1:9000",
    "#.#.#.2:9000",
    "#.#.#.3:9000"
  ],
  "segments": [
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 0,
      "end": -1,
      "stripes": [
        {
          "logServers": [
            "#.#.#.1:9000",
            "#.#.#.2:9000"
          ]
        }
      ]
    }
  ],
  "unresponsiveServers": [
    "#.#.#.3:9000"
  ],
  "epoch": 2365,
  "clusterId": "#######-#######-############"
}
NOTE: In the above output, the 3rd manager node is listed as unresponsive on port 9000 (see the "unresponsiveServers" section). In the "segments" section, the 2 managers listed under "logServers" have a start value of 0 and an end value of -1, which indicates that these 2 nodes have full segment visibility across the entire sequence address space (this is where the storage units are mapped). The 3rd node is not listed under "logServers" and therefore lacks this visibility.
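The reading of the layout described in the note can be sketched in Python. This is a minimal illustration rather than NSX tooling: the function name summarize_layout is hypothetical, and the code assumes only the JSON keys visible in the sample LAYOUT_CURRENT.ds output (layoutServers, segments, stripes, logServers, unresponsiveServers, epoch). The sample string mirrors the sanitized output, so its addresses are placeholders.

```python
import json


def summarize_layout(layout_text):
    """Summarize a layout document (hypothetical helper, not an NSX tool).

    Assumes only the keys shown in the sample LAYOUT_CURRENT.ds output.
    """
    layout = json.loads(layout_text)
    all_servers = layout.get("layoutServers", [])
    unresponsive = layout.get("unresponsiveServers", [])

    # Collect every server that appears in any stripe of any segment.
    log_servers = set()
    for segment in layout.get("segments", []):
        for stripe in segment.get("stripes", []):
            log_servers.update(stripe.get("logServers", []))

    # Servers that are part of the layout but hold no log segments.
    missing = [s for s in all_servers if s not in log_servers]
    return {
        "epoch": layout.get("epoch"),
        "unresponsive": list(unresponsive),
        "missing_from_log_servers": missing,
    }


# Placeholder data copied from the sanitized sample output.
SAMPLE = """
{
  "layoutServers": ["#.#.#.1:9000", "#.#.#.2:9000", "#.#.#.3:9000"],
  "sequencers": ["#.#.#.1:9000", "#.#.#.2:9000", "#.#.#.3:9000"],
  "segments": [
    {
      "replicationMode": "CHAIN_REPLICATION",
      "start": 0,
      "end": -1,
      "stripes": [{"logServers": ["#.#.#.1:9000", "#.#.#.2:9000"]}]
    }
  ],
  "unresponsiveServers": ["#.#.#.3:9000"],
  "epoch": 2365,
  "clusterId": "#######-#######-############"
}
"""

if __name__ == "__main__":
    print(summarize_layout(SAMPLE))
```

For the sample, the 3rd node appears both as unresponsive and as missing from the log servers, which matches the note. Comparing the result across the files collected from all three managers (for example after copying them to a workstation) is one way to spot a node whose view of the layout differs.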
On the node listed in the "unresponsiveServers" section, run "service corfu-server status" to make sure the corfu-server service is running.
If the service is not running, start it with "service corfu-server start".
Monitor the log with "tail -F /var/log/corfu/corfu.9000.log" and wait for the service to initialize.
Run "get cluster config" or "get nodes" to identify the UUID of the 3 Manager nodes.
Detach the unresponsive node by its UUID:
# <NSX-FQDN1/Hostname> > detach node <node_UUID>
Verify the result with "get cluster status" or "get cluster config".
Check /var/log/cbm/cbm.log for errors around the string "detach".
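The search of cbm.log for the string "detach" can be sketched as a small Python filter that keeps each matching line plus a few lines of surrounding context. This is only an illustration: the function name find_matches and the context parameter are hypothetical, no particular cbm.log line format is assumed, and the demo lines are placeholders rather than real log content.

```python
def find_matches(lines, needle="detach", context=2):
    """Return (line_number, text) pairs for lines at or near a match.

    Hypothetical helper: the match is a case-insensitive substring test,
    and `context` lines before and after each match are kept.
    """
    lines = list(lines)
    keep = set()
    for i, line in enumerate(lines):
        if needle.lower() in line.lower():
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    # Line numbers are 1-based, as in most editors and pagers.
    return [(i + 1, lines[i].rstrip("\n")) for i in sorted(keep)]


if __name__ == "__main__":
    # Placeholder lines only; these do not reflect the real cbm.log format.
    demo = [
        "placeholder line A\n",
        "placeholder line B\n",
        "placeholder entry mentioning detach\n",
        "placeholder line C\n",
    ]
    for number, text in find_matches(demo, context=1):
        print(f"{number}: {text}")
```

To apply it to a real file, the lines could be read with open(path, errors="replace") from a copy of /var/log/cbm/cbm.log and passed to find_matches.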