Troubleshooting vCenter Server Appliance when the /storage/log partition is 80% or more full

Products

VMware vCenter Server

Issue/Introduction

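Disclaimer: This article was translated from the English article "Troubleshooting vCenter Appliance /storage/log directory is 80% or more full (broadcom.com)". Translations are produced on a best-effort basis, so localized content may not reflect the latest information. For the latest information, refer to the English version of the article.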

This article describes how to troubleshoot and resolve a /storage/log partition that is running out of space on vCenter Server Appliance.

Symptoms:

vCenter Server reports errors such as the following:
- "vSphere UI Health Alarm", "Log disk exhaustion on vcenter name"
- "Database Health Alarm", "Core and Inventory Disk Exhaustion on vcenter name"
vCenter Server may be inaccessible with a 503 Service Unavailable error.
The vSphere Appliance Management Interface (VAMI) shows /storage/log usage at 75% or higher under Monitor > Disks.
- Sustained usage of 75% or higher triggers a yellow warning status.
- Reaching 85% triggers a red critical alarm.
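
The thresholds above can also be checked from the appliance shell. The following is a minimal sketch, not part of the original KB: the function names `usage_percent` and `classify_usage` are illustrative, and the check looks only at the current value, whereas the actual alarm also depends on how long the usage is sustained.

```shell
#!/bin/sh
# Sketch: compare the current usage of a mount point with the 75% / 85%
# thresholds listed in the Symptoms section. Function names are hypothetical.

# Print the usage percentage (digits only) of the filesystem holding $1.
usage_percent() {
    # -P forces the POSIX one-line-per-filesystem format;
    # column 5 is the capacity, for example "42%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Map a percentage to the status level described above.
classify_usage() {
    if [ "$1" -ge 85 ]; then
        echo "critical"
    elif [ "$1" -ge 75 ]; then
        echo "warning"
    else
        echo "ok"
    fi
}

# Example (run on the appliance):
#   pct=$(usage_percent /storage/log)
#   echo "/storage/log: ${pct}% ($(classify_usage "$pct"))"
```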

Environment

VMware vCenter Server Appliance 6.0.x
VMware vCenter Server Appliance 6.5.x
VMware vCenter Server Appliance 6.7.x
VMware vCenter Server 7.0.x
VMware vCenter Server 8.0

Cause

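The following are potential causes.

vCenter Server log bundles are not cleared after they are generated
A large volume of events is being logged at a very high frequency
File cleanup fails in services such as the Apache Tomcat Java Servlet service
The /storage/log partition is too small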

Resolution

Use the following steps to investigate what is exhausting the /storage/log partition on vCenter Server.

Log in to the vCenter Server Appliance via SSH as a user with root privileges.
Change to the /storage/log directory.
```
# cd /storage/log
```

To identify large files, run the following command.

```
# find . -type f -print0 | xargs -0 du -h | sort -rh | head -n 10
```

This command lists the 10 largest files in the directory. Example output:

```
   3.7G    ./vmware/wcp/stdstream.log-2.stderr
   2.5G    ./vmware/wcp/stdstream.log-1.stderr
   266M    ./vmware/wcp/stdstream.log.stderr
   190M    ./vmware/vpxd/vpxd-profiler-154.log
   104M    ./vmware/procstate
   101M    ./vmware/vsphere-ui/logs/threadmonitor1.log
   83M     ./vmware/wcp/stdstream.log-4.stderr
   46M     ./vmware/vpxd-svcs/perf.log.37
   45M     ./vmware/sso/ssoAdminServer.log
   41M     ./vmware/vsphere-ui/logs/threadmonitor.log
```

To identify directories that contain a large number of files, run the following command.
```
# find ./ -type d -exec sh -c 'echo -n "{}: " && find "{}" -type f | wc -l' \; | awk '$2 > 100' | sort -k2,2nr 
```
This command lists directories that contain more than 100 files, sorted by file count in descending order. Example output:
```
   ./: 1442
   ./etc: 486
   ./etc/vmware: 366
   ./var: 321
   ./var/run: 275
   ./var/log: 274
   ./usr: 108 
```
Use the results for large files and for directories with many files to investigate what is consuming the most space.
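
The file-count command pastes each directory name into the `sh -c` script text and splits its output on whitespace, so directory names that contain spaces or quotes can be misreported. The following is a sketch of a more defensive variant, not taken from the original KB: the function name `count_files_per_dir` is illustrative, and the output format differs in that the count is printed first, followed by a tab and the directory.

```shell
#!/bin/sh
# Sketch: count regular files (recursively) under every directory below $1
# and print the directories holding more than $2 files, largest count first.
# The function name is hypothetical.
count_files_per_dir() {
    root=${1:-.}
    min=${2:-100}
    # Each directory is handed to the inner shell as a positional argument
    # instead of being substituted into the script text.
    find "$root" -type d -exec sh -c '
        for dir do
            n=$(find "$dir" -type f | wc -l)
            printf "%s\t%s\n" "$((n))" "$dir"
        done' sh {} + |
        awk -F '\t' -v min="$min" '$1 > min' |
        sort -k1,1nr
}

# Example (run on the appliance):
#   count_files_per_dir /storage/log 100
```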

Based on the findings, check whether the problem matches any of the following known issues.

| Affected versions | Related KB |
|---|---|
| 6.0 | /storage/log partition full due to cloudvm-ram-size.log file rotation is not working in vCenter Server Appliance (318468) |
| Earlier than 6.0 Update 3; earlier than 6.5 Update 1 | /storage/log partition full due to SSO log files are not compressed in vCenter Server Appliance (341135) |
| Earlier than 7.0 Update 1c | vCenter Appliance /storage/log partition full due to excessive pod-startup.log files (318217) |
| 7.0 before Update 3c | vmafdd.log is not being compressed which eventually leads to "log disk exhaustion" warning on the vCenter (318575); vCenter Server /storage/log filling up due to localhost_access.log and catalina.log in sso and lookupsvc log directories (318209) |
| Earlier than 7.0 Update 3o; earlier than 8.0 Update 1 | /storage/log filling up with imfile-state files \| rsyslogd (318149) |
| 7.0, 8.0 (not resolved even in the latest versions) | vCenter Server Appliance 7.0.x /storage/log partition runs out of space due to VMware Analytics service log file (analytics-runtime.log.stderr) (318203) |
| 7.0 Update 1 (partially fixed in 7.0 Update 3; fixed in 8.0) | vCenter has a large number of localhost_access log files generated under /storage/log/vmware/eam/web/ (326212) |
| 7.0 Update 2 (fixed in 7.0 Update 3) | The /storage/log volume is filling up in vCenter 7.0 U2 due to growing sps-runtime.log.stderr (318194) |
| 8.0 (fixed in 8.0 Update 3) | VCSA 8.0 : support bundle file is not removed under /storage/log after exporting it via vSphere Client (323151) |
| All versions | /storage/log partition full due to Large Java dump files created under /storage/log/vmware/perfcharts in vCenter Server Appliance (318709); /storage/log becomes 100% full when exporting the vCenter Appliance log bundles (318709); vCenter Appliance /storage/log partition full due to excessive pod-startup.log files (318217); VCSA /storage/log/ partition runs out of space due to java hprof dumps created by VMware Analytics service (318173); Log Disk Exhaustion caused by wcp stdstream.log (326227) |
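
Matching file names against the known issues can be partly scripted. The following is a rough sketch, not part of the original KB: the function name `match_known_issue` is illustrative, the patterns are derived only from file and directory names that appear in the KB titles listed here, and the `*/sso/*` and `*/lookupsvc/*` path patterns are assumptions about where those log directories sit. KBs whose titles do not name a specific file are not covered. A match is a hint to read the KB, not a diagnosis.

```shell
#!/bin/sh
# Sketch: read file paths (or "size path" lines) on stdin and print
# "<KB number><TAB><input line>" for names mentioned in the KB titles.
# The function name and the sso/lookupsvc path patterns are assumptions.
match_known_issue() {
    while IFS= read -r path; do
        kb=
        case "$path" in
            *cloudvm-ram-size.log*)                                kb=318468 ;;
            *pod-startup.log*)                                     kb=318217 ;;
            *vmafdd.log*)                                          kb=318575 ;;
            *imfile-state*)                                        kb=318149 ;;
            *analytics-runtime.log.stderr*)                        kb=318203 ;;
            *sps-runtime.log.stderr*)                              kb=318194 ;;
            */eam/web/*localhost_access*)                          kb=326212 ;;
            */sso/*localhost_access*|*/sso/*catalina*)             kb=318209 ;;
            */lookupsvc/*localhost_access*|*/lookupsvc/*catalina*) kb=318209 ;;
            */wcp/*stdstream.log*)                                 kb=326227 ;;
        esac
        # Print only the lines that matched one of the patterns.
        if [ -n "$kb" ]; then
            printf '%s\t%s\n' "$kb" "$path"
        fi
    done
}

# Example (run from /storage/log on the appliance):
#   find . -type f -size +100M | match_known_issue
```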

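If the problem does not match any of these known issues, and the command results do not reveal what is filling the partition, /storage/log may be too small for the volume of logs being written.
In that case, refer to "vCenter Server Appliance disk space is full" (318953) and consider expanding the virtual disk for /storage/log.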

Additional Information

vCenter Server Appliance disk space is full (318953)
vCenter Server /storage/log filling up due to localhost_access.log and catalina.log in sso and lookupsvc log directories(318209)

Impact/Risks:
With the default settings, when usage of the /storage/log partition has remained at 75% for 10 minutes:

An alarm is triggered.

WARNING: For the reasons above, take a proper backup of the vCenter Server Appliance before deleting files or resizing disks.

==============================================

In addition to the find commands in the Resolution section, the following du command helps to quickly identify large directories in the file system:

```
du -xh -d 1 | sort -rh
```

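Run this command, then use the 'cd' command to move into the largest subdirectory in the file system.
Run the du command again to investigate what is consuming the disk space. Repeat this step in subdirectories as many times as needed.
- OPTIONAL: Changing the value of the '-d' option increases the number of directory levels the command reports. With '-d 1', the output covers one level below the current directory. Increasing the depth by 1 can let you evaluate the directories more quickly.
Once you have identified the directories that consume the most disk space, run the 'ls -lsah' command to list the files in the directory.
- Determine whether large files or a large number of files caused the high disk usage, and take action as needed.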
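
The repeated du and cd steps can be sketched as a small loop. This is an illustration, not part of the original KB: the function names `largest_subdir` and `drill_down` are hypothetical, sizes are printed in KiB rather than in human-readable form so that they sort reliably, and the loop follows only the single largest subdirectory at each level.

```shell
#!/bin/sh
# Sketch: starting at $1, repeatedly descend into the largest subdirectory
# and print "size_in_KiB<TAB>path" for each step. Function names are hypothetical.

# Print the largest immediate subdirectory of $1 (empty if there is none).
largest_subdir() {
    # du prints the directory itself as the last line; drop that line,
    # then keep the child with the largest size.
    du -x -k -d 1 "$1" 2>/dev/null | sed '$d' | sort -k1,1nr | head -n 1
}

# Follow the largest subdirectory for at most $2 levels (default 5).
drill_down() {
    dir=${1:-.}
    depth=${2:-5}
    while [ "$depth" -gt 0 ]; do
        line=$(largest_subdir "$dir")
        [ -n "$line" ] || break          # no subdirectories left
        printf '%s\n' "$line"
        dir=$(printf '%s\n' "$line" | cut -f2-)
        depth=$((depth - 1))
    done
}

# Example (run on the appliance):
#   drill_down /storage/log
```

After the last directory is printed, 'ls -lsah' on that path shows the individual files.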