"Warehouse collection ingestion from document" tasks in Tanzu Hub show no logs

Article ID: 439606

Updated On:

Products

VMware Tanzu Platform - Hub

Issue/Introduction

  • In the Tanzu Hub 10.4 GUI, the Tasks view for Foundation Attach operations shows "Warehouse collection ingestion from document" failures.
  • No logs are populated for the individual sub-tasks.
  • From an SSH session to the registry VM, listing the pods in the tanzusm namespace (kubectl get pods -n tanzusm) shows the chi-clickhouse pods stuck in the CrashLoopBackOff state.
  • To list only the pods that are neither running nor completed, use: kubectl get pods -n tanzusm | grep -Ev "Running|Completed"
  • Example of failing chi-clickhouse pods:

    registry/########-####-####-####-############:$ kubectl get pods -n tanzusm | grep -v Run
    NAME                                          READY    STATUS              RESTARTS         AGE
    chi-clickhouse-metrics-default-0-0-0          0/1      CrashLoopBackOff    428 (71s ago)    47h
    chi-clickhouse-metrics-default-2-0-0          0/1      CrashLoopBackOff    392 (3m11s ago)  47h

  • Listing the chi-clickhouse pods shows 7 pods present: kubectl get pods -A | grep chi-clickhouse-metrics-default

    NAMESPACE  NAME                                    READY  STATUS   RESTARTS
    tanzusm    chi-clickhouse-metrics-default-0-0-0    1/1    Running       990
    tanzusm    chi-clickhouse-metrics-default-1-0-0    1/1    Running         0
    tanzusm    chi-clickhouse-metrics-default-2-0-0    1/1    Running       976
    tanzusm    chi-clickhouse-metrics-default-3-0-0    1/1    Running         0
    tanzusm    chi-clickhouse-metrics-default-4-0-0    1/1    Running         0
    tanzusm    chi-clickhouse-metrics-default-5-0-0    1/1    Running         0
    tanzusm    chi-clickhouse-metrics-default-6-0-0    1/1    Running         0
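
The checks in the list above can be combined into two small helpers. This is a minimal sketch, assuming the default `kubectl get pods -n tanzusm` column layout (NAME READY STATUS RESTARTS AGE). The function names `unhealthy_pods` and `count_clickhouse_pods` are hypothetical and not part of any Tanzu tooling; they only parse text read from standard input.

```shell
# Hypothetical helpers: they parse `kubectl get pods -n <namespace>` output
# supplied on stdin and do not call the cluster themselves.

# Print "<pod name> <status>" for every pod that is neither Running nor Completed.
unhealthy_pods() {
  awk '$1 != "NAME" && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Print how many chi-clickhouse pods appear in the listing.
count_clickhouse_pods() {
  awk '$1 ~ /^chi-clickhouse-/ { n++ } END { print n + 0 }'
}

# Usage on the registry VM (assumes kubectl is already configured there):
#   kubectl get pods -n tanzusm | unhealthy_pods
#   kubectl get pods -n tanzusm | count_clickhouse_pods
```

A count above 5 from `count_clickhouse_pods`, together with CrashLoopBackOff entries from `unhealthy_pods`, would be consistent with the symptoms described in this article.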

 

Environment

Tanzu Hub 10.4.x

Cause

This failure occurs when the Metric Store job instances (Tanzu Hub tile -> Resource Config) are scaled directly from 3 to 7. That change exceeds the supported maximum of 5 instances and raises the count by more than 2 in a single step, which leads to data inconsistency in the ClickHouse storage. Scaling Metric Store job instances to 7 is not supported at this time. Do not go beyond 5 instances, and increase by no more than 2 instances at a time. For example, 1 instance must be increased to 3 instances before being increased to 5. It is preferred to scale the Metric Store VMs vertically before scaling horizontally.
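
The scaling guidance above can be expressed as a simple pre-check before changing Resource Config. This is a minimal sketch: the function `check_scale` and the variables `MAX_INSTANCES` and `MAX_STEP` are hypothetical names, and the limits (a maximum of 5 instances and an increase of at most 2 per change) are taken from the paragraph above. It does not interact with the Tanzu Hub tile.

```shell
# Hypothetical pre-check for a planned Metric Store instance count change.
# Limits are assumptions copied from the guidance in this article.
MAX_INSTANCES=5
MAX_STEP=2

# check_scale CURRENT TARGET
# Prints "ok" and returns 0 when the increase follows the guidance;
# otherwise prints the reason and returns 1.
# Only increases are checked; the article gives no guidance on scaling down.
check_scale() {
  current=$1
  target=$2
  if [ "$target" -gt "$MAX_INSTANCES" ]; then
    echo "unsupported: target $target exceeds maximum of $MAX_INSTANCES"
    return 1
  fi
  if [ $((target - current)) -gt "$MAX_STEP" ]; then
    echo "unsupported: increase by at most $MAX_STEP instances at a time"
    return 1
  fi
  echo "ok"
}

# Usage:
#   check_scale 3 5   # allowed step
#   check_scale 3 7   # the change that triggers this issue
```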

Resolution

The data inconsistency caused by this unsupported increase in instance count requires rebuilding the ClickHouse store.

 

Use the following KB as a reference to rebuild the ClickHouse store.