DBLoader failing to complete init-db due to Clickhouse suspicious broken parts

Article ID: 430610

Products

WatchTower

Issue/Introduction

The following error appears when starting the data-insights dbloader pod after upgrading to 1.3.1:

Code: 159. DB::Exception: Received from clickhouse:9000. DB::Exception: Distributed DDL task /clickhouse/task_queue/ddl/query-0000000869 is not finished on 1 of 4 hosts (0 of them are currently executing the task, 0 are inactive). They are going to execute the query in background. Was waiting for 180.711875925 seconds, which is longer than distributed_ddl_task_timeout. (TIMEOUT_EXCEEDED)
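The error names the distributed DDL queue entry that one host has not finished. As a rough diagnostic sketch, the entry name can be pulled out of the message and looked up in ClickHouse's system.distributed_ddl_queue table to see which host is still pending. The helper names ddl_entry_from_error and ddl_queue_query are hypothetical, and the table is assumed to be available in the ClickHouse version shipped with WatchTower.

```shell
# Hypothetical helper: extract the queue entry name (e.g. query-0000000869)
# from a TIMEOUT_EXCEEDED error message passed as the first argument.
ddl_entry_from_error() {
  printf '%s\n' "$1" | sed -n 's#.*/ddl/\(query-[0-9][0-9]*\).*#\1#p'
}

# Hypothetical helper: build a query that lists the per-host rows for that entry.
# SELECT * is used because column names in this table differ between versions.
ddl_queue_query() {
  printf "SELECT * FROM system.distributed_ddl_queue WHERE entry = '%s'" "$1"
}

# Example invocation (not executed here; <clickhouse_host> is a placeholder):
# clickhouse-client --host=<clickhouse_host> --port=9000 --user default \
#   --password "$CLICKHOUSE_SUPERUSER_PWD" --format Vertical \
#   --query "$(ddl_queue_query "$(ddl_entry_from_error "$ERROR_TEXT")")"
```

The row whose status is not finished points at the host that needs attention before DBLoader is retried.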

The init-db logs were submitted first; a full dump was to be uploaded once it completed.

Resolution

Most ClickHouse-related issues (the dbloader init container fails, or the query service or dbloader cannot connect to ClickHouse) can be resolved by restarting DBLoader.

If the init container still fails on the second run, restart the ClickHouse servers.
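A minimal sketch of these two restart steps, assuming the pods are managed by controllers that recreate them after deletion. The ClickHouse pod names follow the clickhouse-shard<N>-<M> pattern implied by the host names used elsewhere in this article; the DBLOADER_SELECTOR label and the function names are hypothetical and must be adapted to the actual deployment. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
# Defaults mirror the values used in this article; NAMESPACE must be set by the caller.
SHARDS="${SHARDS:-1}"
REPLICAS="${REPLICAS:-2}"
DBLOADER_SELECTOR="${DBLOADER_SELECTOR:-app=dbloader}"  # hypothetical label selector
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: restart DBLoader by deleting its pod(s).
restart_dbloader() {
  : "${NAMESPACE:?set NAMESPACE first}"
  run kubectl delete pod -n "$NAMESPACE" -l "$DBLOADER_SELECTOR"
}

# Step 2: restart every ClickHouse server pod, one shard/replica at a time.
restart_clickhouse() {
  : "${NAMESPACE:?set NAMESPACE first}"
  for shard in $(seq 0 $(( SHARDS - 1 ))); do
    for server in $(seq 0 $(( REPLICAS - 1 ))); do
      run kubectl delete pod -n "$NAMESPACE" "clickhouse-shard$shard-$server"
    done
  done
}
```

Run restart_dbloader first and check the init container; call restart_clickhouse only if it fails again, then set DRY_RUN=0 once the printed commands look right.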

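If the issue persists, run the Kubernetes job modification below, updated for your namespace, image, and number of shards and replicas; this should resolve it. The script runs SYSTEM RESTORE REPLICA against each WatchTower table on every ClickHouse replica and then prints the table's row count.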

 
export USER=default
export SHARDS=1
export REPLICAS=2
export NAMESPACE=<your_namespace>   # placeholder: replace with the actual namespace

# CLICKHOUSE_SUPERUSER_PWD is expected to be set already in the job's environment.
# Tables in the watchtower database whose replicas are restored.
TABLES="metrics_gauge_v0 metrics_daily_v0 resources_v0 metrics_hourly_v0 mlhighway_v0 mlalerts_v0"

for shard in $(seq 0 $(( SHARDS - 1 ))); do
  for server in $(seq 0 $(( REPLICAS - 1 ))); do
    echo
    echo "running commands on shard $shard server $server !"
    echo

    host="clickhouse-shard$shard-$server.clickhouse-headless.$NAMESPACE.svc.cluster.local"

    # "|| true" keeps the loop going when a table fails to restore on this replica.
    for table in $TABLES; do
      clickhouse-client --host="$host" --port=9000 --user "$USER" --password "$CLICKHOUSE_SUPERUSER_PWD" --multiquery --query "SYSTEM RESTORE REPLICA watchtower.$table; select count() from watchtower.$table;" || true
    done
  done
done
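
After the job finishes, it can help to confirm that no replica is still read-only. The following is a rough sketch, not part of the original job: it reuses the SHARDS, REPLICAS, NAMESPACE, USER, and CLICKHOUSE_SUPERUSER_PWD variables from the script above and queries ClickHouse's system.replicas table. The function names clickhouse_hosts and check_replicas are hypothetical.

```shell
# Hypothetical helper: print one ClickHouse host name per shard/replica,
# using the same naming pattern as the restore script.
clickhouse_hosts() {
  for shard in $(seq 0 $(( SHARDS - 1 ))); do
    for server in $(seq 0 $(( REPLICAS - 1 ))); do
      echo "clickhouse-shard$shard-$server.clickhouse-headless.$NAMESPACE.svc.cluster.local"
    done
  done
}

# Replication state of every replicated table in the watchtower database.
REPLICA_STATUS_QUERY="SELECT table, is_readonly, is_session_expired FROM system.replicas WHERE database = 'watchtower' ORDER BY table"

# Hypothetical helper: run the status query on every host.
# "|| true" keeps going if one host is unreachable.
check_replicas() {
  for host in $(clickhouse_hosts); do
    echo "== $host"
    clickhouse-client --host="$host" --port=9000 --user "$USER" \
      --password "$CLICKHOUSE_SUPERUSER_PWD" --query "$REPLICA_STATUS_QUERY" || true
  done
}
```

Calling check_replicas prints one block per host; a table that still reports is_readonly = 1 was not restored on that replica, and the DBLoader init container is likely to keep failing until it is.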