During GenAI deployment, the vllm-worker job is reported as failed in the post-start phase:
Task 12345 | 03:11:00 | L starting jobs: ###### (0) (canary)
Task 12345 | 03:11:05 | L executing post-start: ###### (0) (canary) (00:33:22)
L Error: Action Failed get_task: Task "#### result: 1 of 2 post-start scripts failed. Failed Jobs: vllm-worker. Successful Jobs: bosh-dns.
Task 12345 | 03:15:07 | Error: Action Failed get_task: Task "#### result: 1 of 2 post-start scripts failed. Failed Jobs: vllm-worker. Successful Jobs: bosh-dns.
According to vllm-worker.stderr.log, the process fails to start with the following error:
RuntimeError: Failed to infer device type, please set the environment variable `VLLM_LOGGING_LEVEL=DEBUG` to turn on verbose logging to help debug the issue.
Meanwhile, post-start.stdout.log of vllm-worker reports "No platform detected":
INFO 10-25 03:12:35 [__init__.py:220] No platform detected, vLLM is running on UnspecifiedPlatform
WARNING 10-25 03:12:35 [_custom_ops.py:20] Failed to import from vllm._C with ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')
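To check whether another failed deployment shows the same symptoms, the log excerpts above can be searched for their distinctive messages. The following is a minimal sketch, not an official tool: the function name `match_signatures` and the `SIGNATURES` table are hypothetical, and the log file paths are assumed to be passed on the command line after the logs have been collected from the VM.

```python
import sys

# Message fragments copied from the log excerpts in this article.
# The dictionary keys are illustrative labels, not vLLM or BOSH identifiers.
SIGNATURES = {
    "device_type": "Failed to infer device type",
    "no_platform": "No platform detected",
    "missing_libcuda": "libcuda.so.1: cannot open shared object file",
}


def match_signatures(log_text):
    """Return a dict mapping each label to True if its message appears in log_text."""
    return {label: needle in log_text for label, needle in SIGNATURES.items()}


if __name__ == "__main__":
    # Usage: python check_logs.py vllm-worker.stderr.log post-start.stdout.log
    combined = ""
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            combined += handle.read()
    for label, found in match_signatures(combined).items():
        print(f"{label}: {'found' if found else 'not found'}")
```

If all three messages are found, the failure is likely the one described in this article.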
GenAI on Tanzu Platform
CPU-based inference is not supported for vLLM models in the Tanzu GenAI tile. This limitation is documented under "Model Config" in Comparative Analysis: vLLM versus Ollama for GenAI Inference. If a vLLM model is configured with the CPU processing technology, vLLM cannot load the CUDA driver library (libcuda.so.1) and detects no platform, so vllm-worker fails to start with the errors shown above.
To resolve the problem, configure vLLM models to use non-CPU-based inference.
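The warning in post-start.stdout.log shows that `libcuda.so.1` could not be opened. As a rough check after reconfiguring, the sketch below tests whether that library can be loaded on the VM. It assumes an NVIDIA GPU and that Python is available on the VM; the function name `cuda_driver_loadable` is hypothetical, and a successful load does not by itself prove that vLLM will start.

```python
import ctypes


def cuda_driver_loadable(lib_name="libcuda.so.1"):
    """Return True if the dynamic loader can open lib_name, False otherwise."""
    try:
        # ctypes.CDLL raises OSError when the shared object cannot be opened,
        # which is the same condition reported in the vLLM warning.
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if cuda_driver_loadable():
        print("libcuda.so.1 can be loaded")
    else:
        print("libcuda.so.1 cannot be loaded; no CUDA driver is visible to this process")
```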