Deploying the bf16 version of the model with vllm:nightly fails with an error
vllm serve \
--served-model-name step3p5-flash \
--tensor-parallel-size 8 \
--enable-expert-parallel \
--disable-cascade-attn \
--reasoning-parser step3p5 \
--enable-auto-tool-choice \
--tool-call-parser step3p5 \
--hf-overrides '{"num_nextn_predict_layers": 1}' \
--speculative_config '{"method": "step3p5_mtp", "num_speculative_tokens": 1}' \
--trust-remote-code
Logs
2026-03-31T15:45:31.601956523+08:00 nm129-a100-80g-39 (Worker_TP0_EP0 pid=1573108)
2026-03-31T15:45:31.792178595+08:00 nm129-a100-80g-39 (Worker_TP0_EP0 pid=1573108) INFO 03-31 15:45:31 [default_loader.py:384] Loading weights took 235.94 seconds
2026-03-31T15:45:31.811215788+08:00 nm129-a100-80g-39 (Worker_TP0_EP0 pid=1573108) INFO 03-31 15:45:31 [gpu_model_runner.py:4747] Loading drafter model...
2026-03-31T15:45:31.974303148+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] WorkerProc failed to start.
2026-03-31T15:45:31.974348980+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] Traceback (most recent call last):
2026-03-31T15:45:31.974360194+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 826, in worker_main
2026-03-31T15:45:31.974367954+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] worker = WorkerProc(*args, **kwargs)
2026-03-31T15:45:31.974374628+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974382095+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:31.974388503+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:31.974417289+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974426640+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 613, in __init__
2026-03-31T15:45:31.974438539+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.worker.load_model()
2026-03-31T15:45:31.974445786+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
2026-03-31T15:45:31.974452717+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
2026-03-31T15:45:31.974459257+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:31.974465528+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:31.974472188+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974478899+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4748, in load_model
2026-03-31T15:45:31.974485379+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.drafter.load_model(self.model)
2026-03-31T15:45:31.974491862+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/eagle.py", line 1255, in load_model
2026-03-31T15:45:31.974498080+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.model = self._get_model()
2026-03-31T15:45:31.974504296+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974510574+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/eagle.py", line 1240, in _get_model
2026-03-31T15:45:31.974516733+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] model = get_model(
2026-03-31T15:45:31.974523081+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^
2026-03-31T15:45:31.974529945+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/__init__.py", line 138, in get_model
2026-03-31T15:45:31.974536170+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] return loader.load_model(
2026-03-31T15:45:31.974542389+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974549005+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:31.974555212+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:31.974561342+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974598169+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
2026-03-31T15:45:31.974605980+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] model = initialize_model(
2026-03-31T15:45:31.974612162+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974618414+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:31.974624852+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:31.974631067+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974637639+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
2026-03-31T15:45:31.974644827+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] model = model_class(vllm_config=vllm_config, prefix=prefix)
2026-03-31T15:45:31.974651122+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974657707+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 147, in __init__
2026-03-31T15:45:31.974664148+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.model = Step3p5AMultiTokenPredictor(
2026-03-31T15:45:31.974670343+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974685167+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 94, in __init__
2026-03-31T15:45:31.974691983+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] str(idx): Step3p5AMultiTokenPredictorLayer(
2026-03-31T15:45:31.974698358+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974704958+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 56, in __init__
2026-03-31T15:45:31.974711171+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] self.mtp_block = Step3p5DecoderLayer(
2026-03-31T15:45:31.974717551+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:31.974723749+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5.py", line 449, in __init__
2026-03-31T15:45:31.974730290+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] and config.layer_types[layer_idx]
2026-03-31T15:45:31.974736704+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
2026-03-31T15:45:31.974745710+08:00 nm129-a100-80g-39 (Worker_TP7_EP7 pid=1573115) ERROR 03-31 15:45:31 [multiproc_executor.py:857] IndexError: list index out of range
2026-03-31T15:45:31.991360914+08:00 nm129-a100-80g-39 P2P/CUMEM/read
2026-03-31T15:45:31.991386929+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 17/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991393133+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 18/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991397920+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 19/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991402151+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 20/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991406410+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 21/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991410625+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 22/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991414746+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Channel 23/0 : 7[7] -> 0[0] via P2P/CUMEM/read
2026-03-31T15:45:31.991419520+08:00 nm129-a100-80g-39 NH-DC-NM129-B03-12U-GPU-39:1573115:1581790 [7] NCCL INFO Connected all rings, use ring PXN 0 GDR 1
2026-03-31T15:45:32.019203259+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] WorkerProc failed to start.
2026-03-31T15:45:32.019243448+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] Traceback (most recent call last):
2026-03-31T15:45:32.019253769+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 826, in worker_main
2026-03-31T15:45:32.019261878+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] worker = WorkerProc(*args, **kwargs)
2026-03-31T15:45:32.019267458+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019274849+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:32.019279647+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:32.019285695+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019292778+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 613, in __init__
2026-03-31T15:45:32.019298024+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.worker.load_model()
2026-03-31T15:45:32.019302719+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 323, in load_model
2026-03-31T15:45:32.019309139+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.model_runner.load_model(load_dummy_weights=load_dummy_weights)
2026-03-31T15:45:32.019313960+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:32.019318967+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:32.019323488+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019328194+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 4748, in load_model
2026-03-31T15:45:32.019351590+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.drafter.load_model(self.model)
2026-03-31T15:45:32.019356452+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/eagle.py", line 1255, in load_model
2026-03-31T15:45:32.019361333+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.model = self._get_model()
2026-03-31T15:45:32.019365858+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019370609+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/spec_decode/eagle.py", line 1240, in _get_model
2026-03-31T15:45:32.019375333+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] model = get_model(
2026-03-31T15:45:32.019379787+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^
2026-03-31T15:45:32.019385976+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/__init__.py", line 138, in get_model
2026-03-31T15:45:32.019392441+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] return loader.load_model(
2026-03-31T15:45:32.019397175+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019401808+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:32.019406856+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:32.019411427+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019430101+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 55, in load_model
2026-03-31T15:45:32.019435032+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] model = initialize_model(
2026-03-31T15:45:32.019439772+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019444389+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper
2026-03-31T15:45:32.019449638+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] return func(*args, **kwargs)
2026-03-31T15:45:32.019454086+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019458788+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 57, in initialize_model
2026-03-31T15:45:32.019466602+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] model = model_class(vllm_config=vllm_config, prefix=prefix)
2026-03-31T15:45:32.019471354+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019480659+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 147, in __init__
2026-03-31T15:45:32.019485522+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.model = Step3p5AMultiTokenPredictor(
2026-03-31T15:45:32.019490112+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019494649+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 94, in __init__
2026-03-31T15:45:32.019499273+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] str(idx): Step3p5AMultiTokenPredictorLayer(
2026-03-31T15:45:32.019503771+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019508585+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5_mtp.py", line 56, in __init__
2026-03-31T15:45:32.019513116+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] self.mtp_block = Step3p5DecoderLayer(
2026-03-31T15:45:32.019517813+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ^^^^^^^^^^^^^^^^^^^^
2026-03-31T15:45:32.019522613+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/step3p5.py", line 449, in __init__
2026-03-31T15:45:32.019527150+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] and config.layer_types[layer_idx]
2026-03-31T15:45:32.019531634+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
2026-03-31T15:45:32.019536384+08:00 nm129-a100-80g-39 (Worker_TP4_EP4 pid=1573112) ERROR 03-31 15:45:32 [multiproc_executor.py:857] IndexError: list index out of range