Description

NVFP4 quantization of Qwen/Qwen3-Coder-30B-A3B-Instruct, produced with NVIDIA's TensorRT Model Optimizer. The KV cache is quantized to FP8 for compatibility with inference backends.
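For reference, below is a minimal sketch of how such a checkpoint is typically produced with TensorRT Model Optimizer's (modelopt) post-training quantization API. It is not the exact script used for this checkpoint: the config names (`NVFP4_DEFAULT_CFG`, `FP8_KV_CFG`), the config-merging step, and the calibration prompts are assumptions to verify against your installed modelopt version.

```python
# Sketch: NVFP4 weight/activation PTQ with an FP8 KV cache using
# NVIDIA TensorRT Model Optimizer (modelopt). Config names and the
# export helper follow modelopt's documented API but should be
# verified against the installed version -- this is not the exact
# script used to produce this checkpoint.
import copy

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

import modelopt.torch.quantization as mtq
from modelopt.torch.export import export_hf_checkpoint

model_id = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Start from the NVFP4 default config and merge in FP8 KV-cache
# quantizer settings (assumed names: NVFP4_DEFAULT_CFG, FP8_KV_CFG).
quant_cfg = copy.deepcopy(mtq.NVFP4_DEFAULT_CFG)
quant_cfg["quant_cfg"].update(mtq.FP8_KV_CFG["quant_cfg"])

def forward_loop(model):
    # Calibration pass: a handful of representative prompts; a real
    # run would use a few hundred samples from a calibration set.
    prompts = [
        "def quicksort(arr):",
        "Explain the difference between a list and a tuple in Python.",
    ]
    with torch.no_grad():
        for p in prompts:
            inputs = tokenizer(p, return_tensors="pt").to(model.device)
            model(**inputs)

model = mtq.quantize(model, quant_cfg, forward_loop)

# Export a Hugging Face-style quantized checkpoint (safetensors).
export_hf_checkpoint(model, export_dir="Qwen3-Coder-30B-A3B-Instruct-NVFP4")
```

The exported directory can then be loaded by inference backends with NVFP4 support, such as recent versions of TensorRT-LLM or vLLM.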

Format: Safetensors
Model size: 16B params
Tensor types: BF16, F8_E4M3, U8

Model tree for rahtml/Qwen3-Coder-30B-A3B-Instruct-NVFP4

Quantized from Qwen/Qwen3-Coder-30B-A3B-Instruct (one of 113 quantized variants of the base model).