danielhanchen
Update 4-bit quant: 8-bit shared_expert/embed/lm_head, bf16 conv1d/gates (better KLD)
6700c3e verified