devpreneur committed on
Commit 2a42b10 · verified · 1 Parent(s): 40e2af7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +14 -5
README.md CHANGED
@@ -16,19 +16,28 @@ pipeline_tag: text-generation
 
 Fine-tuned Qwen3-4B model for terminal command generation. Optimized for LocalTerm app.
 
+> **Note**: HuggingFace shows "0.6B params" - this is incorrect. The actual model has **4 billion parameters** (4-bit quantized). HuggingFace miscalculates param count for MLX quantized safetensors files.
+
 ## Model Details
 
-- **Base Model**: mlx-community/Qwen3-4B-4bit
-- **Fine-tuning**: QLoRA (4-bit quantized, 16 layers)
+- **Base Model**: [mlx-community/Qwen3-4B-4bit](https://huggingface.co/mlx-community/Qwen3-4B-4bit)
+- **Actual Parameters**: **4 billion** (same as base model)
+- **Quantization**: 4-bit (MLX format, ~2.3GB file size)
+- **Fine-tuning**: QLoRA on 16 layers
 - **Training Data**: 388 examples, 74 terminal commands
 - **Accuracy**: 98% on test set (147/150 correct)
-- **Size**: ~2.3GB (4-bit quantized)
-- **Format**: MLX safetensors (merged, no adapter needed)
 
 ## Usage
 
 ### With MLX-LM (Python)
+```python
+from mlx_lm import load, generate
 
+model, tokenizer = load("mlxstudio/qwen3-4b-4bit-terminal")
+prompt = "how to create a git repository"
+response = generate(model, tokenizer, prompt=prompt, max_tokens=100)
+print(response)
+```
 
 ### With LocalTerm (macOS app)
 Model auto-downloads on first run. See [LocalTerm](https://github.com/aleonis/localterm).
@@ -44,7 +53,7 @@ Model auto-downloads on first run. See [LocalTerm](https://github.com/aleonis/lo
 ## Version History
 
 - **v2 (2026-01-22)**: Re-fused model with correct weight format
-- Fixed: prefix issue in LoRA merged weights
+- Fixed: `.linear.` prefix issue in LoRA merged weights
 - Now compatible with mlx-swift-lm
 - **v1 (2026-01-21)**: Initial release (had loading issues in Swift)
 
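
The note added in this commit says the Hub displays "0.6B params" for a 4-billion-parameter model but does not say why. One plausible explanation is that a parameter counter which simply sums tensor elements sees the packed storage, not the logical weights. The sketch below checks that arithmetic. It assumes (none of this is stated in the commit) that 4-bit weights are packed eight per 32-bit word, that each group of 64 weights carries one 16-bit scale and one 16-bit bias, and that every parameter is quantized. The names `stored_elements` and `file_bytes` are illustrative helpers, not part of any library.

```python
# Back-of-the-envelope check of the "0.6B params" display and the ~2.3GB size.
# ASSUMPTIONS (not from the commit): 4-bit values packed into 32-bit words,
# one 16-bit scale and one 16-bit bias per group of 64 weights, and all
# parameters quantized (real checkpoints keep a few small tensors unquantized).

TRUE_PARAMS = 4_000_000_000  # "4 billion", as stated in the README
BITS = 4                     # 4-bit quantization, as stated in the README
GROUP_SIZE = 64              # assumed quantization group size
WORD_BITS = 32               # assumed packing word width


def stored_elements(n_params: int, bits: int = BITS, group_size: int = GROUP_SIZE) -> int:
    """Number of tensor elements a naive counter would see in the packed file."""
    packed_words = n_params * bits // WORD_BITS  # eight 4-bit weights per word
    scales = n_params // group_size              # one scale per group
    biases = n_params // group_size              # one bias per group
    return packed_words + scales + biases


def file_bytes(n_params: int, bits: int = BITS, group_size: int = GROUP_SIZE) -> int:
    """Approximate on-disk size of the quantized weights in bytes."""
    packed = n_params * bits // 8                          # packed weight payload
    scales_and_biases = 2 * (n_params // group_size) * 2   # two 2-byte values per group
    return packed + scales_and_biases


elements = stored_elements(TRUE_PARAMS)
size_gb = file_bytes(TRUE_PARAMS) / 1e9
print(f"apparent element count: {elements / 1e9:.3f}B")  # 0.625B
print(f"approximate file size: {size_gb:.2f} GB")        # 2.25 GB
```

Under these assumptions the apparent count is about 0.6B and the size is about 2.25 GB, which is consistent with both the "0.6B params" display and the "~2.3GB" figure in the README. It does not confirm how the Hub actually computes the number.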