HyperLLM-4b v0.6

A specialized 4B parameter language model fine-tuned for Hyperliquid perpetual DEX trading assistance. Built on Qwen3-4B-Instruct using LoRA + DPO training.

Model Description

HyperLLM is designed to assist with:

  • Position sizing calculations - Risk-based position sizing with proper decimal handling
  • API structure understanding - Hyperliquid exchange API request/response formats
  • Trading mechanics - Perpetual futures concepts, margin modes, order types
  • Parameter validation - Validating trade parameters against exchange constraints
  • Edge case handling - Boundary conditions and unusual trading scenarios

Version History

v0.6 (Current - March 18, 2026)

Training Pipeline: SFT (6,700 examples) + DPO (1,800 preference pairs)

v0.6 is a recovery release that fixes evaluation extraction bugs and includes targeted training improvements.

Key Changes from v0.5:

| Change | v0.5 | v0.6 | Impact |
|---|---|---|---|
| SFT Dataset Size | 14,260 | ~6,700 | Less dilution, more focused |
| General Instructions | 5,711 | 1,200 | Reduced interference |
| Adversarial DPO Pairs | Diluted 2:1 | Doubled (400) | Better % handling |
| Market Knowledge | Added | Removed | Cleaner, more precise |
| Answer Format | None | Enforced | Better extraction |

Major Improvements over v0.4:

| Category | v0.4 | v0.6 | Change |
|---|---|---|---|
| Overall | 75.0% | 90.2% | +15.2 pp |
| Adversarial % | 71.0% | 93.0% | +22.0 pp |
| Multi-step | 32.0% | 92.3% | +60.3 pp |
| Position Sizing | 81.7% | 98.3% | +16.6 pp |
| Edge Cases | 90.0% | 95.0% | +5.0 pp |
| General Capability | 96.4% | 98.2% | +1.8 pp |
| Trading Mechanics | 80.0% | 90.0% | +10.0 pp |
| Parameter Validation | 100% | 100% | Maintained |

Note: v0.6 results reflect corrected evaluation scoring after fixing an extraction bug that was grabbing question values instead of computed answers.

v0.5 (March 16, 2026)

Training Pipeline: SFT (14,260 examples) + DPO (3,057 pairs)

Issues: Dataset dilution caused a 4.4 pp regression from v0.4. Doubling the general-instruction set interfered with the specialized training.

v0.4 (March 11, 2026)

Training Pipeline: SFT (6,782 examples) + DPO (1,400 pairs)

Established baseline with strong adversarial percentage handling (71%) and 100% parameter validation.

v0.3 (March 6, 2026)

Training Pipeline: SFT (7,028 examples) + DPO (1,400 pairs)

First stable release with comprehensive evaluation across 9 categories.

Evaluation Results (v0.6)

Evaluated on 337 questions across 9 categories:

| Category | Questions | Score | Accuracy |
|---|---|---|---|
| Parameter Validation | 15 | 15.0/15 | 100% |
| Position Sizing Math | 60 | 59.0/60 | 98.3% |
| General Capability | 55 | 54.0/55 | 98.2% |
| Edge Cases | 40 | 38.0/40 | 95.0% |
| Adversarial Percentage | 100 | 93.0/100 | 93.0% |
| Multi-step Reasoning | 30 | 27.7/30 | 92.3% |
| Trading Mechanics | 10 | 9.0/10 | 90.0% |
| Factual | 15 | 5.0/15 | 33.3% |
| API Structure | 12 | 3.3/12 | 27.5% |
| Overall | 337 | 304.0/337 | 90.2% |

Evaluation Methodology

v0.6 introduces a robust evaluation system with question-aware extraction:

  1. Question Value Exclusion - Parser identifies all numeric values in the question (dollar amounts, percentages, leverage) and excludes them from answer extraction
  2. Multi-Stage Extraction - Prioritizes JSON blocks > Final Answer sections > Explicit markers > Context-aware patterns
  3. Confidence Scoring - Each extraction includes confidence scores for quality assurance

This fixes the ~17% false-negative rate (53 mis-extracted answers) that affected earlier evaluations.
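The question-value exclusion step can be sketched as follows. This is an illustrative reconstruction, not the actual evaluation code; the regex and function names are assumptions:

```python
import re

# Matches integers and decimals, allowing thousands separators (e.g. "10,000", "98.3")
NUM_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")

def numeric_values(text):
    """All numbers in the text, as floats, with commas stripped."""
    return [float(m.replace(",", "")) for m in NUM_RE.findall(text)]

def extract_answer(question, response):
    """Last number in the response that is not a value copied from the question."""
    question_vals = set(numeric_values(question))
    candidates = [v for v in numeric_values(response) if v not in question_vals]
    return candidates[-1] if candidates else None
```

For example, given the question "entry $50, stop $48, risk 2%", intermediate values like $200 and the final 100 both survive the filter, and the last one is taken as the answer; a response that merely echoes a question value extracts nothing.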

Training Configuration

LoRA Parameters

```python
{
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "use_rslora": True,
    "use_dora": True
}
```
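With `peft`, these parameters map onto a `LoraConfig` roughly as follows. This is a sketch; the actual training script is not published:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,   # rank-stabilized LoRA scaling (alpha / sqrt(r))
    use_dora=True,     # weight-decomposed low-rank adaptation
    task_type="CAUSAL_LM",
)
```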

SFT Hyperparameters

```python
{
    "learning_rate": 1e-5,
    "epochs": 5,  # With early stopping
    "batch_size": 4,
    "gradient_accumulation_steps": 2,
    "warmup_ratio": 0.10,
    "max_length": 4096
}
```

DPO Hyperparameters

```python
{
    "beta": 0.05,
    "learning_rate": 5e-7,
    "epochs": 2,
    "batch_size": 4,
    "gradient_accumulation_steps": 2
}
```
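In TRL these would translate to a `DPOConfig` along these lines. This is a sketch under the assumption of a recent TRL API; argument names may differ across versions, and the output path is a placeholder:

```python
from trl import DPOConfig

dpo_config = DPOConfig(
    beta=0.05,                      # strength of the KL penalty against the reference model
    learning_rate=5e-7,
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    output_dir="hyperllm-dpo",      # placeholder path
)
```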

Training Data Distribution

SFT (~6,700 examples):

| Category | Examples | % |
|---|---|---|
| General Instruction | 1,200 | 17.9% |
| Position Sizing | 800 | 11.9% |
| Parameter Validation | 700 | 10.4% |
| Edge Cases | 600 | 9.0% |
| API Structure (Enhanced) | 500 | 7.5% |
| Adversarial Percentages | 500 | 7.5% |
| Answer Format | 400 | 6.0% |
| Trading Mechanics | 400 | 6.0% |
| Leverage/Percentage | 300 | 4.5% |
| Factual | 300 | 4.5% |
| Cod3x Reasoning | 300 | 4.5% |
| Other | 700 | 10.4% |

DPO (~1,800 pairs):

| Failure Mode | Pairs | % |
|---|---|---|
| Percentage Confusion | 400 | 22.2% |
| API Structure | 250 | 13.9% |
| Excessive Leverage | 250 | 13.9% |
| Final Answer | 200 | 11.1% |
| Risk Violation | 200 | 11.1% |
| Leverage Calculation | 150 | 8.3% |
| Position Sizing | 150 | 8.3% |
| Policy Bypass | 100 | 5.6% |
| Uncertainty Caution | 100 | 5.6% |

Usage

With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

# Generate
messages = [{"role": "user", "content": "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With 4-bit Quantization (Low VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
```

Example Use Cases

Position Sizing

```
User: I have a $50,000 account and want to risk 2% on an ETH trade.
      Entry at $3,450, stop loss at $3,400. What's my position size?

Model: Risk amount = $50,000 * 2% = $1,000
       Stop distance = $3,450 - $3,400 = $50
       Position size = $1,000 / $50 = 20 ETH

       **Final Answer: 20 ETH**
```
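The arithmetic above generalizes to a small helper. This is an illustrative sketch, not part of the model or any Hyperliquid SDK:

```python
def position_size(account: float, risk_pct: float, entry: float, stop: float) -> float:
    """Units to trade so that hitting the stop loses exactly risk_pct of the account."""
    risk_amount = account * risk_pct / 100   # dollars at risk
    stop_distance = abs(entry - stop)        # adverse price move per unit
    if stop_distance == 0:
        raise ValueError("entry and stop must differ")
    return risk_amount / stop_distance

# $50,000 account, 2% risk, entry $3,450, stop $3,400 -> 20.0 (ETH)
```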

Leverage Calculation

```
User: Account $10,000. Position $50,000. What's the leverage?

Model: Leverage = Position Value / Account Balance
       Leverage = $50,000 / $10,000 = 5x

       **Final Answer: 5x**
```

Parameter Validation

```
User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2).
       Valid sizes would be 1.23 or 1.24.
```
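A decimal-count check like the one the model performs can be sketched as follows. This is illustrative; in practice `szDecimals` values come from Hyperliquid's asset metadata:

```python
from decimal import Decimal

def is_valid_size(size: str, sz_decimals: int) -> bool:
    """True if the order size has at most sz_decimals decimal places."""
    # normalize() strips trailing zeros, so "1.20" counts as 1 decimal place
    exponent = Decimal(size).normalize().as_tuple().exponent
    return -exponent <= sz_decimals

# is_valid_size("1.234", 2) -> False; is_valid_size("1.23", 2) -> True
```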

Adversarial Percentage Handling

```
User: Increase my position by 50%

Model: I need clarification. "Increase by 50%" could mean:
       1. Add 50% of current position (1.5x total)
       2. Increase to 50% of account (absolute)
       Which do you mean?
```
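The two readings differ materially, which is why the model asks rather than guesses. A quick comparison with illustrative numbers:

```python
def increase_relative(position: float, pct: float) -> float:
    """'Increase by pct%' read as a relative change: add pct of the current position."""
    return position * (1 + pct / 100)

def increase_absolute(account: float, pct: float) -> float:
    """'Increase to pct%' read as an absolute target: pct of the account value."""
    return account * pct / 100

# A $10,000 position in a $100,000 account under "increase by 50%":
# relative reading -> $15,000; absolute reading -> $50,000
```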

Limitations

  • Factual Knowledge (33.3% accuracy): recall of specific Hyperliquid URLs and fee structures needs improvement
  • API Structure (27.5% accuracy): reproducing exact JSON field names is still challenging

Hardware Requirements

| Mode | VRAM | Notes |
|---|---|---|
| bfloat16 | ~10GB | Unquantized inference |
| 4-bit | ~4GB | NF4 quantized inference |
| 8-bit | ~6GB | INT8 quantization |

Training Hardware

  • Hardware: NVIDIA A100 80GB SXM
  • SFT Duration: ~25 minutes
  • DPO Duration: ~20 minutes
  • Total Cost: ~$2.00 (RunPod)

Framework Versions

  • PEFT: 0.18.1
  • TRL: 0.29.0
  • Transformers: 5.2.0
  • PyTorch: 2.10.0

License

Apache 2.0

Citation

```bibtex
@misc{hyperllm2026,
  title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}
```