HyperLLM-4b v0.6

A specialized 4B parameter language model fine-tuned for Hyperliquid perpetual DEX trading assistance. Built on Qwen3-4B-Instruct using LoRA + DPO training.

Model Description

HyperLLM is designed to assist with:

  • Position sizing calculations - Risk-based position sizing with proper decimal handling
  • API structure understanding - Hyperliquid exchange API request/response formats
  • Trading mechanics - Perpetual futures concepts, margin modes, order types
  • Parameter validation - Validating trade parameters against exchange constraints
  • Edge case handling - Boundary conditions and unusual trading scenarios

Version History

v0.6 (Current - March 18, 2026)

Training Pipeline: SFT (6,700 examples) + DPO (1,800 preference pairs)

v0.6 is a recovery release that fixes evaluation extraction bugs and includes targeted training improvements.

Key Changes from v0.5:

| Change | v0.5 | v0.6 | Impact |
|---|---|---|---|
| SFT Dataset Size | 14,260 | ~6,700 | Less dilution, more focused |
| General Instructions | 5,711 | 1,200 | Reduced interference |
| Adversarial DPO Pairs | Diluted 2:1 | Doubled (400) | Better % handling |
| Market Knowledge | Added | Removed | Cleaner, more precise |
| Answer Format | None | Enforced | Better extraction |

Major Improvements over v0.4:

| Category | v0.4 | v0.6 | Change |
|---|---|---|---|
| Overall | 75.0% | 90.2% | +15.2 pp |
| Adversarial % | 71.0% | 93.0% | +22.0 pp |
| Multi-step | 32.0% | 92.3% | +60.3 pp |
| Position Sizing | 81.7% | 98.3% | +16.6 pp |
| Edge Cases | 90.0% | 95.0% | +5.0 pp |
| General Capability | 96.4% | 98.2% | +1.8 pp |
| Trading Mechanics | 80.0% | 90.0% | +10.0 pp |
| Parameter Validation | 100% | 100% | Maintained |

Note: v0.6 results reflect corrected evaluation scoring after fixing an extraction bug that was grabbing question values instead of computed answers.

v0.5 (March 16, 2026)

Training Pipeline: SFT (14,260 examples) + DPO (3,057 pairs)

Issues: Dataset dilution caused a 4.4 pp regression from v0.4. Doubling the general-instruction set interfered with the specialized training.

v0.4 (March 11, 2026)

Training Pipeline: SFT (6,782 examples) + DPO (1,400 pairs)

Established baseline with strong adversarial percentage handling (71%) and 100% parameter validation.

v0.3 (March 6, 2026)

Training Pipeline: SFT (7,028 examples) + DPO (1,400 pairs)

First stable release with comprehensive evaluation across 9 categories.

Evaluation Results (v0.6)

Evaluated on 337 questions across 9 categories:

| Category | Questions | Score | Accuracy |
|---|---|---|---|
| Parameter Validation | 15 | 15.0/15 | 100% |
| Position Sizing Math | 60 | 59.0/60 | 98.3% |
| General Capability | 55 | 54.0/55 | 98.2% |
| Edge Cases | 40 | 38.0/40 | 95.0% |
| Adversarial Percentage | 100 | 93.0/100 | 93.0% |
| Multi-step Reasoning | 30 | 27.7/30 | 92.3% |
| Trading Mechanics | 10 | 9.0/10 | 90.0% |
| Factual | 15 | 5.0/15 | 33.3% |
| API Structure | 12 | 3.3/12 | 27.5% |
| Overall | 337 | 304.0/337 | 90.2% |

Evaluation Methodology

v0.6 introduces a robust evaluation system with question-aware extraction:

  1. Question Value Exclusion - Parser identifies all numeric values in the question (dollar amounts, percentages, leverage) and excludes them from answer extraction
  2. Multi-Stage Extraction - Prioritizes JSON blocks > Final Answer sections > Explicit markers > Context-aware patterns
  3. Confidence Scoring - Each extraction includes confidence scores for quality assurance

This fixes the ~17% false-negative rate (53 mis-extracted answers) that affected earlier evaluations.
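The question-value exclusion step can be sketched as follows. This is an illustrative reconstruction, not the actual evaluation code; the regex and function names are assumptions:

```python
import re

# Matches integers and decimals, allowing thousands separators (e.g. "10,000", "98.3")
NUM_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")

def numeric_values(text):
    """All numbers in the text, as floats, with commas stripped."""
    return [float(m.replace(",", "")) for m in NUM_RE.findall(text)]

def extract_answer(question, response):
    """Last number in the response that is not a value copied from the question."""
    question_vals = set(numeric_values(question))
    candidates = [v for v in numeric_values(response) if v not in question_vals]
    return candidates[-1] if candidates else None
```

For example, given the question "entry $50, stop $48, risk 2%", intermediate values like $200 and the final 100 both survive the filter, and the last one is taken as the answer; a response that merely echoes a question value extracts nothing.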

Training Configuration

LoRA Parameters

```python
{
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    "use_rslora": True,
    "use_dora": True
}
```
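With `peft`, these parameters map onto a `LoraConfig` roughly as follows. This is a sketch; the actual training script is not published:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,   # rank-stabilized LoRA scaling (alpha / sqrt(r))
    use_dora=True,     # weight-decomposed low-rank adaptation
    task_type="CAUSAL_LM",
)
```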

SFT Hyperparameters

```python
{
    "learning_rate": 1e-5,
    "epochs": 5,  # With early stopping
    "batch_size": 4,
    "gradient_accumulation_steps": 2,
    "warmup_ratio": 0.10,
    "max_length": 4096
}
```

DPO Hyperparameters

```python
{
    "beta": 0.05,
    "learning_rate": 5e-7,
    "epochs": 2,
    "batch_size": 4,
    "gradient_accumulation_steps": 2
}
```
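In TRL these would translate to a `DPOConfig` along these lines. This is a sketch under the assumption of a recent TRL API; argument names may differ across versions, and the output path is a placeholder:

```python
from trl import DPOConfig

dpo_config = DPOConfig(
    beta=0.05,                      # strength of the KL penalty against the reference model
    learning_rate=5e-7,
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    output_dir="hyperllm-dpo",      # placeholder path
)
```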

Training Data Distribution

SFT (~6,700 examples):

| Category | Examples | % |
|---|---|---|
| General Instruction | 1,200 | 17.9% |
| Position Sizing | 800 | 11.9% |
| Parameter Validation | 700 | 10.4% |
| Edge Cases | 600 | 9.0% |
| API Structure (Enhanced) | 500 | 7.5% |
| Adversarial Percentages | 500 | 7.5% |
| Answer Format | 400 | 6.0% |
| Trading Mechanics | 400 | 6.0% |
| Leverage/Percentage | 300 | 4.5% |
| Factual | 300 | 4.5% |
| Cod3x Reasoning | 300 | 4.5% |
| Other | 700 | 10.4% |

DPO (~1,800 pairs):

| Failure Mode | Pairs | % |
|---|---|---|
| Percentage Confusion | 400 | 22.2% |
| API Structure | 250 | 13.9% |
| Excessive Leverage | 250 | 13.9% |
| Final Answer | 200 | 11.1% |
| Risk Violation | 200 | 11.1% |
| Leverage Calculation | 150 | 8.3% |
| Position Sizing | 150 | 8.3% |
| Policy Bypass | 100 | 5.6% |
| Uncertainty Caution | 100 | 5.6% |

Usage

With Transformers + PEFT

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")

# Generate
messages = [{"role": "user", "content": "Calculate position size for $10,000 account, 2% risk, entry $50, stop loss $48"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With 4-bit Quantization (Low VRAM)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507",
    quantization_config=bnb_config,
    device_map="auto",
)

model = PeftModel.from_pretrained(base_model, "UVLabs/HyperLLM-4b")
tokenizer = AutoTokenizer.from_pretrained("UVLabs/HyperLLM-4b")
```

Example Use Cases

Position Sizing

```
User: I have a $50,000 account and want to risk 2% on an ETH trade.
      Entry at $3,450, stop loss at $3,400. What's my position size?

Model: Risk amount = $50,000 * 2% = $1,000
       Stop distance = $3,450 - $3,400 = $50
       Position size = $1,000 / $50 = 20 ETH

       **Final Answer: 20 ETH**
```
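The arithmetic above generalizes to a small helper. This is an illustrative sketch, not part of the model or any Hyperliquid SDK:

```python
def position_size(account: float, risk_pct: float, entry: float, stop: float) -> float:
    """Units to trade so that hitting the stop loses exactly risk_pct of the account."""
    risk_amount = account * risk_pct / 100   # dollars at risk
    stop_distance = abs(entry - stop)        # adverse price move per unit
    if stop_distance == 0:
        raise ValueError("entry and stop must differ")
    return risk_amount / stop_distance

# $50,000 account, 2% risk, entry $3,450, stop $3,400 -> 20.0 (ETH)
```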

Leverage Calculation

```
User: Account $10,000. Position $50,000. What's the leverage?

Model: Leverage = Position Value / Account Balance
       Leverage = $50,000 / $10,000 = 5x

       **Final Answer: 5x**
```

Parameter Validation

```
User: Is a position size of 1.234 valid for ETH (szDecimals=2)?

Model: No, 1.234 has 3 decimal places but ETH only allows 2 (szDecimals=2).
       Valid sizes would be 1.23 or 1.24.
```
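A decimal-count check like the one the model performs can be sketched as follows. This is illustrative; in practice `szDecimals` values come from Hyperliquid's asset metadata:

```python
from decimal import Decimal

def is_valid_size(size: str, sz_decimals: int) -> bool:
    """True if the order size has at most sz_decimals decimal places."""
    # normalize() strips trailing zeros, so "1.20" counts as 1 decimal place
    exponent = Decimal(size).normalize().as_tuple().exponent
    return -exponent <= sz_decimals

# is_valid_size("1.234", 2) -> False; is_valid_size("1.23", 2) -> True
```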

Adversarial Percentage Handling

```
User: Increase my position by 50%

Model: I need clarification. "Increase by 50%" could mean:
       1. Add 50% of current position (1.5x total)
       2. Increase to 50% of account (absolute)
       Which do you mean?
```
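The two readings differ materially, which is why the model asks rather than guesses. A quick comparison with illustrative numbers:

```python
def increase_relative(position: float, pct: float) -> float:
    """'Increase by pct%' read as a relative change: add pct of the current position."""
    return position * (1 + pct / 100)

def increase_absolute(account: float, pct: float) -> float:
    """'Increase to pct%' read as an absolute target: pct of the account value."""
    return account * pct / 100

# A $10,000 position in a $100,000 account under "increase by 50%":
# relative reading -> $15,000; absolute reading -> $50,000
```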

Limitations

  • Factual Knowledge (33.3% accuracy): recall of specific Hyperliquid URLs and fee structures needs improvement
  • API Structure (27.5% accuracy): reproducing exact JSON field names is still challenging

Hardware Requirements

| Mode | VRAM | Notes |
|---|---|---|
| bfloat16 | ~10GB | Unquantized inference |
| 4-bit | ~4GB | NF4 quantized inference |
| 8-bit | ~6GB | INT8 quantization |

Training Hardware

  • Hardware: NVIDIA A100 80GB SXM
  • SFT Duration: ~25 minutes
  • DPO Duration: ~20 minutes
  • Total Cost: ~$2.00 (RunPod)

Framework Versions

  • PEFT: 0.18.1
  • TRL: 0.29.0
  • Transformers: 5.2.0
  • PyTorch: 2.10.0

License

Apache 2.0

Citation

```bibtex
@misc{hyperllm2026,
  title={HyperLLM: A Specialized LLM for Hyperliquid Trading},
  author={UVLabs},
  year={2026},
  url={https://huggingface.co/UVLabs/HyperLLM-4b}
}
```