Llama for Finance (LoRA)

A financial-domain, instruction-tuned LoRA adapter for meta-llama/Meta-Llama-3.1-8B-Instruct, trained with length-aware batching on data filtered by an English-only heuristic.

Model Details

  • Base model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Adapter type: LoRA (PEFT)
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LoRA hyperparams: r=64, alpha=128, dropout=0.1, bias=none (see the config sketch after this list)
  • Precision: fp16 (bf16 when available); gradient checkpointing on
  • Length bucketing: enabled (group_by_length=True, boundaries 512/1024/1536/2048)
  • Context length: up to 2048 tokens
  • Language: English (non-English filtered via ASCII-ratio heuristic)
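
For reference, a PEFT LoraConfig matching these settings would look roughly like the sketch below; adapter_config.json in this repo is the authoritative record.

from peft import LoraConfig

# Sketch of a LoRA config matching the settings listed above; the shipped
# adapter_config.json is the authoritative source.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)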

Training Data & Filtering

  • Source dataset: Josephgflowers/Finance-Instruct-500k
  • Sampling caps: max_train_samples=25k, max_val_samples=2.5k after filtering
  • Chat formatting: preformatted text field with system/user/assistant turns
  • Filters:
    • drop rows without text
    • English-only heuristic (min_english_ratio ≈ 0.85, min_chars_for_lang_check = 40), sketched after this list
    • EOS enforced at end of samples
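
The filtering code itself is not published; a minimal sketch of an ASCII-ratio heuristic and EOS enforcement under the thresholds above (function names are illustrative) could be:

def is_probably_english(text: str,
                        min_english_ratio: float = 0.85,
                        min_chars_for_lang_check: int = 40) -> bool:
    """ASCII-ratio heuristic: keep a sample only if most characters are ASCII."""
    if len(text) < min_chars_for_lang_check:
        return True  # too short to judge reliably; keep the sample
    ascii_count = sum(ch.isascii() for ch in text)
    return ascii_count / len(text) >= min_english_ratio

def ensure_eos(text: str, eos_token: str) -> str:
    """Append the tokenizer's EOS token if the sample does not already end with it."""
    return text if text.endswith(eos_token) else text + eos_token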

Training Setup

  • Epochs: 2
  • Batching: per-device train 16, grad accumulation 4 (effective 64); eval batch 8
  • Optimizer: paged_adamw_8bit
  • LR / schedule: 1e-4, cosine, warmup_ratio 0.05
  • Regularization: weight_decay 0.01, max_grad_norm 1.0
  • Eval/save: eval_steps=50, save_steps=100 (load_best_model_at_end=True)
  • Length-aware sampler: custom bucket sampler reduces padding waste (see the sketch after this list)
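
The trainer code is not published; a TrainingArguments sketch reconstructing the listed settings might look as follows. output_dir is illustrative, the built-in group_by_length stands in for the custom bucket sampler, and the shipped training_config.json / training_args.bin are authoritative.

import torch
from transformers import TrainingArguments

use_bf16 = torch.cuda.is_bf16_supported()

# Reconstruction of the listed hyperparameters; not the exact shipped config.
args = TrainingArguments(
    output_dir="llama-finance-lora",  # illustrative path
    num_train_epochs=2,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,    # effective batch size 64
    per_device_eval_batch_size=8,
    optim="paged_adamw_8bit",
    learning_rate=1e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    weight_decay=0.01,
    max_grad_norm=1.0,
    eval_strategy="steps",            # "evaluation_strategy" in older transformers
    eval_steps=50,
    save_steps=100,
    load_best_model_at_end=True,
    group_by_length=True,             # length-aware batching to reduce padding
    gradient_checkpointing=True,
    bf16=use_bf16,
    fp16=not use_bf16,
)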

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "TimberGu/Llama_for_Finance"

tokenizer = AutoTokenizer.from_pretrained(adapter)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"  # matches training setup

dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16
base_model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=dtype, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter)  # attach the LoRA adapter
model.eval()

prompt = "Explain what a yield curve inversion implies for equities."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,  # sampling must be enabled for temperature/top_p to apply
    temperature=0.8,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
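
The snippet above feeds a raw prompt string. Because the repo ships chat_template.jinja, prompting through the chat template is likely closer to the training format (the system message here is illustrative):

messages = [
    {"role": "system", "content": "You are a financial-domain assistant."},
    {"role": "user", "content": "Explain what a yield curve inversion implies for equities."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=256, do_sample=True,
                     temperature=0.8, top_p=0.9,
                     pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))

To serve without a PEFT dependency at inference time, the adapter can also be merged into the base weights using standard PEFT functionality (output path illustrative):

merged = model.merge_and_unload()  # fold LoRA weights into the base model
merged.save_pretrained("llama-finance-merged")
tokenizer.save_pretrained("llama-finance-merged")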

Evaluation

  • Held-out validation: eval_loss ≈ 1.05 after 2 epochs (raw GPT-judged outputs in eval_50_gpt_judged_raw.jsonl). No public benchmark results beyond the filtered validation split.

Limitations & Risks

  • Domain-focused on finance/economics; may underperform on general tasks.
  • English-centric; non-English input was filtered during training.
  • Hallucinations remain possible—do not use for financial advice without human review.

Files

  • adapter_model.safetensors, adapter_config.json: LoRA weights/config
  • tokenizer.json, tokenizer_config.json, special_tokens_map.json, chat_template.jinja
  • training_config.json, training_args.bin, test_results.json