# Chess Commentary T5 Model
This repository hosts a fine-tuned T5-small model that generates short, human-style commentary for chess moves from a structured input string.
The goal is to produce commentary that sounds natural while staying grounded in engine-derived chess information, and that is more accurate than commentary from general-purpose LLMs. Instead of asking the model to infer everything from the raw board state, the training pipeline supplies a compact text prompt containing:
- the position before the move (`before_fen`)
- the move played (`move_uci`)
- the side to move (`mover`)
- a serialized `features` field produced by the project's feature extraction pipeline
## What this model does
Given an input like:
```
task: chess_commentary
before_fen: r3k2r/pp1q3p/6pb/2ppPp2/1n1Pn3/2NNBQ2/PPP3PP/R2K3R b kq - 3 13
move_uci: c5d4
mover: black
features: -144 -88 56 74 37.047 41.970 0 0 0 0 0 NA 0 0 3 6 bishop bishop -1 1 0 0 0 0 NA NA NA
```
the model generates a short commentary sentence such as:
> Black captures on d4 and wins material.
## Model details
- Base model: `t5-small`
- Task: chess move commentary generation
- Framework: Hugging Face Transformers
- Training approach: supervised fine-tuning on structured chess inputs paired with commentary targets
- Commentary style: short, natural-language move commentary grounded in evaluation/features rather than freeform narration
## Input format
The model expects a single plain-text prompt in the following shape:
```
task: chess_commentary
before_fen: <FEN before the move>
move_uci: <move in UCI format>
mover: <white|black>
features: <whitespace-separated serialized feature vector>
```
### Required fields
- `task`: task tag used during training
- `before_fen`: board position before the move
- `move_uci`: move in UCI notation
- `mover`: side that played the move
- `features`: whitespace-separated numeric/categorical feature vector generated by the project pipeline
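Assembling these fields into a prompt is mechanical; here is a minimal sketch (the `build_prompt` helper is hypothetical, not part of the released code):

```python
def build_prompt(before_fen: str, move_uci: str, mover: str, features: str) -> str:
    """Assemble the plain-text prompt in the format the model expects.

    `features` must already be the whitespace-separated string produced by
    the project's feature pipeline; this helper only formats the fields.
    """
    return (
        "task: chess_commentary\n"
        f"before_fen: {before_fen}\n"
        f"move_uci: {move_uci}\n"
        f"mover: {mover}\n"
        f"features: {features}\n"
    )

prompt = build_prompt(
    "r3k2r/pp1q3p/6pb/2ppPp2/1n1Pn3/2NNBQ2/PPP3PP/R2K3R b kq - 3 13",
    "c5d4",
    "black",
    "-144 -88 56 74 37.047 41.970 0 0 0 0 0 NA 0 0 3 6 bishop bishop -1 1 0 0 0 0 NA NA NA",
)
```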
### Exact `features` field / feature order
The released model expects the `features` text in exactly the serialized order used during training. Because the checkpoint was trained on a single whitespace-separated `features` string, the safest way to prepare inputs is to:
- run the same `FeatureExtractor` used in the training pipeline
- serialize the resulting values in the same order used by the training-data generation script
- place that serialized string after `features:`
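For illustration, serialization can be as simple as joining the ordered values with spaces and substituting the literal token `NA` for missing entries. The `serialize_features` helper below is a hypothetical sketch: the canonical ordering, and the assumption that floats are formatted to three decimals, come from inspecting the example vector, not from the project code.

```python
def serialize_features(values) -> str:
    """Serialize an ordered feature list into a whitespace-separated string.

    Missing values (None) become the literal token "NA". The ordering of
    `values` must match the training pipeline exactly; this helper does not
    know the canonical order.
    """
    parts = []
    for v in values:
        if v is None:
            parts.append("NA")
        elif isinstance(v, float):
            # Assumption: training data shows floats with 3 decimal places
            parts.append(f"{v:.3f}")
        else:
            parts.append(str(v))
    return " ".join(parts)
```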
**Important note:** this model card does not rename or reorder the features internally; the model only sees the raw text sequence. The authoritative feature order is therefore the one defined by the project code that created the training data.
From the training examples used for this model, the serialized vector contains a mix of:
- evaluation-related values
- move-quality / loss-style values
- tactical flags
- hanging/capture/trade descriptors
- piece identity fields
- additional categorical placeholders such as `NA`
### Example serialized feature vector

```
-144 -88 56 74 37.047 41.970 0 0 0 0 0 NA 0 0 3 6 bishop bishop -1 1 0 0 0 0 NA NA NA
```
### If you want to reproduce inputs exactly

Use the same local pipeline that generated the training JSONL. In this project, that means the feature extractor / preprocessing scripts used to build the final `training_data.jsonl`.
If your local code exposes named feature fields, consider pasting that exact ordered list here before final publication. For now, this model card intentionally preserves the serialized format contract the model was trained on.
## Quick usage
Install dependencies:
```shell
pip install transformers torch sentencepiece
```
Load the model:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "YOUR_USERNAME/chess-commentary-t5"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

text = """task: chess_commentary
before_fen: r3k2r/pp1q3p/6pb/2ppPp2/1n1Pn3/2NNBQ2/PPP3PP/R2K3R b kq - 3 13
move_uci: c5d4
mover: black
features: -144 -88 56 74 37.047 41.970 0 0 0 0 0 NA 0 0 3 6 bishop bishop -1 1 0 0 0 0 NA NA NA
"""

inputs = tokenizer(text, return_tensors="pt", truncation=True)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    num_beams=4,
    do_sample=False,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Intended use
This model is intended for:
- project demos
- model comparison experiments
- human-preference studies for chess commentary
- showcasing how structured chess signals can guide natural-language generation
## Limitations
- The model does not compute chess features on its own.
- It assumes the input format matches the training-time format.
- If the `features` field is missing, reordered, or generated differently, output quality may degrade sharply.
- The model is meant for commentary generation, not authoritative analysis.
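Because a malformed prompt tends to degrade output silently rather than raise an error, a lightweight pre-flight check can help. The `validate_prompt` helper below is a hypothetical sketch, not part of the released model; it only verifies that each required `key:` line is present, not that the feature vector itself is well-formed:

```python
import re

REQUIRED_KEYS = ("task", "before_fen", "move_uci", "mover", "features")

def validate_prompt(text: str) -> list:
    """Return the required keys missing from a prompt string."""
    # Collect every "key:" found at the start of a line
    present = {m.group(1) for m in re.finditer(r"^(\w+):", text, re.MULTILINE)}
    return [k for k in REQUIRED_KEYS if k not in present]
```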
## Authors

Jack Chang, Nguyen Hoang, Kristal Hong