---
license: apache-2.0
base_model: EleutherAI/pythia-1.4b
tags:
- generated_from_trainer
- sft
- tldr
datasets:
- trl-lib/tldr
language:
- en
library_name: transformers
---

# pythia-1.4b Fine-tuned on TL;DR

This model is a fine-tuned version of [EleutherAI/pythia-1.4b](https://huggingface.co/EleutherAI/pythia-1.4b) on the [trl-lib/tldr](https://huggingface.co/datasets/trl-lib/tldr) dataset.

## Training Results

![Training Loss](loss_plot.png)

### Training Statistics

| Metric | Value |
|--------|-------|
| Total Steps | 1356 |
| Final Training Loss | 147.1650 |
| Min Training Loss | 2.8189 |
| Training Runtime | 347.80 seconds |
| Samples/Second | 249.34 |

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | EleutherAI/pythia-1.4b |
| Dataset | trl-lib/tldr |
| Number of Epochs | 1.0 |
| Per Device Batch Size | 16 |
| Gradient Accumulation Steps | 1 |
| Total Batch Size | 64 (16 per device × 4 GPUs) |
| Learning Rate | 2e-05 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Max Sequence Length | 512 |
| Optimizer | adamw_torch_fused |
| Mixed Precision | BF16 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "activeDap/pythia-1.4b_tldr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format input with the prompt template
prompt = "What is machine learning?\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a response
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Framework

- **Library:** Transformers + TRL
- **Training Type:** Supervised Fine-Tuning (SFT)
- **Format:** Prompt-completion pairs, with the loss computed only on the completion (assistant) tokens

A sketch of how this configuration maps onto TRL's `SFTTrainer` is included at the end of this card.

## Citation

If you use this model, please cite the base model and the TL;DR summarization dataset:

```bibtex
@misc{stiennon2020learning,
  title={Learning to Summarize from Human Feedback},
  author={Nisan Stiennon and Long Ouyang and Jeff Wu and others},
  year={2020},
  eprint={2009.01325},
  archivePrefix={arXiv}
}
```
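
## Training Script (Sketch)

The original training script is not included in this repository. The block below is a minimal sketch of how the configuration in the table above could be reproduced with TRL's `SFTTrainer`; the output directory, logging interval, and the `max_length` argument name (`max_seq_length` in older TRL releases) are assumptions, not values taken from the original run.

```python
# Minimal reproduction sketch, not the original training script.
# Assumes a recent TRL release; older versions name the length
# argument `max_seq_length` instead of `max_length`.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# trl-lib/tldr is a prompt-completion dataset of Reddit posts and TL;DR summaries
dataset = load_dataset("trl-lib/tldr", split="train")

training_args = SFTConfig(
    output_dir="pythia-1.4b_tldr",    # assumed output path
    num_train_epochs=1.0,
    per_device_train_batch_size=16,   # 16 per device x 4 GPUs x 1 accumulation = 64
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_length=512,
    optim="adamw_torch_fused",
    bf16=True,
    logging_steps=10,                 # assumed; not reported on this card
)

trainer = SFTTrainer(
    model="EleutherAI/pythia-1.4b",   # TRL loads the base model from this Hub ID
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

Launched with `accelerate launch` (or `torchrun`) across 4 GPUs, this configuration gives the effective batch size of 64 reported above.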