---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B
tags:
- generated_from_trainer
- sft
- summarization
datasets:
- trl-lib/tldr
language:
- en
library_name: transformers
---
|
|
|
|
|
# Qwen2.5-7B Fine-tuned on tldr |
|
|
|
|
|
This model is a fine-tuned version of [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) on the [trl-lib/tldr](https://huggingface.co/datasets/trl-lib/tldr) dataset of Reddit posts paired with TL;DR summaries.
|
|
|
|
|
## Training Results |
|
|
|
|
|
 |
|
|
|
|
|
### Training Statistics |
|
|
|
|
|
| Metric | Value |
|--------|-------|
| Total Steps | 1312 |
| Final Training Loss | 2.2743 |
| Min Training Loss | 2.2423 |
| Training Runtime | 1363.49 seconds |
| Samples/Second | 61.55 |
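
These figures are mutually consistent with the effective batch size of 64 listed in the configuration below; a quick sanity check using only the numbers reported above:

```python
total_steps = 1312
effective_batch_size = 64      # 16 per device x 4 GPUs x 1 accumulation step
runtime_seconds = 1363.49

samples_seen = total_steps * effective_batch_size   # 83,968 samples processed
print(samples_seen / runtime_seconds)               # ~61.6, consistent with the reported 61.55
```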
|
|
|
|
|
## Training Configuration |
|
|
|
|
|
| Parameter | Value |
|-----------|-------|
| Base Model | Qwen/Qwen2.5-7B |
| Dataset | trl-lib/tldr |
| Number of Epochs | 1.0 |
| Per Device Batch Size | 16 |
| Gradient Accumulation Steps | 1 |
| Total Batch Size | 64 (4 GPUs) |
| Learning Rate | 2e-05 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Max Sequence Length | 512 |
| Optimizer | adamw_torch_fused |
| Mixed Precision | BF16 |
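
The exact training script is not included here; the following is a minimal sketch of how these hyperparameters map onto TRL's `SFTConfig` and `SFTTrainer` (note that the sequence-length argument is named `max_seq_length` in older TRL releases and `max_length` in newer ones):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/tldr", split="train")

training_args = SFTConfig(
    output_dir="Qwen2.5-7B_tldr",
    num_train_epochs=1.0,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=512,          # `max_length` in recent TRL versions
    optim="adamw_torch_fused",
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B",     # SFTTrainer accepts a model id string
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```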
|
|
|
|
|
## Usage |
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "activeDap/Qwen2.5-7B_tldr"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# The model was trained on TL;DR-style prompt-completion pairs,
# so end the prompt with "TL;DR:" to elicit a summary.
prompt = (
    "SUBREDDIT: r/learnmachinelearning\n"
    "TITLE: Where do I start with machine learning?\n"
    "POST: I keep hearing about machine learning at work and want to pick up "
    "the basics on my own. What should I learn first?\n"
    "TL;DR:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the summary continuation
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
|
|
|
|
|
## Training Framework |
|
|
|
|
|
- **Library:** Transformers + TRL
- **Training Type:** Supervised Fine-Tuning (SFT)
- **Format:** Prompt-completion with completion-only loss (see the sketch below)
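
Concretely, each trl-lib/tldr example is a prompt-completion pair, and only the completion tokens contribute to the loss. A minimal sketch of the masking (field contents and token ids are illustrative):

```python
# One training example in prompt-completion format (contents abbreviated).
example = {
    "prompt": "SUBREDDIT: r/...\nTITLE: ...\nPOST: ...\nTL;DR:",
    "completion": " One-sentence summary of the post.",
}

# Completion-only loss: labels for prompt tokens are set to -100 so the
# cross-entropy loss is computed over the completion tokens alone.
prompt_ids = [101, 102, 103]        # token ids of example["prompt"] (illustrative)
completion_ids = [201, 202, 203]    # token ids of example["completion"] (illustrative)
input_ids = prompt_ids + completion_ids
labels = [-100] * len(prompt_ids) + completion_ids
```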
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite the original base model and dataset: |
|
|
|
|
|
```bibtex
@misc{stiennon2020learning,
  title={Learning to summarize from human feedback},
  author={Nisan Stiennon and Long Ouyang and Jeff Wu and Daniel M. Ziegler and Ryan Lowe and Chelsea Voss and Alec Radford and Dario Amodei and Paul Christiano},
  year={2020},
  eprint={2009.01325},
  archivePrefix={arXiv}
}

@misc{qwen2.5,
  title={Qwen2.5 Technical Report},
  author={Qwen Team},
  year={2024},
  eprint={2412.15115},
  archivePrefix={arXiv}
}
```
|
|
|