GD-ML/Qwen2.5-Math-7B-GPG

xiao23451 committed on Apr 30

Commit 341f780 · verified · 1 Parent(s): 458bf05

Update README.md

Files changed (1)

README.md +20 -3

@@ -1,3 +1,20 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-Math-7B
+---
+## Model ID
+GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
+https://arxiv.org/abs/2504.02546
+## Model Details
+The RL model (GPG-7B in the paper), trained with GPG on the simple1r_qwen_level3to5 dataset, using Qwen2.5-Math-7B as the base model.
+## Attention!
+Test results may vary with the environment and hardware. On an NPU, the average accuracy across five benchmarks (AIME24, AMC23, MATH-500, Minerva, and OlympiadBench) is 57.7. On an H20 GPU, the same average is 55.3, a drop of 2.4 points. This gap is within the range we consider acceptable.
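
Since the README notes that results shift between devices, it may help to check the model on your own hardware. The following is a minimal sketch, not part of this commit, that assumes the repository loads through the standard Hugging Face `transformers` causal-LM classes, as the Qwen2.5-Math-7B base model does. The `build_prompt` and `generate_answer` helpers and the prompt wording are hypothetical; the README does not specify a prompt template or decoding settings.

```python
MODEL_ID = "GD-ML/Qwen2.5-Math-7B-GPG"  # repository id as shown on this page


def build_prompt(question: str) -> str:
    """Build a plain-text math prompt (hypothetical template, not from the README)."""
    return (
        "Please reason step by step, and put your final answer within \\boxed{}.\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and greedily decode an answer for one question."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # device_map="auto" requires the `accelerate` package to be installed.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding (an assumption) removes sampling noise from comparisons.
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example call (downloads the weights on first use):
# print(generate_answer("What is 12 * 13?"))
```

Greedy decoding is chosen here only to make runs on different devices easier to compare; it is not necessarily the setting behind the reported 57.7 and 55.3 averages.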