TMLR-Group-HF/Self-Certainty-Qwen3-1.7B-Base-MATH Text Generation • 2B • Updated Oct 11, 2025 • 10 • 1
TMLR-Group-HF/Self-Certainty-Llama-3.2-3B-Instruct-MATH Text Generation • 4B • Updated Oct 11, 2025 • 5
TMLR-Group-HF/Co-rewarding-I-Llama-3.2-3B-Instruct-MATH Text Generation • 4B • Updated Oct 11, 2025 • 9
TMLR-Group-HF/Self-Certainty-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 4 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-8B-Base-MATH Text Generation • 8B • Updated Oct 11, 2025 • 12 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-4B-Base-MATH Text Generation • 4B • Updated Oct 11, 2025 • 11 • 1
TMLR-Group-HF/Co-rewarding-II-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 12 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen3-1.7B-Base-MATH Text Generation • 2B • Updated Oct 11, 2025 • 9
TMLR-Group-HF/Self-Certainty-Qwen3-8B-Base-MATH Text Generation • 8B • Updated Oct 11, 2025 • 7 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen2.5-7B-MATH Text Generation • 8B • Updated Oct 11, 2025 • 12 • 1
TMLR-Group-HF/Entropy-Qwen3-8B-Base-OpenRS Text Generation • 8B • Updated Oct 11, 2025 • 5 • 1
TMLR-Group-HF/Entropy-Qwen3-8B-Base-DAPO14k Text Generation • 8B • Updated Oct 11, 2025 • 17 • 1
TMLR-Group-HF/Co-rewarding-I-Qwen2.5-3B-MATH Text Generation • 3B • Updated Oct 11, 2025 • 9 • 1