Recent Activity
MultiRL/qwen3_4b_base_sft_final • 4B • Updated • 75
MultiRL/qwen3_4b_easy_rl_new • 4B • Updated • 74
MultiRL/qwen3_1.7b_easy_rl_gspo • 2B • Updated • 4
4B • Updated • 52
MultiRL/qwen3_1.7b_easy_rl_final_step120 • 2B • Updated • 237
MultiRL/qwen3_4b_medium_rl_final • 4B • Updated • 167
MultiRL/qwen3_4b_sft_one_act • 4B • Updated • 54
MultiRL/qwen3_1.7b_easy_rl_reinforce_ori • 2B • Updated • 89
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0.5 • 2B • Updated • 4
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_1 • 2B • Updated • 4
MultiRL/qwen3_1.7b_easy_rl_reinforce_alpha_0 • 2B • Updated • 3
MultiRL/qwen3_1.7b_sft_one_act • 2B • Updated • 99
MultiRL/qwen3_1.7b_easy_rl_final • 2B • Updated • 866
MultiRL/qwen3_4b_easy_rl_final • 4B • Updated • 56
MultiRL/qwen3_1.7b_sft_final • 2B • Updated • 2.91k
MultiRL/qwen3_4b_sft_final • 4B • Updated • 77
MultiRL/qwen3_1.7b_easy_rl_new • 2B • Updated • 1
MultiRL/qwen3_4b_standard_medium_rl • 4B • Updated • 43
MultiRL/qwen3_4b_standard_easy_rl • 4B • Updated • 46
MultiRL/qwen3_4b_medium_rl_progress_C
MultiRL/qwen3_4b_medium_rl • 4B • Updated • 42
MultiRL/qwen3_4b_instruct_sft • 4B • Updated • 57
MultiRL/qwen3_1.7b_easy_rl_test_task_group
MultiRL/qwen3_1.7b_easy_rl_test • 2B • Updated • 36
MultiRL/qwen3_1.7b_sudoku_sft • 2B • Updated • 99
MultiRL/qwen3_1.7b_easy_reinforce_batch_32_by_pass • 2B • Updated • 10
MultiRL/qwen3_1.7b_easy_reinforce_batch_64_by_pass • 2B • Updated
MultiRL/qwen3_1.7b_easy_reinforce_test • 2B • Updated
MultiRL/qwen3_1.7b_C_easy_gspo_test • 2B • Updated • 1
MultiRL/qwen3_1.7b_base_C_normal_short_sft_lr_1e_5_C_easy_grpo_step70 • 2B • Updated • 1