Hugging Face
Xiaoyang Cao
Sean13
0 followers · 2 following
https://xiaoyangcao1113.github.io/
XiaoyangCao1113
xiaoyangcao
AI & ML interests
RLHF, Deep Reinforcement Learning
Recent Activity
updated a model 10 days ago: Sean13/grpo_nocurriculum_Qwen3-1.7B-100step
published a model 10 days ago: Sean13/grpo_nocurriculum_Qwen3-1.7B-100step
updated a model 10 days ago: Sean13/maxrl_nocurriculum_Qwen3-1.7B-100step
Organizations
None yet
models
72
Sean13/grpo_nocurriculum_Qwen3-1.7B-100step • Reinforcement Learning • 2B • Updated 10 days ago • 11
Sean13/maxrl_nocurriculum_Qwen3-1.7B-100step • Reinforcement Learning • 2B • Updated 10 days ago • 15
Sean13/maxrl_curriculum_Qwen3-1.7B-200step • Reinforcement Learning • 2B • Updated 11 days ago • 16
Sean13/role-drift-compound-systems • Updated 14 days ago
Sean13/maxrl_curriculum_Qwen3-1.7B • 2B • Updated 14 days ago • 17
Sean13/grpo_curriculum_Qwen3-1.7B • 2B • Updated 14 days ago • 32
Sean13/repo-best-llama-re-dpo • Updated Feb 26
Sean13/repo-best-llama-dpo • Updated Feb 26
Sean13/repo-best-mistral-dpo • Updated Feb 26
Sean13/repo-best-mistral-re-dpo • Updated Feb 26
datasets
0
None public yet