Xiaoyang Cao

Sean13
·
https://xiaoyangcao1113.github.io/
  • XiaoyangCao1113
  • xiaoyangcao

AI & ML interests

RLHF, Deep Reinforcement Learning

Recent Activity

updated a model 10 days ago
Sean13/grpo_nocurriculum_Qwen3-1.7B-100step
published a model 10 days ago
Sean13/grpo_nocurriculum_Qwen3-1.7B-100step
updated a model 10 days ago
Sean13/maxrl_nocurriculum_Qwen3-1.7B-100step

Organizations

None yet

models 72

Sean13/grpo_nocurriculum_Qwen3-1.7B-100step

Reinforcement Learning • 2B • Updated 10 days ago • 11

Sean13/maxrl_nocurriculum_Qwen3-1.7B-100step

Reinforcement Learning • 2B • Updated 10 days ago • 15

Sean13/maxrl_curriculum_Qwen3-1.7B-200step

Reinforcement Learning • 2B • Updated 11 days ago • 16

Sean13/role-drift-compound-systems

Updated 14 days ago

Sean13/maxrl_curriculum_Qwen3-1.7B

2B • Updated 14 days ago • 17

Sean13/grpo_curriculum_Qwen3-1.7B

2B • Updated 14 days ago • 32

Sean13/repo-best-llama-re-dpo

Updated Feb 26

Sean13/repo-best-llama-dpo

Updated Feb 26

Sean13/repo-best-mistral-dpo

Updated Feb 26

Sean13/repo-best-mistral-re-dpo

Updated Feb 26

datasets 0

None public yet