Zewen Chi

CZWin32768

https://www.microsoft.com/en-us/research/people/zewenchi/

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 7 days ago

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 8 days ago • 55

upvoted 3 papers about 1 month ago

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 105

EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models

Paper • 2602.04515 • Published Feb 4 • 39

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

upvoted a collection 4 months ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 185

upvoted a paper 4 months ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 304

upvoted 4 papers 5 months ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 119

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published Oct 30, 2025 • 29

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Paper • 2510.19779 • Published Oct 22, 2025 • 62

QueST: Incentivizing LLMs to Generate Difficult Problems

Paper • 2510.17715 • Published Oct 20, 2025 • 35

upvoted 3 papers 10 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265

On-Policy RL with Optimal Reward Baseline

Paper • 2505.23585 • Published May 29, 2025 • 14

Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37