Efficient RLVR Training via Weighted Mutual Information Data Selection Paper • 2603.01907 • Published 8 days ago • 14
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published May 25, 2024 • 11
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models Paper • 2602.17684 • Published Feb 4 • 22
The Station: An Open-World Environment for AI-Driven Discovery Paper • 2511.06309 • Published Nov 9, 2025 • 37
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 118