Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment Paper • 2402.10207 • Published Feb 15, 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards Paper • 2402.18571 • Published Feb 28, 2024
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models Paper • 2505.23564 • Published May 29, 2025