Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models Paper • 2502.15086 • Published Feb 20, 2025 • 16
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published Feb 20, 2025 • 91
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20, 2025 • 26
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published Feb 18, 2025 • 29
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17, 2025 • 39
Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering Paper • 2502.13962 • Published Feb 19, 2025 • 28
Language Models' Factuality Depends on the Language of Inquiry Paper • 2502.17955 • Published Feb 25, 2025 • 32
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? Paper • 2502.19361 • Published Feb 26, 2025 • 28
Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation Paper • 2502.19414 • Published Feb 26, 2025 • 20