GradES: Significantly Faster Training in Transformers with Gradient-Based Early Stopping • Paper 2509.01842 • Published Sep 1, 2025