Clinical knowledge in LLMs does not translate to human interactions Paper • 2504.18919 • Published Apr 26, 2025 • 26
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation Paper • 2503.02972 • Published Mar 4, 2025 • 25
Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction Paper • 2411.06424 • Published Nov 10, 2024 • 5
Can sparse autoencoders be used to decompose and interpret steering vectors? Paper • 2411.08790 • Published Nov 13, 2024 • 8
Evaluating the role of 'Constitutions' for learning from AI feedback Paper • 2411.10168 • Published Nov 15, 2024 • 5
WhisperX: Time-Accurate Speech Transcription of Long-Form Audio Paper • 2303.00747 • Published Mar 1, 2023 • 6
The VoxCeleb Speaker Recognition Challenge: A Retrospective Paper • 2408.14886 • Published Aug 27, 2024 • 11
TIM: A Time Interval Machine for Audio-Visual Action Recognition Paper • 2404.05559 • Published Apr 8, 2024
ACES: Automatic Cohort Extraction System for Event-Stream Datasets Paper • 2406.19653 • Published Jun 28, 2024
GREEN: Generative Radiology Report Evaluation and Error Notation Paper • 2405.03595 • Published May 6, 2024
Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning Paper • 2406.00392 • Published Jun 1, 2024 • 14
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Paper • 2404.03411 • Published Apr 4, 2024 • 10
Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning Paper • 2312.14878 • Published Dec 22, 2023 • 15
When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations Paper • 2310.19698 • Published Oct 30, 2023
Efficient Online Reinforcement Learning with Offline Data Paper • 2302.02948 • Published Feb 6, 2023 • 2
Frontier AI Regulation: Managing Emerging Risks to Public Safety Paper • 2307.03718 • Published Jul 6, 2023 • 5
Evaluating Language Models for Mathematics through Interactions Paper • 2306.01694 • Published Jun 2, 2023 • 2