Shane Caldwell PRO

SJCaldwell

https://hackbot.dad/

AI & ML interests

cybersecurity + ml

Recent Activity

authored 2 papers about 24 hours ago

AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models

Paper • 2506.14682 • Published Jun 17, 2025

PentestJudge: Judging Agent Behavior Against Operational Requirements

Paper • 2508.02921 • Published Aug 4, 2025

liked a dataset 4 months ago

allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 625k • 540

liked a model 10 months ago

Qwen/Qwen3-1.7B

Text Generation • 2B • Updated Jul 26, 2025 • 6.81M • 430

upvoted a collection about 1 year ago

SYNTHETIC-1

A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Oct 7, 2025 • 67

liked a model over 2 years ago

BAAI/bge-large-en

Feature Extraction • 0.3B • Updated Oct 12, 2023 • 226k • 224