Hugging Face
Models
Datasets
Spaces
Community
Docs
Enterprise
Pricing
Log In
Sign Up
Solegon
/
prompt-safety-bert
like
0
Text Classification
Transformers
Safetensors
4 datasets
English
distilbert
safety
jailbreak-detection
prompt-injection
content-moderation
Eval Results (legacy)
text-embeddings-inference
License:
mit
Model card
Files
Files and versions
xet
Community
Deploy
Use this model
🛡️ Prompt Safety BERT
🛡️ Prompt Safety BERT
A fine-tuned DistilBERT model for
safe/unsafe prompt classification
.
Downloads last month
37
Safetensors
Model size
67M params
Tensor type
F32
·
Files info
Inference Providers
NEW
Text Classification
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
Datasets used to train
Solegon/prompt-safety-bert
allenai/wildjailbreak
Viewer
•
Updated
Aug 8, 2024
•
2.21k
•
4.62k
•
115
allenai/wildguardmix
Viewer
•
Updated
Jun 29, 2024
•
88.5k
•
4.54k
•
65
TrustAIRLab/in-the-wild-jailbreak-prompts
Viewer
•
Updated
Nov 19, 2024
•
21.5k
•
1.74k
•
30
Evaluation results
F1
self-reported
0.960