# FineWeb-Edu Misinformation Classifier
A ModernBERT-base classifier trained to detect misinformation in web text, specifically content that passes educational quality filters despite being misleading or harmful. Trained on 200K documents from FineWeb-Edu annotated by Llama 4 Maverick (meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8).
## Models
This repo contains two models:
### Binary (`binary/`)
Classifies documents as misinfo or benign.
| | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| misinfo | 0.83 | 0.89 | 0.86 | 3,885 |
| benign | 0.97 | 0.95 | 0.96 | 15,663 |
| accuracy | | | 0.94 | 19,548 |
### Multiclass (`multiclass/`)
Classifies documents into five misinformation categories plus benign.
| | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| climate_denial | 0.79 | 0.91 | 0.84 | 539 |
| health_misinfo | 0.78 | 0.90 | 0.83 | 1,014 |
| pseudoscience | 0.82 | 0.86 | 0.84 | 1,618 |
| hate_extremism | 0.65 | 0.70 | 0.67 | 226 |
| conspiracy_propaganda | 0.55 | 0.74 | 0.63 | 488 |
| benign | 0.97 | 0.94 | 0.96 | 15,663 |
| accuracy | | | 0.92 | 19,548 |
## Training details
- Base model: answerdotai/ModernBERT-base (149M parameters)
- Training data: 156,383 examples (from ratishsp/fineweb-edu-misinfo)
- Validation: 19,548 examples
- Test: 19,548 examples
- Epochs: 3
- Batch size: 8 per GPU, 8 GPUs (AMD MI250X on LUMI)
- Learning rate: 2e-5
- Warmup: 10% of total steps
- Weight decay: 0.01
- Max sequence length: 8,192 tokens
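Inputs longer than the 8,192-token maximum are truncated, so only the beginning of very long documents is scored. One workaround is to score overlapping windows of token ids and aggregate the predictions. The helper below (`chunk_token_ids` is a hypothetical name, not part of this repo) sketches the windowing step in plain Python over a pre-tokenized id list:

```python
def chunk_token_ids(token_ids, max_len=8192, stride=512):
    """Split a token-id list into overlapping windows of at most max_len.

    Consecutive windows share `stride` tokens so that no span of text
    falls entirely on a window boundary.
    """
    if len(token_ids) <= max_len:
        return [token_ids]
    chunks = []
    step = max_len - stride  # advance by this many tokens per window
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        if start + max_len >= len(token_ids):
            break  # last window already reaches the end
    return chunks

# Stand-in for real token ids from the tokenizer
ids = list(range(20000))
chunks = chunk_token_ids(ids)
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Each window can then be classified independently; a simple aggregation is to flag the document if any window is predicted `misinfo`.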
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Binary model (use subfolder="multiclass" for the multiclass model)
tokenizer = AutoTokenizer.from_pretrained("ratishsp/fineweb-edu-misinfo-classifier", subfolder="binary")
model = AutoModelForSequenceClassification.from_pretrained("ratishsp/fineweb-edu-misinfo-classifier", subfolder="binary")

text = "Your document text here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=8192)
with torch.no_grad():
    logits = model(**inputs).logits
prediction = torch.argmax(logits, dim=-1).item()
label = model.config.id2label[prediction]
print(label)  # "misinfo" or "benign"
```
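If you also want a confidence score rather than just the argmax label, apply a softmax to the logits. A minimal pure-Python sketch of that step (the logit values below are made up for illustration; in practice they come from the model call above):

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical binary logits, ordered to match model.config.id2label
probs = softmax([1.2, 3.4])
print([round(p, 3) for p in probs])
```

Thresholding on the probability (e.g. only flagging documents above 0.9) trades recall for precision, which may be preferable when filtering a large corpus.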
## Limitations
- Annotations were produced by an LLM (Llama 4 Maverick), not human annotators. Inter-annotator agreement with Claude Sonnet 4.6 on 600 documents: binary kappa = 0.862, multiclass kappa = 0.842.
- The model was trained on content from known problematic domains and random FineWeb-Edu samples. It may not generalize well to misinformation styles not represented in the training data.
- The conspiracy_propaganda (F1 = 0.63) and hate_extremism (F1 = 0.67) categories have lower performance, likely due to less training data and more ambiguous boundaries.
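Cohen's kappa, used above for the LLM–LLM agreement numbers, corrects raw agreement for agreement expected by chance. A minimal pure-Python computation on toy labels (the lists below are illustrative, not the actual annotation data; `sklearn.metrics.cohen_kappa_score` gives the same result):

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items where the raters match
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement under independence of the two raters
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb.get(k, 0) for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy example: two raters labeling ten documents
r1 = ["misinfo", "benign", "benign", "misinfo", "benign",
      "benign", "misinfo", "benign", "benign", "benign"]
r2 = ["misinfo", "benign", "benign", "benign", "benign",
      "benign", "misinfo", "benign", "misinfo", "benign"]
print(round(cohen_kappa(r1, r2), 3))  # → 0.524
```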
## Citation

```bibtex
@misc{puduppully2026fineweb-edu-misinfo,
  author    = {Puduppully, Ratish},
  title     = {FineWeb-Edu Misinformation Classifier},
  year      = {2026},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/ratishsp/fineweb-edu-misinfo-classifier}
}
```