# marbert-complaint-sentiment
Fine-tuned UBC-NLP/MARBERTv2 for 3-class sentiment on a gold-standard Arabic complaint/review subset curated from the GLARE corpus (e-commerce / user-review domain; balanced classes, manual annotation).
(The Hub may still show a "None dataset" line auto-generated by Trainer; that line is superseded by this description.)
Evaluation set (held-out):
- Loss: 0.5762
- Accuracy: 0.76
- Precision: 0.7625
- Recall: 0.76
- F1: 0.7593
## Model description
- Task: 3-class sentiment of short Arabic complaint-style text: NEG (negative), NEU (neutral), POS (positive).
- Label ids (should match `config.json`): NEG → 0, NEU → 1, POS → 2.
- Base model: MARBERTv2 (multi-dialect Arabic BERT); cite Abdul-Mageed et al. (ACL 2021) for MARBERT.
- Companion paper & code: GitHub `YOUSEF-ysfxjo/complaint-xai-fl-research` (manuscript: `paper/research_v2.tex`).
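The label schema above can be pinned down in code. A minimal sketch of the mapping as stated on this card (when loading the checkpoint, verify it against the model's `config.json` rather than trusting this hard-coded dict):

```python
# Label schema as documented on this card; confirm against config.json at load time.
label2id = {"NEG": 0, "NEU": 1, "POS": 2}
id2label = {v: k for k, v in label2id.items()}

def decode(logits):
    """Map one row of 3 class logits to its label string via argmax."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]

print(decode([0.2, 1.7, -0.5]))  # NEU
```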
## Intended uses & limitations
Uses: Triage or analytics for Arabic e-commerce complaints (Saudi/Gulf-style text, MSA + dialect + light code-mixing).
Limitations: Not for legal/moderation decisions without human review; optimized for this label schema and domain; max length 128 tokens in training (long texts truncated); performance may drop on other genres or dialects.
## Training and evaluation data
- Source: GLARE (large-scale Arabic reviews; see Ghanbari et al., GLARE, arXiv:2412.15259).
- This checkpoint: Project gold sentiment split — 10,000 samples per class (30,000 total), balanced. Exact CSV column names match the training pipeline in the companion repository.
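The balanced per-class sampling described above can be sketched with the standard library. This is an illustrative reconstruction, not the companion repository's pipeline; the `text`/`label` field names and the small sizes are placeholders (the card's actual split uses 10,000 samples per class):

```python
import random

random.seed(42)  # same seed as the training run, purely for reproducibility here

# Hypothetical records standing in for GLARE rows; real column names
# follow the training pipeline in the companion repository.
rows = [{"text": f"sample {i}", "label": lab}
        for lab in ("NEG", "NEU", "POS") for i in range(15)]

per_class = 10  # the card's gold split uses 10,000 per class
balanced = []
for lab in ("NEG", "NEU", "POS"):
    pool = [r for r in rows if r["label"] == lab]
    balanced += random.sample(pool, per_class)

print(len(balanced))  # 30
```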
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 300
- num_epochs: 5
- mixed_precision_training: Native AMP
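The hyperparameters above translate into a `transformers.TrainingArguments` instance along these lines. This is a sketch consistent with the listed values; the `output_dir` is an assumption, and details like logging, evaluation, and checkpoint-selection strategy are not specified on this card:

```python
from transformers import TrainingArguments

# Sketch reproducing the hyperparameters listed above.
# output_dir is hypothetical; eval/save strategies are not documented here.
args = TrainingArguments(
    output_dir="marbert-complaint-sentiment",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW, betas/epsilon at their defaults
    lr_scheduler_type="linear",
    warmup_steps=300,
    num_train_epochs=5,
    fp16=True,                    # native AMP mixed precision
)
```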
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| 0.5539 | 1.0 | 844 | 0.6033 | 0.7577 | 0.7601 | 0.7577 | 0.7574 |
| 0.5018 | 2.0 | 1688 | 0.5762 | 0.76 | 0.7625 | 0.76 | 0.7593 |
| 0.4266 | 3.0 | 2532 | 0.6210 | 0.756 | 0.7567 | 0.756 | 0.7551 |
| 0.3449 | 4.0 | 3376 | 0.6901 | 0.75 | 0.7532 | 0.75 | 0.7484 |
| 0.3056 | 5.0 | 4220 | 0.7335 | 0.749 | 0.7516 | 0.749 | 0.7479 |
The held-out metrics reported at the top of this card match the epoch-2 row (lowest validation loss), consistent with best-checkpoint selection.
### Framework versions
- Transformers 4.53.3
- PyTorch 2.6.0+cu124
- Datasets 4.4.1
- Tokenizers 0.21.2
## Inference example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Ysfxjo/marbert-complaint-sentiment"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "الشحن متأخر والتعامل سيء"  # "Shipping is late and the service is bad"
inputs = tok(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    pred = model(**inputs).logits.argmax(-1).item()
print(model.config.id2label[pred])
```
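If class probabilities are needed rather than just the argmax label, apply a softmax to the logits before decoding. A self-contained sketch of that step, using made-up logits in place of a model call (in the card's NEG/NEU/POS order):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a single row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Example logits in label order (NEG, NEU, POS); a real call would use
# model(**inputs).logits[0].tolist() instead.
probs = softmax([2.1, 0.3, -1.0])
print([round(p, 3) for p in probs])
```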