🐦 Curió 1.1B

📖 Overview

Curió 1.1B is a Portuguese-adapted language model. It was created by taking TinyLlama 1.1B (1T), which was originally trained on 1 trillion English tokens, and continuing its pretraining on 150B Portuguese tokens from the ClassiCC-PT corpus.

This model was designed to explore the impact of language-specific corpora on adapting an English-trained base model to Portuguese, yielding performance improvements on Portuguese benchmarks without large-scale retraining from scratch.

🏗 Training Setup

  • Base model: TinyLlama 1.1B (LLaMA-2 architecture)

  • Parameters: 1.1B

  • Continued pretraining tokens: 150B (ClassiCC-PT)

  • Sequence length: 4096 tokens (with packing)

  • Hardware: TPU v2-128 (thanks to the Google TRC program)

  • Frameworks: T5X
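
The "with packing" note in the sequence-length entry refers to filling each fixed-length training sequence with several shorter documents instead of padding. The snippet below is a minimal sketch of that general idea, not the actual T5X pipeline used for Curió. The function name `pack_sequences`, the EOS token ID of 2 (the usual LLaMA tokenizer value) and the choice to drop the final partial chunk are all assumptions made for illustration.

```python
from typing import Iterable, List


def pack_sequences(
    docs: Iterable[List[int]],
    seq_len: int = 4096,
    eos_id: int = 2,  # assumed EOS token ID; check the tokenizer before relying on it
) -> List[List[int]]:
    """Concatenate tokenized documents and cut the stream into seq_len-sized chunks.

    Each document is followed by an EOS token so the model can see document
    boundaries. A trailing chunk shorter than seq_len is dropped (an assumption;
    real pipelines may pad it or carry it over instead).
    """
    buffer: List[int] = []
    packed: List[List[int]] = []
    for doc in docs:
        buffer.extend(doc)
        buffer.append(eos_id)
        # Emit as many full sequences as the buffer currently holds.
        while len(buffer) >= seq_len:
            packed.append(buffer[:seq_len])
            buffer = buffer[seq_len:]
    return packed
```

With `seq_len=4096`, every returned sequence is completely filled with real tokens, so no compute is spent on padding.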

📊 Evaluation

Evaluated on the Poeta benchmark, a suite of 14 diverse Portuguese tasks (RTE, STS, MCQ exams, sentiment analysis, QA, etc.), using the Normalized Preferred Metric (NPM).

| Model | Training Regimen | Poeta v2 NPM |
|---|---|---|
| TinyLlama 1T (EN) | – | 17.4 |
| TinyLlama 2T (EN) | +1T EN continued pretraining | 20.9 |
| training with mC4-PT | +150B PT (mC4-PT) continued pretraining | ~20 |
| training with ClueWeb-22-PT | +150B PT (ClueWeb-22-PT) continued pretraining | ~27 |
| Curió 1.1B | +150B PT (ClassiCC-PT) continued pretraining | 27.1 |
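
To make the NPM column easier to interpret, here is a minimal sketch of how such a score can be computed. It assumes the commonly used definition: each task's preferred metric is rescaled so that the random-baseline score maps to 0 and the maximum score maps to 100, and the rescaled values are averaged over tasks. The function name and the example numbers are hypothetical and are not Poeta results.

```python
from typing import Sequence


def normalized_preferred_metric(
    raw: Sequence[float],
    random_baseline: Sequence[float],
    max_score: Sequence[float],
) -> float:
    """Average of per-task scores rescaled to 0 (random baseline) .. 100 (maximum)."""
    if not (len(raw) == len(random_baseline) == len(max_score)) or len(raw) == 0:
        raise ValueError("all inputs must be non-empty and of equal length")
    per_task = [
        100.0 * (r - b) / (m - b)
        for r, b, m in zip(raw, random_baseline, max_score)
    ]
    return sum(per_task) / len(per_task)


# Hypothetical two-task example: a binary task (random = 0.5) and an F1-style
# task (random = 0.0), both with a maximum score of 1.0.
example_npm = normalized_preferred_metric(
    raw=[0.75, 0.50],
    random_baseline=[0.5, 0.0],
    max_score=[1.0, 1.0],
)
```

Under this definition, a model that does no better than chance on every task scores 0, which is why the values in the table are well below raw accuracy figures.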

📥 Usage

Please note that Curió 1.1B has not been trained to be used as a chat model.

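from transformers import AutoTokenizer, AutoModelForCausalLM

# Hugging Face Hub repository ID of the model
model_name = "ClassiCC-Corpus/Curio-1.1B"

# Download (or load from the local cache) the tokenizer and the model weights
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)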
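
Because the model is a base model and not a chat model, it is usually prompted in completion style, for example with a few solved examples followed by an unfinished one. The sketch below shows one possible way to do this. The helper names `build_few_shot_prompt` and `complete`, the "Pergunta/Resposta" template and the generation settings are illustrative assumptions, not a format the model was specifically trained on.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format solved (question, answer) pairs plus an open question as plain text."""
    blocks = [f"Pergunta: {q}\nResposta: {a}" for q, a in examples]
    # The final block is left unanswered so the model continues the text from here.
    blocks.append(f"Pergunta: {query}\nResposta:")
    return "\n\n".join(blocks)


def complete(
    prompt: str,
    model_name: str = "ClassiCC-Corpus/Curio-1.1B",
    max_new_tokens: int = 32,
) -> str:
    """Greedy-decode a continuation of the prompt (downloads the model on first use)."""
    # Imported here so the prompt helper can be used without transformers installed.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `complete(build_few_shot_prompt([("Qual é a capital do Brasil?", "Brasília")], "Qual é a capital de Portugal?"))` would then return the model's continuation. Since generation is open-ended, it is typically truncated at the first blank line.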

📜 Citation

If you use Curió 1.1B, please cite:

Coming soon

Acknowledgements

We thank the Google TRC program, which generously granted us the resources needed to carry out this research.
