TinyGPT PT-BR v1

A small causal language model trained with JAX/Flax/Orbax and exported to the Hugging Face Hub in safetensors format.

Architecture

  • hidden_size: 768
  • num_hidden_layers: 12
  • num_attention_heads: 8
  • intermediate_size: 2048
  • max_position_embeddings: 1024
  • vocab_size: 32000
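
As a sanity check, these hyperparameters imply roughly 0.1B parameters. The sketch below is a back-of-envelope estimate assuming a standard GPT-style block (Q/K/V/output projections plus a two-layer MLP), ignoring biases and layer norms; it is not an exact count from the checkpoint.

# Rough parameter count from the hyperparameters listed above.
hidden, layers, inter, vocab, positions = 768, 12, 2048, 32000, 1024

embeddings = vocab * hidden + positions * hidden  # token + position tables
attention = 4 * hidden * hidden                   # Q, K, V and output projections
mlp = 2 * hidden * inter                          # up- and down-projections
total = embeddings + layers * (attention + mlp)

print(f"~{total / 1e6:.0f}M parameters")          # ~91M, i.e. roughly 0.1B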

Origin

Checkpoint converted from the project's local JAX TPU training run. This repository ships custom model code, so loading it requires trust_remote_code=True.
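
The conversion script itself is not included here. A minimal sketch of what it might look like, assuming an Orbax PyTree checkpoint of Flax params; the checkpoint path and key layout are hypothetical:

import numpy as np
import orbax.checkpoint as ocp
from flax.traverse_util import flatten_dict
from safetensors.numpy import save_file

# Restore the training checkpoint (path is hypothetical).
params = ocp.PyTreeCheckpointer().restore("checkpoints/step_final")

# safetensors stores a flat {name: array} mapping, so flatten the
# nested Flax param tree into dotted keys first.
flat = {k: np.asarray(v) for k, v in flatten_dict(params, sep=".").items()}
save_file(flat, "model.safetensors")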

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "Madras1/tinygpt-ptbr-v1"

# trust_remote_code=True is required because the repository ships its own
# model and config classes.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
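
Once loaded, the model works with the standard transformers generate API. A small sampling example; the prompt and decoding settings are illustrative:

# Generate a short PT-BR continuation (hypothetical prompt).
prompt = "O Brasil é"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))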
Model size: ~0.1B parameters, F32 tensors, stored as safetensors.
