CEIA Reinforcement Learning

university

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

luanagbmartins updated a model about 1 hour ago

CEIA-RL/qwen3-4b-dw-lr-hf-dpo

luanagbmartins updated a model 3 days ago

CEIA-RL/qwen3-4b-dw-lr-dpo-offline

luanagbmartins published a model 4 days ago

CEIA-RL/qwen3-4b-dw-lr-dpo-offline

luanagbmartins

updated a model about 1 hour ago

CEIA-RL/qwen3-4b-dw-lr-hf-dpo

Text Generation • 4B • Updated about 1 hour ago • 1.02k

luanagbmartins

updated a model 3 days ago

CEIA-RL/qwen3-4b-dw-lr-dpo-offline

Text Generation • 4B • Updated 3 days ago • 536

luanagbmartins

published a model 4 days ago

CEIA-RL/qwen3-4b-dw-lr-dpo-offline

Text Generation • 4B • Updated 3 days ago • 536

luanagbmartins

updated a dataset 4 days ago

CEIA-RL/Safety-Questions-Energy

Viewer • Updated 4 days ago • 4.89k • 24

luanagbmartins

published a dataset 4 days ago

CEIA-RL/Safety-Questions-Energy

Viewer • Updated 4 days ago • 4.89k • 24

luanagbmartins

updated a dataset 10 days ago

CEIA-RL/synth_regulacao_eng_qa_v0

Viewer • Updated 10 days ago • 2.32k • 29

luanagbmartins

published a dataset 10 days ago

CEIA-RL/synth_regulacao_eng_qa_v0

Viewer • Updated 10 days ago • 2.32k • 29

luanagbmartins

updated a dataset 10 days ago

CEIA-RL/QA-Energy

Viewer • Updated 10 days ago • 43 • 38

luanagbmartins

published a dataset 10 days ago

CEIA-RL/QA-Energy

Viewer • Updated 10 days ago • 43 • 38

luanagbmartins

published a model 10 days ago

CEIA-RL/qwen3-4b-dw-lr-hf-dpo

Text Generation • 4B • Updated about 1 hour ago • 1.02k

luanagbmartins

updated a dataset 10 days ago

CEIA-RL/Nemotron-SFT-Safety-pt-BR-Cleaned

Viewer • Updated 10 days ago • 45.1k • 49

luanagbmartins

published a dataset 10 days ago

CEIA-RL/Nemotron-SFT-Safety-pt-BR-Cleaned

Viewer • Updated 10 days ago • 45.1k • 49

luanagbmartins

updated a dataset 11 days ago

CEIA-RL/hh-rlhf-harmless-base-pt-BR

Viewer • Updated 11 days ago • 44.8k • 35

luanagbmartins

published a dataset 11 days ago

CEIA-RL/hh-rlhf-harmless-base-pt-BR

Viewer • Updated 11 days ago • 44.8k • 35

luanagbmartins

updated a dataset about 1 month ago

CEIA-RL/energy_prompts

Viewer • Updated Feb 27 • 1.56M • 141

luanagbmartins

published a dataset about 1 month ago

CEIA-RL/energy_prompts

Viewer • Updated Feb 27 • 1.56M • 141

luanagbmartins

updated a Space over 1 year ago

LLMasJudgeEval

🥇

luanagbmartins

updated 2 datasets over 1 year ago

CEIA-RL/judge_results

Viewer • Updated Oct 3, 2024 • 10 • 6

CEIA-RL/judge_requests

Viewer • Updated Sep 27, 2024 • 10 • 6

luckeciano

authored a paper over 2 years ago

Transformers are Meta-Reinforcement Learners

Paper • 2206.06614 • Published Jun 14, 2022

Team members 5
