Sailor2 Evaluation

community

Recent Activity

Xalphinions

submitted a paper to Daily Papers 4 days ago

On the Role of Discreteness in Diffusion LLMs

Paper • 2512.22630 • Published 9 days ago • 17

dreamerdeo

authored 2 papers about 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

Training Optimal Large Diffusion Language Models

Paper • 2510.03280 • Published Sep 28, 2025

SivilTaram

authored a paper about 2 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128

kunato

authored 2 papers 3 months ago

Mangosteen: An Open Thai Corpus for Language Model Pretraining

Paper • 2507.14664 • Published Jul 19, 2025 • 7

Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting

Paper • 2509.00482 • Published Aug 30, 2025

SivilTaram

authored a paper 4 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 83

gabrielchua

authored a paper 5 months ago

Running in CIRCLE? A Simple Benchmark for LLM Code Interpreter Security

Paper • 2507.19399 • Published Jul 25, 2025 • 1

gabrielchua

authored a paper 6 months ago

LionGuard 2: Building Lightweight, Data-Efficient & Localised Multilingual Content Moderators

Paper • 2507.15339 • Published Jul 21, 2025

binwang

authored a paper 6 months ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 134

gabrielchua

authored a paper 6 months ago

Toxicity-Aware Few-Shot Prompting for Low-Resource Singlish Translation

Paper • 2507.11966 • Published Jul 16, 2025

SivilTaram

authored a paper 6 months ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published Jul 16, 2025 • 42

kunato

authored 3 papers 6 months ago

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Paper • 2502.12982 • Published Feb 18, 2025 • 19

Mind the Gap! Static and Interactive Evaluations of Large Audio Models

Paper • 2502.15919 • Published Feb 21, 2025 • 4

FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning

Paper • 2506.16123 • Published Jun 19, 2025 • 8

gabrielchua

authored a paper 6 months ago

Measuring What Matters: A Framework for Evaluating Safety Risks in Real-World LLM Applications

Paper • 2507.09820 • Published Jul 13, 2025

SivilTaram

authored a paper 6 months ago

First Return, Entropy-Eliciting Explore

Paper • 2507.07017 • Published Jul 9, 2025 • 23

gabrielchua

authored a paper 6 months ago

RabakBench: Scaling Human Annotations to Construct Localized Multilingual Safety Benchmarks for Low-Resource Languages

Paper • 2507.05980 • Published Jul 8, 2025 • 1

SivilTaram

authored a paper 6 months ago

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Paper • 2507.01004 • Published Jul 1, 2025 • 10

hynky

authored a paper 6 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26, 2025 • 75