aloobun

https://tinyurl.com/aloobun

AI & ML interests

tiny models and datasets, discord: aloobun

Recent Activity

liked a dataset 11 days ago

LLM360/TxT360-3efforts

liked a dataset 12 days ago

google/mobile-actions

liked a model 12 days ago

microsoft/TRELLIS.2-4B

upvoted a paper 25 days ago

Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models

Paper • 2504.02273 • Published Apr 3, 2025 • 7

upvoted a paper about 1 month ago

ROOT: Robust Orthogonalized Optimizer for Neural Network Training

Paper • 2511.20626 • Published Nov 25, 2025 • 43

upvoted 2 papers 2 months ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 119

Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction

Paper • 2510.01817 • Published Oct 2, 2025 • 15

upvoted a paper 4 months ago

UltraMemV2: Memory Networks Scaling to 120B Parameters with Superior Long-Context Learning

Paper • 2508.18756 • Published Aug 26, 2025 • 36

upvoted a collection 6 months ago

Gemma 3n

4 items • Updated Jul 10, 2025 • 255

upvoted a paper 7 months ago

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

upvoted a paper 8 months ago

Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning

Paper • 2505.09738 • Published May 14, 2025 • 10

upvoted 2 collections 9 months ago

Cogito v1 Preview

5 items • Updated Apr 8, 2025 • 120

IndicTTS Datasets

Datasets derived from the Indic TTS Database, a special corpus of Indian languages developed by the Speech Technology Consortium at IIT Madras. • 13 items • Updated Mar 6, 2025 • 13

upvoted 2 papers about 1 year ago

Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus

Paper • 2410.14815 • Published Oct 18, 2024 • 1

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

upvoted 3 collections about 1 year ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 152

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Nov 6, 2025 • 87

MiniPLM

Pre-trained models in MiniPLM: Knowledge Distillation for Pre-Training Language Models • 5 items • Updated Oct 21, 2024 • 2

upvoted 2 papers about 1 year ago

MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published Oct 22, 2024 • 16

Structured 3D Latents for Scalable and Versatile 3D Generation

Paper • 2412.01506 • Published Dec 2, 2024 • 84

upvoted 2 collections about 1 year ago

InternVL2.5

Better than InternVL 2.0 • 19 items • Updated Sep 28, 2025 • 92

H2O Danube3

7 items • Updated Nov 30, 2024 • 57

upvoted a paper about 1 year ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 49