RivianG (UygarUsta)

liked a model 9 days ago

google/translategemma-12b-it

Image-Text-to-Text • 13B • Updated 2 days ago • 60.2k • 235

liked a model 10 days ago

cyankiwi/Qwen3-30B-A3B-Instruct-2507-AWQ-8bit

Text Generation • 9B • Updated 17 days ago • 2.66k • 2

liked a model about 1 month ago

unsloth/Qwen-Image-Edit-2511-GGUF

Image-to-Image • 20B • Updated 22 days ago • 139k • 327

reacted to hesamation's post with ❤️ about 2 months ago

Post

2997

this is big... 50 AI researchers from Bytedance, Alibaba, Tencent, and other labs/universities just published a 300-page paper with surprising lessons about coding models and agents (data, pre and post-training, etc).

key highlights:

> small LLMs can beat proprietary giants
RL (RLVR specifically) gives small open-source models an edge over big models in reasoning. a 14B model trained with RLVR on high-quality verified problems can match the performance of OpenAI's o3.

> models have a hard time learning Python.
mixing programming languages during pre-training is good, but Python behaves differently from statically typed languages. languages with similar syntax (Java and C#, or JavaScript and TypeScript) create high positive synergy. mixing Python heavily into the training of statically typed languages can actually hurt because of Python's dynamic typing.

> not all languages are equal (coding scaling laws)
the amount of data required to specialize a model on a language depends drastically on the language. the paper argues that languages like C# and Java are easier to learn (less training data required). languages like Python and JavaScript are actually trickier to learn, ironically (you see AI most used for these languages :)

> MoE vs Dense (ability vs stability)
MoE models offer higher capacity, but are much more fragile during SFT than dense models. hyperparams in training have a more drastic effect in MoE models, while dense models are more stable. MoE models also require constant learning rate schedules to avoid routing instability.

> code models are "insecure" by default (duh)
training on public repos makes models learn years of accumulated insecure coding patterns. safety fine-tuning often does little for code. a model might refuse to write a hate speech email but will happily generate a function vulnerable to SQL injection because it "works."

read the full paper:
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence (2511.18538)
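To make the last highlight concrete, here is a minimal sketch of a function that "works" but is open to SQL injection, next to a parameterized version. The table, the function names, and the payload are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import sqlite3


def make_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver passes the input as data, not as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"  # hypothetical attacker input

print(len(find_user_unsafe(conn, payload)))  # every row leaks
print(find_user_safe(conn, payload))         # no user has this literal name
```

Both functions return the same result for ordinary names, which is why the unsafe version can pass a casual functional check.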


liked a model about 2 months ago

cyankiwi/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit

Text Generation • 5B • Updated 17 days ago • 54.9k • 28

upvoted a collection about 2 months ago

Qwen AWQ & GPTQ

Collection

44 items • Updated Dec 22, 2025 • 9

liked a Space 2 months ago

Accurate GGUF VRAM Calculator

📊

51

Calculate VRAM for GGUF models using GPU layers and context

liked a model 2 months ago

nvidia/NVIDIA-Nemotron-Parse-v1.1

Image-Text-to-Text • Updated 2 days ago • 108k • 134

updated a model 2 months ago

RivianG/my_lora_bk

Text Generation • Updated Nov 17, 2025 • 12

published a model 2 months ago

RivianG/my_lora_bk

Text Generation • Updated Nov 17, 2025 • 12

reacted to nouamanetazi's post with 🤗 3 months ago

Post

4352

After training 𝐒𝐦𝐨𝐥𝐋𝐌𝟑 on 𝟑𝟖𝟒 𝐇𝟏𝟎𝟎𝐬 for nearly a month, I've come to realize something most people overlook: 𝐢𝐧𝐟𝐫𝐚𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞 𝐢𝐬 𝐭𝐡𝐞 𝐦𝐚𝐤𝐞-𝐨𝐫-𝐛𝐫𝐞𝐚𝐤 𝐟𝐚𝐜𝐭𝐨𝐫 𝐢𝐧 𝐋𝐋𝐌 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠. 🔥

Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious 𝐍𝐂𝐂𝐋 𝐞𝐫𝐫𝐨𝐫𝐬, or when your expensive GPU cluster is running at 𝟔𝟎% 𝐞𝐟𝐟𝐢𝐜𝐢𝐞𝐧𝐜𝐲, the problem isn't your model. It's most probably a 𝐦𝐢𝐬𝐮𝐬𝐞 𝐨𝐟 𝐭𝐡𝐞 𝐡𝐚𝐫𝐝𝐰𝐚𝐫𝐞. 🛠️

Questions that seemed simple but had no clear answers: Why is 𝐌𝐨𝐄 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐬𝐥𝐨𝐰𝐞𝐫 𝐭𝐡𝐚𝐧 𝐝𝐞𝐧𝐬𝐞 𝐦𝐨𝐝𝐞𝐥𝐬? Which 𝐍𝐂𝐂𝐋 𝐟𝐥𝐚𝐠𝐬 should we actually set? How often should we checkpoint without killing throughput?

That's why we built 𝐓𝐡𝐞 𝐒𝐦𝐨𝐥 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐏𝐥𝐚𝐲𝐛𝐨𝐨𝐤 📖: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the 𝐢𝐧𝐟𝐫𝐚𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞 𝐥𝐚𝐲𝐞𝐫 that most teams get wrong.

We validated real vs theoretical bandwidth across the entire stack: 𝐇𝐁𝐌𝟑 𝐡𝐢𝐭𝐭𝐢𝐧𝐠 𝟑 𝐓𝐁/𝐬, 𝐍𝐕𝐋𝐢𝐧𝐤 𝟒.𝟎 𝐫𝐞𝐚𝐜𝐡𝐢𝐧𝐠 𝟕𝟖𝟔 𝐆𝐁/𝐬, 𝐏𝐂𝐈𝐞 𝐆𝐞𝐧𝟒 𝐚𝐭 𝟏𝟒.𝟐 𝐆𝐁/𝐬. Then we ran collective operations across 𝟏𝟐𝟖 𝐆𝐏𝐔𝐬 (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce drops from 𝟒𝟖𝟎 𝐆𝐁/𝐬 on a single node to 𝟑𝟐𝟎-𝟑𝟓𝟎 𝐆𝐁/𝐬 across 16 nodes.

If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.

𝐓𝐡𝐞 𝐒𝐦𝐨𝐥 𝐓𝐫𝐚𝐢𝐧𝐢𝐧𝐠 𝐏𝐥𝐚𝐲𝐛𝐨𝐨𝐤: https://lnkd.in/e5MKXUHS

Shared with ❤️ by the HuggingFace team
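As a quick check on the all-reduce numbers quoted in the post, the sketch below computes what fraction of single-node bandwidth survives at 16 nodes. It uses only the figures stated above (480 GB/s on one node, 320 to 350 GB/s across 16); the function name is hypothetical.

```python
def retained_fraction(multi_node_gb_s, single_node_gb_s):
    # Fraction of single-node all-reduce bandwidth still available at scale.
    return multi_node_gb_s / single_node_gb_s


single_node = 480.0                       # GB/s, from the post
multi_low, multi_high = 320.0, 350.0      # GB/s across 16 nodes, from the post

low = retained_fraction(multi_low, single_node)
high = retained_fraction(multi_high, single_node)
print(f"{low:.1%} to {high:.1%} of single-node bandwidth")
```

By these figures, roughly two thirds to three quarters of the single-node all-reduce bandwidth is retained across 16 nodes.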

liked a model 3 months ago

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 3.01M • 3.12k

liked 2 models 4 months ago

PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated Dec 11, 2025 • 15.7k • 1.53k

unsloth/Qwen3-VL-8B-Instruct-bnb-4bit

Image-Text-to-Text • 9B • Updated Oct 14, 2025 • 8.75k • 3

reacted to sergiopaniego's post with 🔥 4 months ago

Post

1496

Super nice intro to fine-tuning with TRL, just dropped by @google (runs free on Colab)!

They use SFT + QLoRA to fine-tune the tiny Gemma 3 270M model for emoji generation

Here’s what the fine-tuned model generates for the prompt: “I'm learning to tweet” → 🐦🗣💻

Colab: https://colab.research.google.com/github/google-gemini/gemma-cookbook/blob/main/Demos/Emoji-Gemma-on-Web/resources/Fine_tune_Gemma_3_270M_for_emoji_generation.ipynb
Try it out: google/emoji-gemma
Learn more: https://developers.googleblog.com/en/own-your-ai-fine-tune-gemma-3-270m-for-on-device/

reacted to hba123's post with 🔥 4 months ago

Post

4062

🤖 What if building your own robot arm costs less than £220?

For years, robotics has been locked behind high prices and complex systems.
So we decided to change that.

Today, we’re open-sourcing Ark-Bot — a fully 3D-printed, 6-DOF robot arm that works seamlessly with our Python robotics library, Ark.

And yes… It’s only £215.86 to build.

🧠 Ark-Bot Specs 🧠

1️⃣ Reach: 1 meter
2️⃣ Weight: 2.6 kg
3️⃣ Payload: 1.8 kg 💪
4️⃣ DOF: 6
5️⃣ Input Voltage: DC 12V

🤟Fully 3D-printable & open-source
🤟Integrated with Ark — no ROS required

📹 We’ve also released a video showing the full assembly process — because robotics should be something everyone can learn, build, and improve on.

👩‍🎓 With Ark-Bot, anyone — from students to AI researchers — can experiment with embodied AI, robot learning, and control algorithms on real hardware, affordably.

If you could control a 1-meter robot arm from your laptop for under £220…
👉 What would you build first?

🔗https://github.com/Robotics-Ark/ark_bot
🎥 https://www.youtube.com/watch?v=Kuk4pC0EaEw&feature=youtu.be


reacted to sergiopaniego's post with 🔥 4 months ago

Post

3007

A few days ago, Thinking Machines Lab released “LoRA Without Regret”, showing that LoRA can match full fine-tuning performance when configured right.

Naturally, we decided to reproduce the results with TRL and release a guide!

https://huggingface.co/docs/trl/main/en/lora_without_regret
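For context on why LoRA is attractive in the first place, here is a rough sketch of the trainable-parameter arithmetic for a single weight matrix. The layer size and rank are hypothetical values picked for illustration, not settings from the guide or from "LoRA Without Regret".

```python
def full_params(d_in, d_out):
    # Full fine-tuning updates every entry of the d_out x d_in weight matrix.
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    # LoRA trains two low-rank factors: A (rank x d_in) and B (d_out x rank).
    return rank * (d_in + d_out)


d_in = d_out = 4096   # hypothetical projection size
rank = 16             # hypothetical LoRA rank

ratio = lora_params(d_in, d_out, rank) / full_params(d_in, d_out)
print(f"LoRA trains {ratio:.2%} of this layer's parameters")
```

The count says nothing about final quality on its own; whether LoRA matches full fine-tuning depends on the configuration choices the guide covers.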

liked a model 4 months ago

chetwinlow1/Ovi

Image-to-Video • Updated Nov 15, 2025 • 278 • 286

upvoted an article 4 months ago

Article

Smol2Operator: Post-Training GUI Agents for Computer Use

+3

Sep 23, 2025

•

135

liked a dataset 4 months ago

Hcompany/WebClick

Viewer • Updated Jun 9, 2025 • 1.64k • 4.82k • 73
