Stanford AI

university

https://www.ai.stanford.edu

AI & ML interests

None defined yet.

Recent Activity

awwkl submitted a paper about 1 month ago

Zero-shot World Models Are Developmentally Efficient Learners

qizhengz authored a paper about 1 month ago

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live

qizhengz authored a paper about 1 month ago

FrontierCS: Evolving Challenges for Evolving Intelligence

Papers

Sparse Reward Subsystem in Large Language Models

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

awwkl

submitted a paper to Daily Papers about 1 month ago

Zero-shot World Models Are Developmentally Efficient Learners

Paper • 2604.10333 • Published Apr 11 • 7

efecelik

posted an update 3 months ago

Post

3117

The moment we've been waiting for — ACE-Step dropped their new model: Ace-Step 1.5 🎉
🔗 ACE-Step/Ace-Step1.5
And the best part? It's released under the MIT license.
We've already started integrating it into our project. Let's go 🚀

1 reply

Xkev

submitted a paper to Daily Papers 3 months ago

Sparse Reward Subsystem in Large Language Models

Paper • 2602.00986 • Published Feb 1 • 13

efecelik

posted an update 4 months ago

Post

1418

🎮 Introducing: Paper Popularity Game

Think you know which AI papers go viral? Test your instincts!
I built a little game where you try to guess the popularity of AI research papers from the Hugging Face Daily Papers feed.

How it works:
You'll see two papers side by side—read the titles, check the abstracts, and pick which one you think got more upvotes from the HF community.

It's a great way to discover trending AI research while having fun, and it tests your intuition about what the ML community finds interesting.

Try it out:
efecelik/paper-popularity-game
Would love to hear your high scores and feedback!

efecelik

posted an update 4 months ago

Post

1642

Interesting paper: PhysRVG

The core idea: instead of treating physics as a soft condition the model can work around during optimization, enforce it strictly via reinforcement learning. The paper focuses on rigid body dynamics - collisions, pendulums, free fall, rolling.

PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models (2601.11087)

2 replies

efecelik

posted an update 4 months ago

Post

639

Having multiple perspectives helps me create more diverse, innovative projects, but without deep mastery in one area, I never feel truly satisfied.

What's the better investment: going deep in one field, or staying broad across many?

2 replies

efecelik

posted an update 4 months ago

Post

2537

My First MCP Server: DataView
Browse HuggingFace datasets directly from your AI assistant.
- Search & filter datasets
- View rows & stats
- SQL queries & Parquet export
efecelik/dataview-mcp

efecelik

posted an update 4 months ago

Post

252

We Built a Music App with ACE-Step – Looking for Feedback

Hey everyone,

We've been building AceSteps, a platform where anyone can create music using the ACE-Step model (ACE-Step/ACE-Step-v1-3.5B). You can mint your tracks as NFTs, tokenize them into 100,000 fractional shares, and trade them on Uniswap V4. When your song gets popular, token holders earn from ad revenue automatically. It's a Farcaster Mini-App on Base Network.

But we want to make it better, and we'd love your input:

What's the one feature that would make you actually use an AI music tool regularly?
And any suggestions on how we can make this model better? That question is actually why I'm sharing this here. 🤗

Any feedback, ideas, or critiques are welcome.
🔗 https://docs.acesteps.com/
🔗 https://docs.acesteps.com/pitch-deck.html
🔗 https://farcaster.xyz/?launchFrameUrl=https%3A%2F%2Fwww.acesteps.com%2F
🔗 https://www.acesteps.com

efecelik

posted an update 4 months ago

Post

2320

Why isn't the ACE-Step model more popular? IMO it makes really good music.
ACE-Step/ACE-Step-v1-3.5B

2 replies

KingNish

posted an update 5 months ago

Post

3638

Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient-norm spikes made training unstable. MuonClip (Kimi K2's clipping) stabilizes long pretraining runs, yet in our small-scale fine-tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.
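
To make the hybrid split concrete, here is a minimal sketch of how parameters might be partitioned by tensor rank before being handed to two optimizers. It is not the code from the blog: the function name `split_params_by_rank` and the example parameter names and shapes are hypothetical, and the sketch only works on shape tuples so it runs without any ML framework. Some Muon setups also route embedding and output-head matrices to AdamW even though they are 2D; that refinement is not described in the post and is left out here.

```python
from typing import Dict, List, Tuple


def split_params_by_rank(
    shapes: Dict[str, Tuple[int, ...]],
) -> Tuple[List[str], List[str]]:
    """Partition parameter names into a Muon group and an AdamW group.

    Assumption for illustration: a parameter goes to Muon only if its
    shape has exactly two dimensions; everything else goes to AdamW.
    """
    muon_group: List[str] = []
    adamw_group: List[str] = []
    for name, shape in shapes.items():
        if len(shape) == 2:
            muon_group.append(name)   # weight matrices
        else:
            adamw_group.append(name)  # biases, norm scales, other ranks
    return muon_group, adamw_group


# Hypothetical parameter names and shapes, not taken from any real model.
example_shapes = {
    "layers.0.attn.q_proj.weight": (2560, 2560),
    "layers.0.attn.q_proj.bias": (2560,),
    "layers.0.norm.weight": (2560,),
    "layers.0.mlp.up_proj.weight": (9728, 2560),
}

muon_names, adamw_names = split_params_by_rank(example_shapes)
print("Muon :", muon_names)
print("AdamW:", adamw_names)
```

In a real training script, each group of names would be mapped back to its tensors and passed to the corresponding optimizer, with both optimizers stepped every iteration.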

Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1

KingNish

posted an update 5 months ago

Post

2812

I tested Muon vs MuonClip vs Muon+AdamW for fine-tuning LLMs
I just published a blog post about it. Read it here 👉 https://huggingface.co/blog/KingNish/optimizer-part1

1 reply

kzliu

authored a paper 9 months ago

UQ: Assessing Language Models on Unsolved Questions

Paper • 2508.17580 • Published Aug 25, 2025 • 15

KingNish

posted an update 10 months ago

Post

2233

Wan 2.2 Fast: up to 10x faster than the original Wan 2.2

Model: FastVideo/FastWan2.2-TI2V-5B-FullAttn-Diffusers

Space: KingNish/wan2-2-fast

KingNish

posted an update 11 months ago

Post

1229

What's currently the biggest gap in open-source datasets?

5 replies

Kameshr

updated a dataset about 1 year ago

Stanford/Compiled_COT

Viewer • Updated Mar 15, 2025 • 2.23M • 209 • 2

Kameshr

published a dataset about 1 year ago

Stanford/Compiled_COT

Viewer • Updated Mar 15, 2025 • 2.23M • 209 • 2

not-lain

posted an update about 1 year ago

Post

7983

🚀AraClip is now fully integrated with Hugging Face 🤗

AraClip is a specialized CLIP model that was created by @pain and optimized for Arabic text-image retrieval tasks🔥

🔗 Try it out 🔗
🤖 model: Arabic-Clip/araclip
🧩 Gradio demo: Arabic-Clip/Araclip-Simplified
🌐 website: https://arabic-clip.github.io/Arabic-CLIP/

2 replies

not-lain

posted an update over 1 year ago

Post

4594

I have just released a new blog post about KV caching and its role in inference speedup 🚀
🔗 https://huggingface.co/blog/not-lain/kv-caching/
some takeaways :

4 replies

not-lain

posted an update over 1 year ago

Post

1840

We now have more than 2,000 public AI models using ModelHubMixin 🤗

not-lain

posted an update over 1 year ago

Post

4183

Published a new blog post 📖
In this blog post I walk through the transformer architecture, emphasizing how tensor shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :

Team members 443
