Smol AI WorldCup: A 5-Axis Benchmark That Reveals What Small Language Models Can Really Do • 19 days ago • 38
Structural Problems in AI Benchmarking and the Case for a Unified Evaluation Framework • 21 days ago • 12
MARL: Runtime Middleware That Reduces LLM Hallucination Without Fine-Tuning • 20 days ago • 15
Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism? • Feb 24 • 17