T2I Models - a Stalin16 Collection

updated about 9 hours ago

yandex/stable-diffusion-3.5-medium-alchemist

Text-to-Image • Updated May 16, 2025 • 13 • 6
Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29, 2025 • 61
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Paper • 2507.01953 • Published Jul 2, 2025 • 18
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Paper • 2507.01945 • Published Jul 2, 2025 • 76
4KAgent: Agentic Any Image to 4K Super-Resolution

Paper • 2507.07105 • Published Jul 9, 2025 • 106
T-LoRA: Single Image Diffusion Model Customization Without Overfitting

Paper • 2507.05964 • Published Jul 8, 2025 • 120
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

Paper • 2507.07136 • Published Jul 9, 2025 • 40
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining

Paper • 2507.14119 • Published Jul 18, 2025 • 60
DesignLab: Designing Slides Through Iterative Detection and Correction

Paper • 2507.17202 • Published Jul 23, 2025 • 51
PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation

Paper • 2507.16116 • Published Jul 22, 2025 • 12
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts

Paper • 2507.20939 • Published Jul 28, 2025 • 57
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again

Paper • 2507.22058 • Published Jul 29, 2025 • 40
Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 270
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Paper • 2508.07981 • Published Aug 11, 2025 • 63
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14, 2025 • 145
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28, 2025 • 89
Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 109
OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

Paper • 2510.26213 • Published Oct 30, 2025 • 10
Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Paper • 2510.25760 • Published Oct 29, 2025 • 17
One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models

Paper • 2511.10629 • Published Nov 13, 2025 • 127
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19, 2025 • 231
Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 69
Light-X: Generative 4D Video Rendering with Camera and Illumination Control

Paper • 2512.05115 • Published Dec 4, 2025 • 11
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance

Paper • 2512.08765 • Published Dec 9, 2025 • 132
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality

Paper • 2512.07951 • Published Dec 8, 2025 • 50
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

Paper • 2512.06065 • Published Dec 5, 2025 • 29
Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 105
Few-Step Distillation for Text-to-Image Generation: A Practical Guide

Paper • 2512.13006 • Published Dec 15, 2025 • 8
EgoX: Egocentric Video Generation from a Single Exocentric Video

Paper • 2512.08269 • Published Dec 9, 2025 • 119
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published 14 days ago • 46
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss

Paper • 2602.02493 • Published 1 day ago • 29
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Paper • 2602.02437 • Published 1 day ago • 68
Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation

Paper • 2602.01756 • Published 1 day ago • 21