One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution Paper • 2511.17138 • Published Nov 21, 2025 • 2
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale Paper • 2504.16030 • Published Apr 22, 2025 • 37
AuraSR Collection Fastest super resolution model for AI generated images • 2 items • Updated Jul 30, 2024 • 7
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning Paper • 2509.24650 • Published Sep 29, 2025 • 6
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models Paper • 2604.00688 • Published 9 days ago • 7
Article Introducing Cohere-transcribe: state-of-the-art speech recognition 15 days ago • 35
Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST Paper • 2509.14128 • Published Sep 17, 2025 • 2
Demucs MLX — Music Source Separation Collection Demucs music stem separation for Apple Silicon. Float32 and float16 variants. • 2 items • Updated 24 days ago • 1
Granite Speech Models Collection Multilingual ASR and speech-to-text (STT) models for enterprise transcription and translation. • 6 items • Updated 9 days ago • 24
DeepFilterNet-MLX Collection MLX ports of the DeepFilterNet speech enhancement models for Apple Silicon • 7 items • Updated 27 days ago • 1
DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering Paper • 2110.05588 • Published Oct 11, 2021 • 1
DeepFilterNet2: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio Paper • 2205.05474 • Published May 11, 2022 • 1
DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement Paper • 2305.08227 • Published May 14, 2023 • 2
Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models Paper • 2305.09802 • Published May 16, 2023 • 1