MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows

MeanAudio is a novel text-to-audio generator that uses MeanFlow to synthesize realistic, faithful audio in only a few sampling steps. It achieves state-of-the-art performance in single-step audio generation and remains strong in multi-step generation.
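To illustrate what "few sampling steps" means for a mean-flow sampler, here is a minimal sketch. It is not MeanAudio's code. It assumes the common convention that t = 1 is pure noise and t = 0 is data, and that the model predicts an average velocity u(z, r, t) over the interval [r, t], so that one update is z_r = z_t - (t - r) * u(z_t, r, t). The names `mean_flow_sample` and `make_toy_u` are hypothetical, and the toy `u` is an analytic stand-in for a trained network.

```python
from typing import Callable, List

Vector = List[float]
# Hypothetical signature: u(z, r, t) returns the average velocity over [r, t].
AvgVelocity = Callable[[Vector, float, float], Vector]


def mean_flow_sample(u: AvgVelocity, noise: Vector, num_steps: int = 1) -> Vector:
    """Integrate from t=1 (noise) to t=0 (data) in num_steps uniform jumps."""
    z = list(noise)
    for i in range(num_steps):
        t = 1.0 - i / num_steps          # current time
        r = 1.0 - (i + 1) / num_steps    # target time of this jump
        vel = u(z, r, t)
        # Mean-flow update: move along the average velocity for a span of (t - r).
        z = [zi - (t - r) * vi for zi, vi in zip(z, vel)]
    return z


def make_toy_u(target: Vector) -> AvgVelocity:
    """Toy oracle (assumption): for the straight path z_t = x + t * (eps - x),
    the velocity is constant, so the average velocity equals (z_t - x) / t."""
    def u(z: Vector, r: float, t: float) -> Vector:
        return [(zi - xi) / t for zi, xi in zip(z, target)]
    return u


if __name__ == "__main__":
    target = [0.5, -1.0, 2.0]
    noise = [1.5, 0.3, -0.7]
    u = make_toy_u(target)
    print(mean_flow_sample(u, noise, num_steps=1))
    print(mean_flow_sample(u, noise, num_steps=4))
```

With an exact average velocity, a single jump from t = 1 to t = 0 already lands on the target. With a learned, imperfect predictor, additional steps trade speed for accuracy, which is the single-step versus multi-step distinction described above.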

arXiv | GitHub | Model | Space | Project Page

Demo inputs: Prompt, Duration, Guidance Scale, Sampling Steps, Model Variant, Seed