Audio-understanding benchmark covering speech, music, environmental sound, and audio reasoning tasks for audio-language models.
Accuracy is the reported evaluation metric for MMAU, and higher is better. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
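As a rough illustration of the metric, the sketch below computes accuracy as the percentage of questions answered correctly. It assumes the scores in the table are percentages on a 0 to 100 scale and that each question has one reference answer; the function name `accuracy` and the sample answers are hypothetical and are not taken from the MMAU evaluation code.

```python
def accuracy(predictions, references):
    """Return the percentage of predictions that exactly match the references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted answer equals the reference answer.
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical answers to four multiple-choice questions (illustrative only).
predictions = ["A", "C", "B", "D"]
references = ["A", "C", "D", "D"]

score = accuracy(predictions, references)
print(f"{score:.1f}")  # 3 of 4 correct -> 75.0
```

Exact-match comparison is the simplest choice here; a real evaluation script may first normalize model output (for example, extracting the chosen option letter) before comparing.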
| Rank | Model | Trust | Accuracy | Year | Links |
|---|---|---|---|---|---|
| 01 | Qwen3.5-Omni-Plus | unverified | 82.2 | 2026 | Paper |
| 02 | MiniCPM-o 4.5-Instruct | unverified | 76.9 | 2026 | Paper, Code |
| 03 | LongCat-Flash-Omni | unverified | 75.9 | 2025 | Paper, Code |
| 04 | Audio Flamingo 3 | unverified | 75.83 | 2025 | Paper, Code |