Codesota · Models · Phi-4 Multimodal Instruct
9 results · 8 benchmarks
Model card

Phi-4 Multimodal Instruct.

§ 02 · Benchmarks

Every benchmark Phi-4 Multimodal Instruct has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | LibriSpeech | Speech · Speech Recognition | wer | 3.8% | #41/92 | | source ↗ |
| 02 | Open ASR Leaderboard | Speech · Speech Recognition | rtfx | 151.10 | #41/50 | | source ↗ |
| 03 | SPGISpeech | Speech · Speech Recognition | wer | 3.1% | #42/56 | | source ↗ |
| 04 | AMI-IHM | Speech · Speech Recognition | wer | 11.1% | #45/50 | | source ↗ |
| 05 | GigaSpeech | Speech · Speech Recognition | wer | 9.3% | #46/47 | | source ↗ |
| 06 | TED-LIUM | Speech · Speech Recognition | wer | 2.9% | #46/50 | | source ↗ |
| 07 | Open ASR Leaderboard | Speech · Speech Recognition | wer | 6.0% | #47/52 | | source ↗ |
| 08 | Earnings-22 | Speech · Speech Recognition | wer | 10.2% | #48/50 | | source ↗ |
| 09 | VoxPopuli | Speech · Speech Recognition | wer | 6.0% | #48/55 | | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric (competitors after the slash). A #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
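The rank notation and sort order described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's implementation: the names `Rank`, `parse_rank`, and `sort_by_rank` are hypothetical, and because no dates are recorded for these rows, the sketch sorts by rank only and relies on a stable sort to keep ties in their listed order.

```python
import re
from typing import NamedTuple


class Rank(NamedTuple):
    position: int  # this model's position on the benchmark + metric
    field: int     # the number after the slash


# Matches strings of the form "#41/92".
RANK_RE = re.compile(r"^#(\d+)/(\d+)$")


def parse_rank(text: str) -> Rank:
    """Split a rank string such as '#41/92' into its two integers."""
    match = RANK_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a rank string: {text!r}")
    return Rank(int(match.group(1)), int(match.group(2)))


# (benchmark, metric, rank) copied from the table above.
ROWS = [
    ("LibriSpeech", "wer", "#41/92"),
    ("Open ASR Leaderboard", "rtfx", "#41/50"),
    ("SPGISpeech", "wer", "#42/56"),
    ("AMI-IHM", "wer", "#45/50"),
    ("GigaSpeech", "wer", "#46/47"),
    ("TED-LIUM", "wer", "#46/50"),
    ("Open ASR Leaderboard", "wer", "#47/52"),
    ("Earnings-22", "wer", "#48/50"),
    ("VoxPopuli", "wer", "#48/55"),
]


def sort_by_rank(rows):
    """Sort rows by rank position; sorted() is stable, so ties keep their order."""
    return sorted(rows, key=lambda row: parse_rank(row[2]).position)


if __name__ == "__main__":
    for benchmark, metric, rank in sort_by_rank(ROWS):
        parsed = parse_rank(rank)
        print(f"{benchmark} ({metric}): position {parsed.position}, after slash {parsed.field}")
```

Since the table is already listed in rank order, `sort_by_rank(ROWS)` returns the rows unchanged.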
§ 03 · Strengths by area

Where Phi-4 Multimodal Instruct actually performs.

Speech: 8 benchmarks, avg rank #44.9
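The average rank can be checked by hand from the benchmark table. The sketch below assumes the figure is a plain arithmetic mean over all nine result rows (the page does not state how it is computed); under that assumption the sum of positions is 404, and 404 / 9 rounds to 44.9. The names `RANK_POSITIONS` and `average_rank` are illustrative only.

```python
# Rank positions of the nine result rows, copied from the benchmark table.
RANK_POSITIONS = [41, 41, 42, 45, 46, 46, 47, 48, 48]


def average_rank(positions: list[int]) -> float:
    """Arithmetic mean of rank positions (assumed averaging method)."""
    if not positions:
        raise ValueError("no rank positions given")
    return sum(positions) / len(positions)


if __name__ == "__main__":
    print(f"avg rank #{average_rank(RANK_POSITIONS):.1f}")  # avg rank #44.9
```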
§ 04 · Papers

1 paper with results for Phi-4 Multimodal Instruct.

  1. 2025-03-03 · 9 results

    Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

§ 06 · Sources & freshness

Where these numbers come from.

pwc-dump: 9 results. 0 of 9 rows marked verified.