Model card
Llama 3.2 Vision 90B.
Meta · open-source · Unknown params · Llama 3.1 + cross-attention vision adapter
A 90B-parameter vision-language model from the Llama 3.2 family. Released September 2024. Source: Llama 3 paper (arXiv:2407.21783).
§ 01 · Benchmarks
Every benchmark Llama 3.2 Vision 90B has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | TextVQA | Multimodal · Visual Question Answering | accuracy | 83.4% | #4 | 2024-07-31 | source ↗ |
| 02 | MMMU | Multimodal · Visual Question Answering | accuracy | 60.3% | #16 | 2024-07-31 | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric (number of competitors after the slash). #1 in red marks the current SOTA. Rows are sorted by rank, then by newest result.
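The ordering described in the note above can be sketched in a few lines of Python. This is only an illustration, not Codesota's actual implementation: the `rank_of` and `sort_rows` helpers are hypothetical names, and the tie rule (rank is one plus the number of strictly higher scores) is an assumption. The two rows are the ones recorded on this card.

```python
from datetime import date

# The two rows recorded on this card:
# (benchmark, metric, value in %, rank, result date)
rows = [
    ("MMMU", "accuracy", 60.3, 16, date(2024, 7, 31)),
    ("TextVQA", "accuracy", 83.4, 4, date(2024, 7, 31)),
]


def rank_of(score, competitor_scores):
    """Assumed rule: 1 + number of competitors with a strictly higher score."""
    return 1 + sum(1 for s in competitor_scores if s > score)


def sort_rows(rows):
    """Sort by rank ascending, then by result date descending (newest first)."""
    return sorted(rows, key=lambda r: (r[3], -r[4].toordinal()))


ordered = sort_rows(rows)
for benchmark, metric, value, rank, when in ordered:
    print(f"{benchmark:8} {metric:9} {value:5.1f}%  #{rank:<3} {when.isoformat()}")
```

Sorting on a `(rank, negated date ordinal)` tuple keeps the two-level ordering in a single pass. Any competitor scores passed to `rank_of` would have to come from the leaderboard itself; none are listed on this card.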
§ 02 · Strengths by area
Where Llama 3.2 Vision 90B actually performs.
§ 03 · Papers
1 paper with results for Llama 3.2 Vision 90B.
- 2024-07-31 · Natural Language Processing · 2 results
The Llama 3 Herd of Models
§ 04 · Related models
Other Meta models scored on Codesota.
§ 05 · Sources & freshness
Where these numbers come from.
arXiv: 2 results
2 of 2 rows marked verified.