Model card
InternVL3-78B.
Shanghai AI Lab · open-source · 78B params · Vision-Language Model
§ 02 · Benchmarks
Every benchmark InternVL3-78B has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | MMBench | Multimodal · Visual Question Answering | accuracy | 90.1% | #3 | 2025-01-22 | source ↗ |
| 02 | MME-VideoOCR | Computer Vision · General OCR Capabilities | total-accuracy | 67.2% | #3 | — | source ↗ |
| 03 | MMMU | Multimodal · Visual Question Answering | accuracy | 73.3% | #10 | 2025-01-22 | |
| 04 | MMMU | Multimodal · Visual Question Answering | accuracy | 72.2% | #11 | — | source ↗ |
| 05 | MMMU | Multimodal · Image-Text-to-Text | accuracy | 72.2% | #13 | — | source ↗ |
| 06 | Video-MME | Multimodal · Video Understanding | accuracy | 72.7% | #14 | — | source ↗ |
The Rank column shows this model's position among all models scored on the same benchmark and metric. #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
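The ordering described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the `rows` tuples are copied from the table, `sort_key` is a hypothetical helper, and placing undated rows after dated ones within the same rank is an assumption inferred from the table.

```python
from datetime import date

# Rows copied from the table above: (benchmark, rank, result date or None).
# Listed here in shuffled order so the sort has something to do.
rows = [
    ("Video-MME", 14, None),
    ("MMMU", 10, date(2025, 1, 22)),
    ("MME-VideoOCR", 3, None),
    ("MMBench", 3, date(2025, 1, 22)),
    ("MMMU", 13, None),
    ("MMMU", 11, None),
]


def sort_key(row):
    """Hypothetical key: rank ascending, then newest date first."""
    _benchmark, rank, result_date = row
    # Assumption: rows without a date sort after dated rows of the same rank.
    undated = result_date is None
    # Negate the ordinal so that newer dates compare as smaller.
    recency = -result_date.toordinal() if result_date else 0
    return (rank, undated, recency)


ordered = sorted(rows, key=sort_key)
for benchmark, rank, result_date in ordered:
    print(f"#{rank:<3} {benchmark:<14} {result_date or 'undated'}")
```

With this key, the two #3 rows come out with MMBench (dated) ahead of MME-VideoOCR (undated), matching the order in the table.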
§ 03 · Strengths by area
Where InternVL3-78B actually performs.
§ 04 · Papers
2 papers with results for InternVL3-78B.
- 2025-04-14 · 3 results
  InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
- 2025-01-22 · Multimodal · 2 results
  InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
§ 05 · Related models
Other Shanghai AI Lab models scored on Codesota.
§ 06 · Sources & freshness
Where these numbers come from.
- pwc-dump: 3 results
- arxiv: 2 results
- alphaxiv-leaderboard: 1 result
1 of 6 rows marked verified.