Model card
InternVL3-78B.
Shanghai AI Lab · open-source · 78B params · Vision-Language Model
§ 01 · Benchmarks
Every benchmark InternVL3-78B has a recorded score for.
| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | MMBench | Multimodal · Visual Question Answering | accuracy | 90.1% | #2 | 2025-01-22 | source ↗ |
| 02 | MME-VideoOCR | Computer Vision · General OCR Capabilities | total-accuracy | 67.2% | #3 | — | source ↗ |
| 03 | MMMU | Multimodal · Visual Question Answering | accuracy | 73.3% | #8 | 2025-01-22 | |
The Rank column shows this model’s position among all models scored on the same benchmark and metric. A rank of #1 marks the current SOTA. Rows are sorted by rank, then by newest result.
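The ordering rule (rank first, then newest result) can be sketched in Python. This is a minimal illustration using the three rows from the table; the dictionary layout, the `sort_key` helper, and the choice to place undated rows last within a rank are assumptions made for the example, not Codesota's actual implementation.

```python
from datetime import date

# Rows taken from the table; a missing date ("—") is stored as None.
rows = [
    {"benchmark": "MMMU", "rank": 8, "date": date(2025, 1, 22)},
    {"benchmark": "MME-VideoOCR", "rank": 3, "date": None},
    {"benchmark": "MMBench", "rank": 2, "date": date(2025, 1, 22)},
]

def sort_key(row):
    """Hypothetical key: lower rank first, then newer dates first.

    Undated rows sort last within the same rank (an assumption).
    """
    d = row["date"]
    newest_first = -d.toordinal() if d is not None else 0
    return (row["rank"], d is None, newest_first)

ordered = sorted(rows, key=sort_key)
names = [r["benchmark"] for r in ordered]
print(names)
```

Because no two rows here share a rank, the date tie-breaker never comes into play; it only matters when a model holds the same rank on several benchmarks.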
§ 02 · Strengths by area
Where InternVL3-78B actually performs.
§ 03 · Papers
1 paper with results for InternVL3-78B.
- 2025-01-22 · Multimodal · 2 results
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
§ 04 · Related models
Other Shanghai AI Lab models scored on Codesota.
§ 05 · Sources & freshness
Where these numbers come from.
- arxiv: 2 results
- alphaxiv-leaderboard: 1 result
1 of 3 rows marked verified.