Accuracy is the reported evaluation metric for MVBench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
Muted rows were not state of the art when published — an earlier or same-year result already scored better.