Codesota · Models · Llama 3 70B · Meta · 11 results · 11 benchmarks
Model card

Llama 3 70B.

Meta · open-source · LLM

Meta Llama 3, 70B-parameter instruct variant. Released April 2024.

§ 01 · Benchmarks

Every benchmark Llama 3 70B has a recorded score for.

| #  | Benchmark     | Area · Task                                               | Metric   | Value | Rank   | Date       | Source   |
|----|---------------|-----------------------------------------------------------|----------|-------|--------|------------|----------|
| 01 | CommonsenseQA | Reasoning · Commonsense Reasoning                         | accuracy | 80.9% | #3/3   |            | source ↗ |
| 02 | MAWPS         | Reasoning · Arithmetic Reasoning                          | accuracy | 94.1% | #3/3   |            | source ↗ |
| 03 | SVAMP         | Reasoning · Arithmetic Reasoning                          | accuracy | 89.5% | #3/3   |            | source ↗ |
| 04 | WinoGrande    | Reasoning · Commonsense Reasoning                         | accuracy | 85.3% | #3/3   |            | source ↗ |
| 05 | HellaSwag     | Reasoning · Commonsense Reasoning                         | accuracy | 88.0% | #5/5   |            | source ↗ |
| 06 | CoNLL-2003    | Natural Language Processing · Named Entity Recognition    | f1       | 89.3% | #6/7   | 2024-07-31 | source ↗ |
| 07 | SNLI          | Natural Language Processing · Natural Language Inference  | accuracy | 89.7% | #7/8   | 2024-07-31 | source ↗ |
| 08 | ARC-Challenge | Reasoning · Commonsense Reasoning                         | accuracy | 93.0% | #10/10 |            | source ↗ |
| 09 | SQuAD v2.0    | Natural Language Processing · Question Answering          | f1       | 85.3% | #20/22 | 2024-07-31 | source ↗ |
| 10 | GSM8K         | Reasoning · Mathematical Reasoning                        | accuracy | 93.0% | #23/32 |            | source ↗ |
| 11 | HumanEval     | Computer Code · Code Generation                           | pass@1   | 81.7% | #34/42 |            | source ↗ |

The Rank column shows this model's position among all models scored on the same benchmark and metric; the number after the slash is the number of competitors. #1 in red means current SOTA. Rows are sorted by rank, then by newest result.

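For readers who want to work with the rank cells programmatically, here is a minimal sketch of splitting a cell such as `#23/32` into its two numbers. The function names `parse_rank` and `is_sota` are hypothetical and not part of any Codesota API; the only convention taken from the page is that position #1 means current SOTA.

```python
import re

def parse_rank(cell: str) -> tuple[int, int]:
    """Split a rank cell such as '#23/32' into (position, number_after_slash)."""
    match = re.fullmatch(r"#(\d+)/(\d+)", cell.strip())
    if match is None:
        raise ValueError(f"unrecognized rank cell: {cell!r}")
    return int(match.group(1)), int(match.group(2))

def is_sota(cell: str) -> bool:
    """Position #1 is the current state of the art, per the note above."""
    position, _ = parse_rank(cell)
    return position == 1

if __name__ == "__main__":
    for cell in ["#3/3", "#6/7", "#34/42"]:
        print(cell, parse_rank(cell), is_sota(cell))
```

None of the 11 rows for Llama 3 70B is at position #1, so `is_sota` is false for every rank cell in the table.
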
§ 02 · Strengths by area

Where Llama 3 70B actually performs.

Reasoning · 7 benchmarks · avg rank #7.1
Natural Language Processing · 3 benchmarks · avg rank #11.0
Computer Code · 1 benchmark · avg rank #34.0
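
The per-area averages above are consistent with a plain arithmetic mean of the rank positions in the benchmark table. The sketch below reproduces them under that assumption; the page does not state how the averages are computed, and `RANKS` and `average_rank_by_area` are illustrative names.

```python
from collections import defaultdict

# (area, rank position) pairs copied from the benchmark table in § 01.
RANKS = [
    ("Reasoning", 3),                     # CommonsenseQA
    ("Reasoning", 3),                     # MAWPS
    ("Reasoning", 3),                     # SVAMP
    ("Reasoning", 3),                     # WinoGrande
    ("Reasoning", 5),                     # HellaSwag
    ("Natural Language Processing", 6),   # CoNLL-2003
    ("Natural Language Processing", 7),   # SNLI
    ("Reasoning", 10),                    # ARC-Challenge
    ("Natural Language Processing", 20),  # SQuAD v2.0
    ("Reasoning", 23),                    # GSM8K
    ("Computer Code", 34),                # HumanEval
]

def average_rank_by_area(rows):
    """Group rank positions by area and return the mean, rounded to one decimal."""
    grouped = defaultdict(list)
    for area, position in rows:
        grouped[area].append(position)
    return {area: round(sum(p) / len(p), 1) for area, p in grouped.items()}

if __name__ == "__main__":
    for area, avg in average_rank_by_area(RANKS).items():
        print(f"{area}: avg rank #{avg}")
```

A mean of positions ignores how many models were scored on each benchmark, so a #3 out of 3 and a #3 out of 30 count the same.
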
§ 03 · Papers

1 paper with results for Llama 3 70B.

  1. 2024-07-31 · Natural Language Processing · 3 results

    The Llama 3 Herd of Models

§ 04 · Related models

Other Meta models scored on Codesota.

DeiT-B Distilled · 86M params · 2 results · 1 SOTA
Llama 3.1 405B · 6 results
Llama-4-Maverick · 400B total / 17B active (128 experts) params · 6 results
Llama 3.1 70B · 4 results
Code Llama 34B · Unknown params · 2 results
ConvNeXt V2 Huge · 650M params · 2 results
CodeLlama 70B · 70B params · 1 result
ConvNeXt V2 Base · 89M params · 1 result
§ 05 · Sources & freshness

Where these numbers come from.

meta-blog · 7 results
arxiv · 3 results
openai-simple-evals · 1 result
3 of 11 rows marked verified.