Model card

LBC.

Tsinghua University / Baidu · open-source · Learnable Behavior Control (distributed off-policy actor-critic)

First to break 24 Atari human world records within 1B frames. ICLR 2023 Oral. Hybrid behavior mapping with bandit-based meta-controller.

§ 01 · Benchmarks

Every benchmark LBC has a recorded score for.

| # | Benchmark | Area · Task | Metric | Value | Rank | Date | Source |
|---|---|---|---|---|---|---|---|
| 01 | Atari 2600 | Reinforcement Learning · Atari Games | human-normalized-score | 10078.00 | #2/12 | | source ↗ |
Rank column shows this model’s position vs all other models scored on the same benchmark + metric (competitors after the slash). #1 in red means current SOTA. Sorted by rank, then newest result.
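
For readers unfamiliar with the metric named in the table, the following is a minimal sketch of how a human-normalized score is conventionally computed for Atari games: the agent's raw score is rescaled so that a random policy maps to 0 and the human baseline maps to 100. This page does not state how the 10078.00 value was aggregated, so the percentage scale and the mean over games are assumptions here, and the game names and scores below are hypothetical placeholders rather than LBC results.

```python
def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Return the human-normalized score as a percentage.

    0 corresponds to the random-policy baseline, 100 to the human baseline.
    """
    if human == random:
        raise ValueError("human and random baselines must differ")
    return 100.0 * (agent - random) / (human - random)


# Hypothetical per-game raw scores: (agent, random baseline, human baseline).
# These numbers are illustrative only and are not taken from any benchmark.
games = {
    "game_a": (300.0, 100.0, 200.0),
    "game_b": (50.0, 0.0, 100.0),
}

# Normalize each game separately, then aggregate (assumed here: plain mean).
per_game = {name: human_normalized_score(*scores) for name, scores in games.items()}
mean_hns = sum(per_game.values()) / len(per_game)

print(per_game)
print(mean_hns)
```

Because the mean is sensitive to a few games with very large normalized scores, leaderboards sometimes report the median as well; which aggregate this page uses is not specified.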
§ 02 · Strengths by area

Where LBC actually performs.

Reinforcement Learning · 1 benchmark · avg rank #2.0

§ 05 · Sources & freshness

Where these numbers come from.

unknown · 1 result

0 of 1 rows marked verified.