Model card
CQL (Conservative Q-Learning).
UC Berkeley · open-source
Conservative Q-Learning: adds a regularizer to the Q-function that penalizes Q-values for out-of-distribution actions.
Kumar et al. NeurIPS 2020. One of the most widely cited offline RL baselines.
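To make the regularizer concrete, here is a minimal sketch of the CQL(H) objective for a discrete action space, using only the Python standard library. It is an illustration under stated assumptions, not the authors' released implementation: the function names (logsumexp, cql_penalty, cql_loss), the list-of-lists batch layout, and the default alpha=1.0 are all hypothetical choices made for this example. The penalty is the log-sum-exp of Q over all actions minus Q at the action that appears in the dataset, so it grows when actions outside the data get high Q-values.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_penalty(q_values, dataset_actions):
    """Batch mean of logsumexp_a Q(s, a) - Q(s, a_data).

    q_values: one list per state, holding Q(s, a) for every discrete action.
    dataset_actions: index of the action recorded in the offline dataset.
    """
    gaps = [
        logsumexp(q_row) - q_row[a]
        for q_row, a in zip(q_values, dataset_actions)
    ]
    return sum(gaps) / len(gaps)


def cql_loss(q_values, dataset_actions, td_targets, alpha=1.0):
    """Standard squared TD error plus alpha times the conservative penalty.

    alpha is an assumed illustrative weight, not a recommended setting.
    """
    td_errors = [
        (q_row[a] - y) ** 2
        for q_row, a, y in zip(q_values, dataset_actions, td_targets)
    ]
    bellman = 0.5 * sum(td_errors) / len(td_errors)
    return bellman + alpha * cql_penalty(q_values, dataset_actions)


if __name__ == "__main__":
    # Two states, three actions each; the dataset only contains action 0.
    q = [[1.0, 0.5, 0.2], [0.3, 2.0, 0.1]]
    actions = [0, 0]
    targets = [1.0, 0.5]
    print("penalty:", cql_penalty(q, actions))
    print("loss:", cql_loss(q, actions, targets, alpha=1.0))
```

In the second state of the usage example, the unseen action 1 has the largest Q-value, so that state contributes most of the penalty. A continuous-action variant would need to approximate the log-sum-exp by sampling actions, which this sketch does not attempt.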
§ 01 · Benchmarks
No recorded benchmark results yet.
This model is in the registry but has no benchmark_results rows yet.
§ 04 · Related models