MBPP+ extends MBPP with additional test cases. It uses 399 hand-verified problems from MBPP-sanitized.
Pass@1 is the reported evaluation metric for MBPP+. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better.
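As a rough illustration of how a Pass@1 score can be computed, the sketch below uses the widely cited unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples generated for a problem and c is the number that pass every test. The function names `pass_at_k` and `mean_pass_at_k` and the input layout are assumptions made for this example; they are not taken from the MBPP+ tooling or from Codesota, and individual sources may sample and score differently.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass all tests."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k draw contains a passing one.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over problems; results holds hypothetical (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical run: four problems, one sample each, two of them solved.
    example = [(1, 1), (1, 0), (1, 1), (1, 0)]
    print(f"pass@1 = {mean_pass_at_k(example, k=1):.2f}")
```

With a single sample per problem, the estimate reduces to the fraction of problems whose generated solution passes all tests, which is why a higher value is better.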