PhysicianBench is a machine learning benchmark indexed on Codesota. This page tracks published model results, the top score per metric, and the state-of-the-art (SOTA) timeline for PhysicianBench.
Pass@1 is the reported evaluation metric for PhysicianBench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher scores are better.
| Rank | Model | Trust | Score (Pass@1) | Year | Links |
|---|---|---|---|---|---|
| 01 | GPT-5.5 | verified | 46.3 | N/A | Source ↗ |
| 02 | DeepSeek V4-Pro | verified | 18.7 | N/A | Source ↗ |
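
As a rough illustration of the metric, Pass@1 is commonly computed as the share of tasks solved on the first attempt. The sketch below assumes that definition and a 0 to 100 scale; neither is confirmed by this page, and the function name `pass_at_1` and the task IDs are hypothetical, not PhysicianBench data or tooling.

```python
from typing import Mapping


def pass_at_1(first_attempt_passed: Mapping[str, bool]) -> float:
    """Return the percentage of tasks whose first attempt passed.

    Assumption: one attempt per task, reported on a 0-100 scale.
    """
    if not first_attempt_passed:
        raise ValueError("no task results given")
    # Count the tasks whose single recorded attempt succeeded.
    solved = sum(1 for ok in first_attempt_passed.values() if ok)
    return 100.0 * solved / len(first_attempt_passed)


# Hypothetical outcomes for four made-up tasks (not benchmark data).
example = {"task-a": True, "task-b": False, "task-c": False, "task-d": True}
print(pass_at_1(example))  # 2 of 4 tasks passed -> 50.0
```

Under this reading, a higher value means more tasks were solved on the first try, which matches the "higher is better" direction stated for the leaderboard.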