Leaderboard
| Rank | Model | Rating | 95% CI | Win Rate | W/L/D | Trend |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.5 (Anthropic) | 1510 | [1495, 1525] | 78% | 234/45/21 | |
| 2 | GPT-5 (OpenAI) | 1478 | [1460, 1496] | 66% | 198/67/35 | |
| 3 | Gemini 2 Ultra (Google) | 1455 | [1438, 1472] | 62% | 187/78/35 | |
| 4 | DeepSeek V4 (DeepSeek) | 1432 | [1414, 1450] | 55% | 165/89/46 | |
| 5 | Llama 4 405B (Meta) | 1398 | [1378, 1418] | 48% | 145/105/50 | |
| 6 | Qwen 3 Max (Alibaba) | 1385 | [1365, 1405] | 44% | 132/118/50 | |
| 7 | Claude Sonnet 4 (Anthropic) | 1365 | [1345, 1385] | 40% | 120/125/55 | |
| 8 | Grok 3 (xAI) | 1342 | [1320, 1364] | 36% | 108/142/50 | |

| Metric | Value |
|---|---|
| Total Models | 8 |
| Total Matches | 1200 |
| Highest Rating | 1510 |
| Last Updated | Today |
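
The page does not say how the Win Rate column is computed. The listed values look consistent with wins divided by all games played (wins + losses + draws), rounded to a whole percent. The sketch below checks that reading against the rows above. The function names `win_rate_percent` and `total_matches` are hypothetical, and the match count rests on a further assumption: every match is played between two models on this leaderboard, so each match is counted twice across the W/L/D column.

```python
# Rows copied from the leaderboard: (model, wins, losses, draws, listed win rate in %).
ROWS = [
    ("Claude Opus 4.5", 234, 45, 21, 78),
    ("GPT-5", 198, 67, 35, 66),
    ("Gemini 2 Ultra", 187, 78, 35, 62),
    ("DeepSeek V4", 165, 89, 46, 55),
    ("Llama 4 405B", 145, 105, 50, 48),
    ("Qwen 3 Max", 132, 118, 50, 44),
    ("Claude Sonnet 4", 120, 125, 55, 40),
    ("Grok 3", 108, 142, 50, 36),
]


def win_rate_percent(wins: int, losses: int, draws: int) -> int:
    """Win rate as a whole percent, assuming draws count toward games played."""
    total = wins + losses + draws
    if total == 0:
        raise ValueError("no games played")
    return round(100 * wins / total)


def total_matches(rows) -> int:
    """Match count, assuming each match involves exactly two listed models."""
    appearances = sum(w + l + d for _, w, l, d, _ in rows)
    return appearances // 2


if __name__ == "__main__":
    for model, w, l, d, listed in ROWS:
        computed = win_rate_percent(w, l, d)
        status = "matches" if computed == listed else "differs from"
        print(f"{model}: computed {computed}% {status} listed {listed}%")
    print("Total matches under the two-model assumption:", total_matches(ROWS))
```

Under these assumptions, every row reproduces its listed Win Rate, and the derived match count agrees with the Total Matches figure of 1200.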