LLM Leaderboard — June 2026
Large language models ranked by LMSys Arena Elo, MMLU, HumanEval, MATH, pricing, and inference speed. Refreshed regularly with live data from official provider pricing pages, Artificial Analysis, and the Arena.
What is "the best LLM" in June 2026?
The honest answer is that it depends on the workload. For chat and general reasoning, the LMSys text Arena leader rotates monthly; the June 2026 snapshot below shows the current top model and its Elo. For coding-specific work, the LMSys coding Arena has its own leader. For value (quality per dollar), open-weight models under permissive licenses still win by a wide margin. The race at the top is tighter than at any point since the original GPT-4 launch, and switching costs, not capability, are now the buyer's biggest risk.
| # | Model | Quality | Arena Elo | Speed | Price (input / output) | Context | Value | Released |
|---|---|---|---|---|---|---|---|---|
| 1 | Anthropic · Frontier agentic coding & knowledge work | 100 | 1525 | 58 t/s | $10 / $50 | 1M | 3.3 | Jun 2026 |
| 2 | Anthropic · Coding, agents & computer use | 99 | 1512 | 72 t/s | $5 / $25 | 1M | 6.6 | May 2026 |
| 3 | OpenAI · Reasoning at any cost | 98 | 1510 | 68 t/s | $30 / $180 | 1M | 0.9 | Apr 2026 |
| 4 | OpenAI · Frontier general purpose | 97 | 1506 | 70 t/s | $5 / $30 | 1M | 5.5 | Apr 2026 |
| 5 | OpenAI · Complex analysis | 97 | — | — | $30 / $180 | 1M | 0.9 | Mar 2026 |
| 6 | OpenAI · Complex analysis | 97 | — | — | $21 / $168 | 400K | 1.0 | Dec 2025 |
| 7 | Anthropic · Complex analysis | 97 | — | — | $30 / $150 | 1M | 1.1 | May 2026 |
| 8 | Anthropic · Coding & agentic workflows | 96 | 1505 | 68 t/s | $5 / $25 | 1M | 6.4 | Apr 2026 |
| 9 | OpenAI · Deep research | 96 | — | — | $10 / $40 | 200K | 3.8 | Oct 2025 |
| 10 | OpenAI · Deep research | 96 | — | — | $2 / $8 | 200K | 19.2 | Oct 2025 |
| 11 | OpenAI · Hard reasoning | 96 | — | — | $20 / $80 | 200K | 1.9 | Jun 2025 |
| 12 | Google · Speed & cost | 96 | 1505 | — | $2 / $12 | 1M | 13.7 | Feb 2026 |
| 13 | Google · Science & long-context | 96 | 1505 | 131 t/s | $2 / $12 | 1M | 13.7 | Apr 2026 |
| 14 | Anthropic · General purpose | 95 | 1490 | — | $5 / $25 | 1M | 6.3 | Feb 2026 |
| 15 | Anthropic · General purpose | 95 | — | — | $5 / $25 | 200K | 6.3 | Nov 2025 |
| 16 | Anthropic · Complex analysis | 95 | — | — | $30 / $150 | 1M | 1.1 | Apr 2026 |
| 17 | Google · Image generation | 94 | — | — | $2 / $12 | 66K | 13.4 | Nov 2025 |
| 18 | Anthropic · Multimodal | 94 | — | — | $15 / $75 | 200K | 2.1 | Aug 2025 |
| 19 | OpenAI · Hard reasoning | 94 | 1370 | 68 t/s | $10 / $40 | 200K | 3.8 | Apr 2025 |
| 20 | Alibaba Cloud · Long autonomous agentic runs | 94 | 1488 | 90 t/s | $2.5 / $7.5 | 1M | 18.8 | May 2026 |
| 21 | xAI · Agentic tasks & real-time info | 93 | 1496 | 83 t/s | $1.25 / $2.5 | 1M | 49.6 | May 2026 |
| 22 | OpenAI · General purpose | 93 | 1495 | — | $2.5 / $15 | 1M | 10.6 | Mar 2026 |
| 23 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 128K | 11.8 | Mar 2026 |
| 24 | OpenAI · Code generation | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Feb 2026 |
| 25 | OpenAI · Code generation | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Jan 2026 |
| 26 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 128K | 11.8 | Dec 2025 |
| 27 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Dec 2025 |
| 28 | OpenAI · Code generation | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Dec 2025 |
| 29 | OpenAI · General purpose | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Nov 2025 |
| 30 | OpenAI · General purpose | 93 | — | — | $1.25 / $10 | 128K | 16.5 | Nov 2025 |
| 31 | OpenAI · Code generation | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Nov 2025 |
| 32 | OpenAI · Hard reasoning | 93 | — | — | $150 / $600 | 200K | 0.2 | Mar 2025 |
| 33 | OpenAI · Complex analysis | 93 | — | — | $30 / $60 | 8K | 2.1 | May 2023 |
| 34 | OpenAI · Multimodal | 93 | — | — | $30 / $60 | 8K | 2.1 | May 2023 |
| 35 | xAI · General purpose | 93 | 1496 | — | $1.25 / $2.5 | 2M | 49.6 | Mar 2026 |
| 36 | OpenAI · Complex analysis | 93 | — | — | $8 / $15 | 272K | 8.1 | Apr 2026 |
| 37 | Moonshot AI · Frontier quality at low cost | 92 | 1466 | 48 t/s | $0.73 / $3.49 | 256K | 43.6 | Apr 2026 |
| 38 | Google · Multimodal + value | 92 | 1345 | 87 t/s | $1.25 / $10 | 1M | 16.4 | Mar 2025 |
| 39 | Anthropic · Complex analysis | 91 | 1360 | 52 t/s | $15 / $75 | 200K | 2.0 | May 2025 |
| 40 | · Hard reasoning | 91 | — | — | $0.3 / $1.1 | 164K | 130.0 | Jul 2025 |
| 41 | Google · Speed & cost | 91 | — | — | $1.25 / $10 | 1M | 16.2 | Jun 2025 |
| 42 | DeepSeek · Hard reasoning | 91 | — | — | $0.5 / $2.15 | 164K | 68.7 | May 2025 |
| 43 | Google · Speed & cost | 91 | — | — | $1.25 / $10 | 1M | 16.2 | May 2025 |
| 44 | DeepSeek · Hard reasoning | 91 | — | — | $0.29 / $0.29 | 33K | 313.8 | Jan 2025 |
| 45 | DeepSeek · Hard reasoning | 91 | — | — | $0.7 / $0.8 | 131K | 121.3 | Jan 2025 |
| 46 | DeepSeek: R1 (OSS) · DeepSeek · Hard reasoning | 91 | — | — | $0.7 / $2.5 | 64K | 56.9 | Jan 2025 |
| 47 | DeepSeek · Open-source value leader | 90 | 1467 | 33 t/s | $1.74 / $3.48 | 1M | 34.5 | Apr 2026 |
| 48 | Anthropic · Coding & balance | 90 | 1467 | 73 t/s | $3 / $15 | 1M | 10.0 | Feb 2026 |
| 49 | OpenAI · General purpose | 90 | 1455 | — | $1.25 / $10 | 400K | 16.0 | Aug 2025 |
| 50 | xAI · General purpose | 90 | — | — | $3 / $15 | 131K | 10.0 | Apr 2025 |
| 51 | Alibaba Cloud · Open-source | 90 | — | — | $1.04 / $6.24 | 262K | 24.7 | Apr 2026 |
| 52 | OpenAI · Long context | 89 | 1310 | 120 t/s | $2 / $8 | 1M | 17.8 | Apr 2025 |
| 53 | Moonshot AI · Speed & cost | 89 | 1452 | — | $0.4 / $1.9 | 262K | 77.4 | Jan 2026 |
| 54 | · Open-weight agentic coding | 89 | 1455 | 80 t/s | $0.6 / $2.4 | 1M | 59.3 | Jun 2026 |
| 55 | · Open-weight agentic & tool use | 88 | 1467 | 48 t/s | $0.98 / $3.08 | 200K | 43.3 | Apr 2026 |
| 56 | OpenAI · Multimodal | 88 | — | — | $10 / $10 | 400K | 8.8 | Oct 2025 |
| 57 | OpenAI · Complex analysis | 88 | — | — | $15 / $120 | 400K | 1.3 | Oct 2025 |
| 58 | Anthropic · General purpose | 88 | — | — | $3 / $15 | 1M | 9.8 | Sep 2025 |
| 59 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Aug 2025 |
| 60 | OpenAI · Search + citations | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Mar 2025 |
| 61 | OpenAI · Hard reasoning | 88 | — | — | $15 / $60 | 200K | 2.3 | Dec 2024 |
| 62 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Nov 2024 |
| 63 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | May 2024 |
| 64 | OpenAI · Multimodal | 88 | — | — | $6 / $18 | 128K | 7.3 | May 2024 |
| 65 | OpenAI · General purpose | 88 | — | — | $5 / $15 | 128K | 8.8 | May 2024 |
| 66 | OpenAI · Multimodal | 88 | — | — | $10 / $30 | 128K | 4.4 | Apr 2024 |
| 67 | OpenAI · Complex analysis | 88 | — | — | $10 / $30 | 128K | 4.4 | Jan 2024 |
| 68 | OpenAI · Multimodal | 88 | — | — | $10 / $30 | 128K | 4.4 | Nov 2023 |
| 69 | Z.ai: GLM 5 (OSS) · Open-source | 88 | 1450 | — | $0.6 / $1.92 | 80K | 69.8 | Feb 2026 |
| 70 | Anthropic · Coding & balance | 88 | 1320 | 95 t/s | $3 / $15 | 200K | 9.8 | May 2025 |
| 71 | OpenAI · Reasoning & math | 88 | 1305 | 155 t/s | $1.1 / $4.4 | 200K | 32.0 | Jan 2025 |
| 72 | xAI · Real-time info | 87 | 1330 | 82 t/s | $3 / $15 | 131K | 9.7 | Feb 2025 |
| 73 | DeepSeek · Open-source | 87 | 1455 | — | $0.252 / $0.378 | 164K | 276.2 | Dec 2025 |
| 74 | · Open-source | 86 | — | — | $0.135 / $0.5 | 131K | 270.9 | Dec 2025 |
| 75 | DeepSeek · Open-source | 86 | — | — | $0.287 / $0.431 | 164K | 239.6 | Dec 2025 |
| 76 | DeepSeek · Open-source | 86 | — | — | $0.27 / $0.41 | 164K | 252.9 | Sep 2025 |
| 77 | DeepSeek · Open-source | 86 | — | — | $0.27 / $0.95 | 164K | 141.0 | Sep 2025 |
| 78 | DeepSeek · Open-source | 86 | — | — | $0.21 / $0.79 | 33K | 172.0 | Aug 2025 |
| 79 | DeepSeek · Open-source | 86 | — | — | $0.2 / $0.77 | 164K | 177.3 | Mar 2025 |
| 80 | Anthropic · General purpose | 86 | — | — | $3 / $15 | 200K | 9.6 | Feb 2025 |
| 81 | Anthropic · Hard reasoning | 86 | — | — | $3 / $15 | 200K | 9.6 | Feb 2025 |
| 82 | DeepSeek · Best open-source value | 86 | 1310 | 62 t/s | $0.27 / $1.1 | 128K | 125.5 | Mar 2025 |
| 83 | Alibaba Cloud · Multilingual & APAC | 86 | 1448 | 124 t/s | $1.4 / $5.6 | 256K | 24.6 | Apr 2026 |
| 84 | OpenAI · General purpose | 85 | 1285 | 109 t/s | $2.5 / $10 | 128K | 13.6 | May 2024 |
| 85 | Mistral AI · Open-source | 85 | — | — | $0.5 / $1.5 | 262K | 85.0 | Dec 2025 |
| 86 | Mistral AI · Open-source | 85 | — | — | $2 / $6 | 131K | 21.3 | Nov 2024 |
| 87 | Mistral AI · Open-source | 85 | — | — | $2 / $6 | 128K | 21.3 | Feb 2024 |
| 88 | Google · Speed & cost | 84 | — | — | $1.5 / $9 | 1M | 16.0 | May 2026 |
| 89 | OpenAI · Speed & cost | 83 | — | — | $0.75 / $4.5 | 400K | 31.6 | Mar 2026 |
| 90 | OpenAI · Speed & cost | 83 | — | — | $0.25 / $2 | 400K | 73.8 | Aug 2025 |
| 91 | Alibaba Cloud · Open-source | 82 | — | — | $0.04 / $0.15 | 256K | 863.2 | Mar 2026 |
| 92 | Alibaba Cloud · Open-source | 82 | — | — | $0.139 / $1 | 262K | 144.0 | Feb 2026 |
| 93 | Alibaba Cloud · Open-source | 82 | — | — | $0.195 / $1.56 | 262K | 93.4 | Feb 2026 |
| 94 | Alibaba Cloud · Open-source | 82 | — | — | $0.26 / $2.08 | 262K | 70.1 | Feb 2026 |
| 95 | Alibaba Cloud · Speed & cost | 82 | — | — | $0.065 / $0.26 | 1M | 504.6 | Feb 2026 |
| 96 | Alibaba Cloud · Open-source | 82 | — | — | $0.26 / $1.56 | 1M | 90.1 | Feb 2026 |
| 97 | Alibaba Cloud · Open-source | 82 | — | — | $0.39 / $2.34 | 262K | 60.1 | Feb 2026 |
| 98 | Alibaba Cloud · Hard reasoning | 82 | — | — | $0.78 / $3.9 | 262K | 35.0 | Feb 2026 |
| 99 | Alibaba Cloud · Code generation | 82 | — | — | $0.11 / $0.8 | 262K | 180.2 | Feb 2026 |
| 100 | Alibaba Cloud · Open-source | 82 | — | — | $0.104 / $0.416 | 131K | 315.4 | Oct 2025 |
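The page does not define the Value column, but the rows above are consistent with the quality score divided by the mean of the two listed prices, rounded to one decimal. The sketch below checks that reading against a few rows copied from the table. The formula is inferred rather than published, and the function name `value_score` is hypothetical.

```python
def value_score(quality: float, price_in: float, price_out: float) -> float:
    """Quality per dollar using the mean of the two listed prices (inferred, not official)."""
    blended_price = (price_in + price_out) / 2
    return round(quality / blended_price, 1)

# Rows copied from the table above: (rank, quality, first price, second price, listed value)
sample_rows = [
    (1, 100, 10.0, 50.0, 3.3),
    (2, 99, 5.0, 25.0, 6.6),
    (21, 93, 1.25, 2.5, 49.6),
    (44, 91, 0.29, 0.29, 313.8),
]

for rank, quality, p_in, p_out, listed in sample_rows:
    computed = value_score(quality, p_in, p_out)
    print(f"rank {rank}: computed {computed}, listed {listed}")
```

Because the denominator is a simple mean, a workload that is heavy on output tokens will see a lower effective value than the column suggests for models with a large gap between the two prices.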
How the LLM leaderboard works
We pull official provider pricing every 24 hours, Artificial Analysis benchmark snapshots weekly, and LMSys Arena Elo whenever new ratings are published. The composite quality index is a 0-100 normalization over MMLU Pro, HumanEval, and MATH, weighted by recency and cross-validated against Arena Elo. We do not accept vendor-supplied numbers without an independent reference.
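The exact weights and normalization are not spelled out here, so the following is only a sketch of one plausible reading: take a weighted mean of the benchmark percentages, then rescale so the best model scores 100. The model names, raw scores, and equal weights are hypothetical placeholders. The recency weighting and the Arena cross-validation are omitted.

```python
from typing import Dict

def composite_index(scores: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted mean of benchmark percentages, rescaled so the top model gets 100."""
    total_weight = sum(weights.values())
    raw = {
        model: sum(weights[bench] * bench_scores[bench] for bench in weights) / total_weight
        for model, bench_scores in scores.items()
    }
    best = max(raw.values())
    return {model: round(100 * value / best, 1) for model, value in raw.items()}

# Hypothetical models and scores, for illustration only.
example_scores = {
    "model-a": {"mmlu_pro": 90.0, "humaneval": 92.0, "math": 88.0},
    "model-b": {"mmlu_pro": 81.0, "humaneval": 85.0, "math": 77.0},
}
equal_weights = {"mmlu_pro": 1.0, "humaneval": 1.0, "math": 1.0}

print(composite_index(example_scores, equal_weights))
```

Rescaling to the top model keeps the index readable, but it also means every score shifts whenever a new leader appears, so snapshots from different months are not directly comparable.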
Where the leaderboard is wrong
No leaderboard predicts your production accuracy. LMSys Arena rewards style and short-conversation polish, so a top-Arena model can still underperform on your specific function-calling schema or long-context retrieval workload. Build an internal eval harness before you commit. See our "LMArena Elo explained" and "LLM routing" writeups for a deeper dive.
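As a starting point for the internal eval harness suggested above, here is a minimal sketch. It runs any callable model over a list of prompt and expected-answer pairs and reports exact-match accuracy. `EvalCase`, `run_eval`, and the stub model are hypothetical names. A real harness would call your provider's client and use a scoring rule suited to your task, such as schema validation for function calls.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    prompt: str
    expected: str

def run_eval(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose output matches the expected text exactly."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(
        1 for case in cases
        if model(case.prompt).strip() == case.expected.strip()
    )
    return passed / len(cases)

def stub_model(prompt: str) -> str:
    # Stand-in for a real API call; it only knows one answer.
    return "4" if prompt == "2 + 2 = ?" else "unknown"

cases = [
    EvalCase("2 + 2 = ?", "4"),
    EvalCase("Capital of France?", "Paris"),
]
print(f"accuracy: {run_eval(stub_model, cases):.2f}")
```

Passing the model in as a plain callable keeps the harness independent of any one vendor, which makes it easier to rerun the same cases when you compare or switch providers.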