LLM Leaderboard: June 2026
Large language models ranked by LMArena Elo, MMLU, HumanEval, MATH, pricing, and inference speed. Pricing is refreshed daily and benchmark snapshots weekly, using live data from official provider pricing pages, Artificial Analysis, and the Arena.
What is the top LLM on the Arena right now?
LMArena (formerly LMSYS Chatbot Arena) tracks pairwise human votes across hundreds of thousands of conversations. Our June 2026 snapshot ranks 350 language models on Arena Elo plus the standard MMLU / HumanEval / MATH benchmark suite; the table below lists the top 100 by composite quality score. The Arena re-ranks roughly weekly as votes accumulate, so the figures shown are the most recent snapshot verified against the public Arena and Artificial Analysis. A dash marks a value that is not available for that model.
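To give a rough sense of what an Elo gap between two rows means, the sketch below applies the standard Elo expected-score formula (base 10, scale 400). It assumes the Arena's published scores follow that convention and it ignores ties. The Arena's own fitting procedure is more involved, so treat the result as an approximation, not as the Arena's method. The function name `expected_win_rate` is illustrative.

```python
def expected_win_rate(elo_a: float, elo_b: float) -> float:
    """Expected share of head-to-head votes for A over B under the standard Elo curve.

    Assumption: ratings use the conventional base-10, 400-point scale; ties are ignored.
    """
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


# Two Arena Elo values taken from the snapshot table: 1510 (rank 1) and 1466 (rank 19).
p = expected_win_rate(1510, 1466)
print(f"expected win rate: {p:.3f}")  # about 0.563
```

Read this way, a 44-point gap corresponds to an expected win rate of roughly 56%, so the small Elo differences among the top rows imply close to even head-to-head preferences.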
| # | Model | Quality | Arena Elo | Speed | Price (input / output per 1M tokens) | Context | Value | Released |
|---|---|---|---|---|---|---|---|---|
| 1 | OpenAI · Reasoning at any cost | 99 | 1510 | 68 t/s | $30 / $180 | 1M | 0.9 | Apr 2026 |
| 2 | OpenAI · Frontier general purpose | 98 | 1506 | 70 t/s | $5 / $30 | 1M | 5.6 | Apr 2026 |
| 3 | Anthropic · Coding & agentic workflows | 97 | 1505 | 68 t/s | $5 / $25 | 1M | 6.5 | Apr 2026 |
| 4 | OpenAI · Complex analysis | 97 | — | — | $30 / $180 | 1M | 0.9 | Mar 2026 |
| 5 | OpenAI · Complex analysis | 97 | — | — | $21 / $168 | 400K | 1.0 | Dec 2025 |
| 6 | Anthropic · Complex analysis | 97 | — | — | $30 / $150 | 1M | 1.1 | May 2026 |
| 7 | OpenAI · Deep research | 96 | — | — | $10 / $40 | 200K | 3.8 | Oct 2025 |
| 8 | OpenAI · Deep research | 96 | — | — | $2 / $8 | 200K | 19.2 | Oct 2025 |
| 9 | OpenAI · Hard reasoning | 96 | — | — | $20 / $80 | 200K | 1.9 | Jun 2025 |
| 10 | Google · Speed & cost | 96 | 1505 | — | $2 / $12 | 1M | 13.7 | Feb 2026 |
| 11 | Google · Speed & cost | 96 | 1505 | — | $2 / $12 | 1M | 13.7 | Feb 2026 |
| 12 | Anthropic · General purpose | 95 | 1490 | — | $5 / $25 | 1M | 6.3 | Feb 2026 |
| 13 | Anthropic · General purpose | 95 | — | — | $5 / $25 | 200K | 6.3 | Nov 2025 |
| 14 | Anthropic · Complex analysis | 95 | — | — | $30 / $150 | 1M | 1.1 | Apr 2026 |
| 15 | xAI: Grok 4.3 · xAI · Agentic tasks & real-time info | 94 | 1498 | 83 t/s | $1.25 / $2.5 | 1M | 50.1 | May 2026 |
| 16 | Google · Image generation | 94 | — | — | $2 / $12 | 66K | 13.4 | Nov 2025 |
| 17 | Anthropic · Multimodal | 94 | — | — | $15 / $75 | 200K | 2.1 | Aug 2025 |
| 18 | Anthropic · Multimodal | 94 | — | — | $15 / $75 | 200K | 2.1 | May 2025 |
| 19 | Moonshot AI · Frontier quality at low cost | 93 | 1466 | 48 t/s | $0.73 / $3.49 | 256K | 44.1 | Apr 2026 |
| 20 | OpenAI · General purpose | 93 | 1495 | — | $2.5 / $15 | 1M | 10.6 | Mar 2026 |
| 21 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 128K | 11.8 | Mar 2026 |
| 22 | OpenAI · Code generation | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Feb 2026 |
| 23 | OpenAI · Code generation | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Jan 2026 |
| 24 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 128K | 11.8 | Dec 2025 |
| 25 | OpenAI · General purpose | 93 | — | — | $1.75 / $14 | 400K | 11.8 | Dec 2025 |
| 26 | OpenAI · Code generation | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Dec 2025 |
| 27 | OpenAI · General purpose | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Nov 2025 |
| 28 | OpenAI · General purpose | 93 | — | — | $1.25 / $10 | 128K | 16.5 | Nov 2025 |
| 29 | OpenAI · Code generation | 93 | — | — | $1.25 / $10 | 400K | 16.5 | Nov 2025 |
| 30 | OpenAI · Hard reasoning | 93 | — | — | $150 / $600 | 200K | 0.2 | Mar 2025 |
| 31 | OpenAI · Complex analysis | 93 | — | — | $30 / $60 | 8K | 2.1 | May 2023 |
| 32 | OpenAI · Multimodal | 93 | — | — | $30 / $60 | 8K | 2.1 | May 2023 |
| 33 | xAI · General purpose | 93 | 1496 | — | $1.25 / $2.5 | 2M | 49.6 | Mar 2026 |
| 34 | OpenAI · Complex analysis | 93 | — | — | $8 / $15 | 272K | 8.1 | Apr 2026 |
| 35 | DeepSeek · Open-source value leader | 92 | 1467 | 33 t/s | $0.435 / $0.87 | 1M | 141.0 | Apr 2026 |
| 36 | OpenAI · Hard reasoning | 92 | — | — | $2 / $8 | 200K | 18.4 | Apr 2025 |
| 37 | · Hard reasoning | 91 | — | — | $0.3 / $1.1 | 164K | 130.0 | Jul 2025 |
| 38 | Google · Speed & cost | 91 | — | — | $1.25 / $10 | 1M | 16.2 | Jun 2025 |
| 39 | Google · Speed & cost | 91 | — | — | $1.25 / $10 | 1M | 16.2 | Jun 2025 |
| 40 | DeepSeek · Hard reasoning | 91 | — | — | $0.5 / $2.15 | 164K | 68.7 | May 2025 |
| 41 | Google · Speed & cost | 91 | — | — | $1.25 / $10 | 1M | 16.2 | May 2025 |
| 42 | DeepSeek · Hard reasoning | 91 | — | — | $0.29 / $0.29 | 33K | 313.8 | Jan 2025 |
| 43 | DeepSeek · Hard reasoning | 91 | — | — | $0.7 / $0.8 | 131K | 121.3 | Jan 2025 |
| 44 | DeepSeek: R1 (OSS) · DeepSeek · Hard reasoning | 91 | — | — | $0.7 / $2.5 | 64K | 56.9 | Jan 2025 |
| 45 | Anthropic · General purpose | 91 | 1467 | — | $3 / $15 | 1M | 10.1 | Feb 2026 |
| 46 | · Open-weight agentic & tool use | 90 | 1467 | 48 t/s | $0.98 / $3.08 | 200K | 44.3 | Apr 2026 |
| 47 | OpenAI · General purpose | 90 | 1455 | — | $1.25 / $10 | 400K | 16.0 | Aug 2025 |
| 48 | xAI · General purpose | 90 | — | — | $3 / $15 | 131K | 10.0 | Jun 2025 |
| 49 | xAI · General purpose | 90 | — | — | $3 / $15 | 131K | 10.0 | Apr 2025 |
| 50 | Alibaba Cloud · Open-source | 90 | — | — | $1.04 / $6.24 | 262K | 24.7 | Apr 2026 |
| 51 | Alibaba Cloud · Open-source | 90 | — | — | $2.5 / $7.5 | 1M | 18.0 | May 2026 |
| 52 | OpenAI · General purpose | 89 | — | — | $2 / $8 | 1M | 17.8 | Apr 2025 |
| 53 | Moonshot AI · Speed & cost | 89 | 1452 | — | $0.4 / $1.9 | 262K | 77.4 | Jan 2026 |
| 54 | OpenAI · Multimodal | 88 | — | — | $10 / $10 | 400K | 8.8 | Oct 2025 |
| 55 | OpenAI · Complex analysis | 88 | — | — | $15 / $120 | 400K | 1.3 | Oct 2025 |
| 56 | Anthropic · General purpose | 88 | — | — | $3 / $15 | 1M | 9.8 | Sep 2025 |
| 57 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Aug 2025 |
| 58 | OpenAI · Search + citations | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Mar 2025 |
| 59 | OpenAI · Hard reasoning | 88 | — | — | $15 / $60 | 200K | 2.3 | Dec 2024 |
| 60 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Nov 2024 |
| 61 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | Aug 2024 |
| 62 | OpenAI · General purpose | 88 | — | — | $2.5 / $10 | 128K | 14.1 | May 2024 |
| 63 | OpenAI · Multimodal | 88 | — | — | $6 / $18 | 128K | 7.3 | May 2024 |
| 64 | OpenAI · General purpose | 88 | — | — | $5 / $15 | 128K | 8.8 | May 2024 |
| 65 | OpenAI · Multimodal | 88 | — | — | $10 / $30 | 128K | 4.4 | Apr 2024 |
| 66 | OpenAI · Complex analysis | 88 | — | — | $10 / $30 | 128K | 4.4 | Jan 2024 |
| 67 | OpenAI · Multimodal | 88 | — | — | $10 / $30 | 128K | 4.4 | Nov 2023 |
| 68 | Z.ai: GLM 5 (OSS) · Open-source | 88 | 1450 | — | $0.6 / $1.92 | 80K | 69.8 | Feb 2026 |
| 69 | DeepSeek · Open-source | 87 | 1455 | — | $0.252 / $0.378 | 164K | 276.2 | Dec 2025 |
| 70 | · Open-source | 86 | — | — | $0.135 / $0.5 | 131K | 270.9 | Dec 2025 |
| 71 | DeepSeek · Open-source | 86 | — | — | $0.287 / $0.431 | 164K | 239.6 | Dec 2025 |
| 72 | DeepSeek · Open-source | 86 | — | — | $0.27 / $0.41 | 164K | 252.9 | Sep 2025 |
| 73 | DeepSeek · Open-source | 86 | — | — | $0.27 / $0.95 | 164K | 141.0 | Sep 2025 |
| 74 | DeepSeek · Open-source | 86 | — | — | $0.21 / $0.79 | 33K | 172.0 | Aug 2025 |
| 75 | Anthropic · General purpose | 86 | — | — | $3 / $15 | 200K | 9.6 | May 2025 |
| 76 | DeepSeek · Open-source | 86 | — | — | $0.2 / $0.77 | 164K | 177.3 | Mar 2025 |
| 77 | Anthropic · General purpose | 86 | — | — | $3 / $15 | 200K | 9.6 | Feb 2025 |
| 78 | Anthropic · Hard reasoning | 86 | — | — | $3 / $15 | 200K | 9.6 | Feb 2025 |
| 79 | DeepSeek · Open-source | 86 | — | — | $0.2288 / $0.9144 | 164K | 150.5 | Dec 2024 |
| 80 | DeepSeek · Cheap-and-fast cascade tier | 85 | 1410 | 105 t/s | $0.1 / $0.2 | 1M | 566.7 | Apr 2026 |
| 81 | Mistral AI · Open-source | 85 | — | — | $0.5 / $1.5 | 262K | 85.0 | Dec 2025 |
| 82 | Mistral AI · Open-source | 85 | — | — | $2 / $6 | 131K | 21.3 | Nov 2024 |
| 83 | Mistral AI · Open-source | 85 | — | — | $2 / $6 | 131K | 21.3 | Nov 2024 |
| 84 | Mistral AI · Open-source | 85 | — | — | $2 / $6 | 128K | 21.3 | Feb 2024 |
| 85 | Cohere · Open-source | 84 | — | — | $2.5 / $10 | 128K | 13.4 | Aug 2024 |
| 86 | Google · Speed & cost | 84 | — | — | $1.5 / $9 | 1M | 16.0 | May 2026 |
| 87 | OpenAI · Speed & cost | 83 | — | — | $0.75 / $4.5 | 400K | 31.6 | Mar 2026 |
| 88 | OpenAI · Speed & cost | 83 | — | — | $0.25 / $2 | 400K | 73.8 | Aug 2025 |
| 89 | Alibaba Cloud · Open-source | 82 | — | — | $0.04 / $0.15 | 256K | 863.2 | Mar 2026 |
| 90 | Alibaba Cloud · Open-source | 82 | — | — | $0.139 / $1 | 262K | 144.0 | Feb 2026 |
| 91 | Alibaba Cloud · Open-source | 82 | — | — | $0.195 / $1.56 | 262K | 93.4 | Feb 2026 |
| 92 | Alibaba Cloud · Open-source | 82 | — | — | $0.26 / $2.08 | 262K | 70.1 | Feb 2026 |
| 93 | Alibaba Cloud · Speed & cost | 82 | — | — | $0.065 / $0.26 | 1M | 504.6 | Feb 2026 |
| 94 | Alibaba Cloud · Open-source | 82 | — | — | $0.26 / $1.56 | 1M | 90.1 | Feb 2026 |
| 95 | Alibaba Cloud · Open-source | 82 | — | — | $0.39 / $2.34 | 262K | 60.1 | Feb 2026 |
| 96 | Alibaba Cloud · Hard reasoning | 82 | — | — | $0.78 / $3.9 | 262K | 35.0 | Feb 2026 |
| 97 | Alibaba Cloud · Code generation | 82 | — | — | $0.11 / $0.8 | 262K | 180.2 | Feb 2026 |
| 98 | Alibaba Cloud · Open-source | 82 | — | — | $0.104 / $0.416 | 131K | 315.4 | Oct 2025 |
| 99 | Alibaba Cloud · Hard reasoning | 82 | — | — | $0.117 / $1.365 | 131K | 110.7 | Oct 2025 |
| 100 | Alibaba Cloud · Open-source | 82 | — | — | $0.08 / $0.5 | 131K | 282.8 | Oct 2025 |
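The Value column is not defined on this page. The listed figures are, however, consistent with dividing the Quality score by the simple average of the input and output prices, rounded to one decimal. The sketch below checks that reading against four rows from the table. The formula is an inference from the data, not a documented definition, and the function name `value_score` is illustrative.

```python
def value_score(quality: float, input_price: float, output_price: float) -> float:
    """Quality points per blended dollar.

    Assumption: "blended" means the unweighted mean of input and output price;
    this is inferred from the table, not stated by the leaderboard.
    """
    blended = (input_price + output_price) / 2.0
    return round(quality / blended, 1)


# (quality, input price, output price, Value as listed) for ranks 1, 3, 35, and 80.
rows = [
    (99, 30.0, 180.0, 0.9),
    (97, 5.0, 25.0, 6.5),
    (92, 0.435, 0.87, 141.0),
    (85, 0.1, 0.2, 566.7),
]
for quality, p_in, p_out, listed in rows:
    print(quality, p_in, p_out, value_score(quality, p_in, p_out), listed)
```

Because the average weights input and output tokens equally, the score may flatter or penalize a model for workloads that are heavily skewed toward one side; substituting your own input-to-output ratio in `blended` gives a figure closer to your actual cost.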
How the LLM leaderboard works
We pull official provider pricing every 24 hours, Artificial Analysis benchmark snapshots weekly, and LMArena Elo whenever a new ranking is published. The composite quality index is a 0 to 100 normalization over MMLU Pro, HumanEval, and MATH, weighted by recency and cross-checked against Arena Elo. We do not accept vendor-supplied numbers without an independent reference.
Where the leaderboard is wrong
No leaderboard predicts your production accuracy. LMArena rewards style and short-conversation polish; a top-Arena model can still underperform on your specific function-calling schema or long-context retrieval workload. Build an internal eval harness before you commit. See our "LMArena Elo explained" and "LLM routing" writeups for a deeper dive.
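As a starting point for such a harness, the sketch below scores a model callable against a list of prompt and check pairs and reports the pass rate. Everything in it is hypothetical: `EvalCase`, `run_eval`, and `stub_model` are illustrative names, and the stub stands in for whichever provider SDK you actually call. A real harness would add your own prompts, schema validation, and repeated runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # returns True if the model output is acceptable


def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output passes its check."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


# Hypothetical stand-in for a real API client; replace with your provider's call.
def stub_model(prompt: str) -> str:
    return '{"city": "Paris"}' if "capital of France" in prompt else "unknown"


cases = [
    EvalCase("Return JSON with the capital of France.", lambda out: '"Paris"' in out),
    EvalCase("Return JSON with the capital of Peru.", lambda out: '"Lima"' in out),
]
print(f"pass rate: {run_eval(stub_model, cases):.0%}")  # 50% for this stub
```

Keeping the model behind a plain `Callable[[str], str]` lets the same cases run against several candidates from the table, so pass rates on your own workload can be compared directly.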
Related rankings
- AI Model Leaderboard — same data, broader entry point
- Models Leaderboard
- GenAI Leaderboard
- AI Vendor Lock-in Leaderboard