| Rank | Model | Developer | License | |
|---|---|---|---|---|
| 🥇 | GPT 5.5 (High) | OpenAI | Proprietary | 9.22% ±1.29% |
| 🥈 | Claude Opus 4.7 (Thinking) | Anthropic | Proprietary | 8.26% ±1.21% |
| 🥉 | Claude Opus 4.6 | Anthropic | Proprietary | 7.91% ±1.22% |
| 4 | GPT 5.4 (High) | OpenAI | Proprietary | 7.79% ±1.34% |
| 5 | GPT 5.5 | OpenAI | Proprietary | 7.68% ±1.29% |
| 6 | Claude Opus 4.7 | Anthropic | Proprietary | 6.48% ±1.25% |
| 7 | Claude Sonnet 4.6 | Anthropic | Proprietary | 3.37% ±1.13% |
| 8 | GLM 5.1 | Zhipu ZAI | MIT | 1.87% ±1.39% |
| 9 | DeepSeek V4 Pro | DeepSeek | MIT | 0.36% ±1.39% |
| 10 | Gemini 3.5 Flash | Google | Proprietary | 0.39% ±1.24% |
| 11 | Gemini 3.1 Pro Preview | Google | Proprietary | 0.81% ±1.13% |
| 12 | Kimi K2.6 | Moonshot AI | Modified MIT | 1.15% ±1.26% |
| 13 | DeepSeek V4 Flash | DeepSeek | MIT | 1.43% ±1.61% |
| 14 | Qwen 3.6 Plus | Alibaba | Proprietary | 4.01% ±1.43% |
| 15 | Grok Build 0.1 | xAI | Proprietary | 5.31% ±1.26% |
| 16 | Minimax M2.7 | MiniMax | Modified MIT | 8.39% ±1.24% |
| 17 | Grok 4.3 (High) | xAI | Proprietary | 9.45% ±2.22% |
| 18 | Gemini 3 Flash | Google | Proprietary | 9.47% ±1.23% |
| 19 | Gemma 4 31B | Google | Apache 2.0 | 14.89% ±2.40% |
| 20 | Grok 4.3 | xAI | Proprietary | 23.31% ±2.03% |