| Rank | Model | Provider | License | Score |
|---|---|---|---|---|
| 🥇 | GPT 5.5 (High) | OpenAI | Proprietary | 9.22% ±1.29% |
| 🥈 | Claude Opus 4.7 (Thinking) | Anthropic | Proprietary | 8.26% ±1.21% |
| 🥉 | Claude Opus 4.6 | Anthropic | Proprietary | 7.91% ±1.22% |
| 4 | GPT 5.4 (High) | OpenAI | Proprietary | 7.79% ±1.34% |
| 5 | GPT 5.5 | OpenAI | Proprietary | 7.68% ±1.29% |
| 6 | Claude Opus 4.7 | Anthropic | Proprietary | 6.48% ±1.25% |
| 7 | Claude Sonnet 4.6 | Anthropic | Proprietary | 3.37% ±1.13% |
| 8 | GLM 5.1 | Zhipu ZAI | MIT | 1.87% ±1.39% |
| 9 | DeepSeek V4 Pro | DeepSeek | MIT | 0.36% ±1.39% |
| 10 | Gemini 3.5 Flash | Google | Proprietary | 0.39% ±1.24% |
| 11 | Gemini 3.1 Pro Preview | Google | Proprietary | 0.81% ±1.13% |
| 12 | Kimi K2.6 | Moonshot AI | Modified MIT | 1.15% ±1.26% |
| 13 | DeepSeek V4 Flash | DeepSeek | MIT | 1.43% ±1.61% |
| 14 | Qwen 3.6 Plus | Alibaba | Proprietary | 4.01% ±1.43% |
| 15 | Grok Build 0.1 | xAI | Proprietary | 5.31% ±1.26% |
| 16 | Minimax M2.7 | MiniMax | Modified MIT | 8.39% ±1.24% |
| 17 | Grok 4.3 (High) | xAI | Proprietary | 9.45% ±2.22% |
| 18 | Gemini 3 Flash | Google | Proprietary | 9.47% ±1.23% |
| 19 | Gemma 4 31B | Google | Apache 2.0 | 14.89% ±2.40% |
| 20 | Grok 4.3 | xAI | Proprietary | 23.31% ±2.03% |
