Rank  Agent          Model            Date        Agent Org   Model Org  Accuracy
1     Codex          GPT-5.5          2026-04-23  OpenAI      OpenAI     82.0% ± 2.2
2     ForgeCode      GPT-5.4          2026-03-12  ForgeCode   OpenAI     81.8% ± 2.0
3     TongAgents     Gemini 3.1 Pro   2026-03-13  BIGAI       Google     80.2% ± 2.6
4     ForgeCode      Claude Opus 4.6  2026-03-12  ForgeCode   Anthropic  79.8% ± 1.6
5     SageAgent      GPT-5.3-Codex    2026-03-13  OpenSage    OpenAI     78.4% ± 2.2
6     ForgeCode      Gemini 3.1 Pro   2026-03-02  ForgeCode   Google     78.4% ± 1.8
7     Droid          GPT-5.3-Codex    2026-02-24  Factory     OpenAI     77.3% ± 2.2
8     Capy           Claude Opus 4.6  2026-03-12  Capy        Anthropic  75.3% ± 2.4
9     Simple Codex   GPT-5.3-Codex    2026-02-06  OpenAI      OpenAI     75.1% ± 2.4
10    Terminus-KIRA  Gemini 3.1 Pro   2026-02-23  KRAFTON AI  Google     74.8% ± 2.6
======================================================
Among Chinese open-source models, Kimi ranks highest, at rank 62:
62    Terminus 2     Kimi K2.5        2026-02-04  AfterQuery  Kimi       43.2% ± 2.9

