OpenAI compatible API. Attested gateway. Public status.
MoonshotAI: Kimi K2.6 Benchmarks
Benchmark and measurement links for MoonshotAI: Kimi K2.6, with TrustedRouter route data first.
1 URL: base_url migration
100s: models and routes
0: prompt logs by default
moonshotai/kimi-k2.6
open weights
Benchmarks
AI IQ
IQ 117
#11 public AI IQ rank for kimi-k2.6
Published benchmark scores
Benchmark scores for MoonshotAI: Kimi K2.6. Every row links to its source, and each score is attached only to the exact checkpoint it was measured on. Vendor model-card and open-leaderboard numbers are cited, not run by us. Rows marked "TrustedRouter · replays published" are our own runs of this model through the gateway, with the full per-item replay published in trustedrouter-benchmarks so anyone can re-grade them.
| Benchmark | Category | Score | Source |
|---|---|---|---|
| Aider Polyglot: 34 Exercism exercises (Python), pass@1, real unit tests (no judge) | Coding | 14.7% | TrustedRouter Benchmarks replay 2026-06-18 |
| SimpleQA Verified: 250 closed-book questions, no tools; GPT-4.1 autorater (Google's exact prompt); 32768-token budget | Factuality | 49.7% | TrustedRouter Benchmarks replay 2026-06-18 |
| MMLU-Pro: 200-question stride-sampled subset (TIGER-Lab/MMLU-Pro), 10-choice CoT, letter-match; no judge | Knowledge | 87.3% | TrustedRouter Benchmarks replay 2026-06-18 |
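The MMLU-Pro row describes a stride-sampled subset scored by letter match with no judge. As a rough illustration of what re-grading such a replay could look like, here is a minimal Python sketch. It rests on assumptions that are not confirmed by this page: the record fields `response` and `gold` are hypothetical, the answer is assumed to be phrased as "the answer is (X)", and the even-spacing rule in `stride_sample` is only one plausible reading of "stride-sampled". The actual replay schema and sampling rule in trustedrouter-benchmarks may differ.

```python
import re


def stride_sample(items, k):
    """Pick k items at evenly spaced indices (one plausible 'stride sample')."""
    n = len(items)
    if k <= 0 or k > n:
        raise ValueError("k must be between 1 and len(items)")
    return [items[(i * n) // k] for i in range(k)]


# Assumed answer format: "... the answer is (C)". Letters A-J cover 10 choices.
# The phrase is matched case-insensitively; the letter must be uppercase.
ANSWER_RE = re.compile(r"(?i:answer is)\s*\(?([A-J])\b")


def extract_letter(response):
    """Return the last stated answer letter, or None if no answer is found."""
    matches = ANSWER_RE.findall(response)
    return matches[-1] if matches else None


def letter_match_accuracy(records):
    """Fraction of records whose extracted letter equals the gold letter.

    Each record is a dict with hypothetical keys 'response' and 'gold'.
    A response with no extractable letter counts as incorrect.
    """
    if not records:
        return 0.0
    correct = sum(1 for r in records if extract_letter(r["response"]) == r["gold"])
    return correct / len(records)
```

Taking the last stated letter rather than the first is a design choice: chain-of-thought responses sometimes revise an earlier guess, and the final statement is usually the intended answer.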
TrustedRouter measurements
TrustedRouter publishes route and status measurements without storing prompt or output content. Provider latency and uptime are exposed through the model performance and uptime pages.
External benchmark references
- TrustedRouter performance page (TrustedRouter measurement)
- TrustedRouter uptime page (TrustedRouter measurement)
- AI IQ profile · IQ 117 (Independent model IQ score)
- Kimi API docs (Official model information)
- LMArena leaderboard (Independent benchmark index)
- LiveBench (Independent benchmark index)
- Artificial Analysis models (Independent benchmark index)
- HELM (Independent benchmark index)