Measured performance
Provider & model performance
Measured time-to-first-token, time-to-first-byte, throughput, and uptime for every LLM provider and model TrustedRouter routes to — continuously sampled, not vendor-claimed.
Last updated: 2026-07-03T15:03:44Z
These figures are sampled continuously from TrustedRouter's monitor regions over the 5,000-sample benchmark set.
Time-to-first-token (TTFT), time-to-first-byte (TTFB), throughput, and success rate are measured on real streaming
requests rather than taken from vendor claims. Unsupported route and probe-configuration rows are
reported separately and do not count as provider downtime. No prompt or output
content is ever stored.
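The exclusion rule above can be illustrated with a minimal sketch. It assumes each probe result carries a single outcome label. The labels `provider_error` and `probe_config_error` appear in the tables on this page, while `ok`, `unsupported_route`, the function name, and the data layout are assumptions made for illustration, not TrustedRouter's actual implementation.

```python
from collections import Counter

# Outcomes that describe a problem with the probe or route itself rather than
# the provider. The exact label names are assumptions for this sketch.
CONFIG_OUTCOMES = {"probe_config_error", "unsupported_route"}


def summarize(outcomes):
    """Return (uptime_percent, counted_samples, config_excluded) for one provider."""
    counts = Counter(outcomes)
    config_excluded = sum(counts[label] for label in CONFIG_OUTCOMES)
    counted = len(outcomes) - config_excluded
    if counted == 0:
        # Nothing left to judge the provider on once config rows are removed.
        return None, 0, config_excluded
    uptime = 100.0 * counts["ok"] / counted
    return round(uptime, 2), counted, config_excluded


# Hypothetical probe results: 20 successes, 4 provider errors, 2 config rows.
sample = ["ok"] * 20 + ["provider_error"] * 4 + ["probe_config_error"] * 2
print(summarize(sample))
```

With these made-up inputs the two configuration rows are dropped from the denominator, so uptime is 20 of 24 counted samples. Whether the published Samples column includes or excludes configuration rows is not stated on this page.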
Providers
Ranked by measured p50 time-to-first-token across all of a provider's models in the 5,000-sample benchmark set (27 providers · 709 samples).
| # | Provider | Models | p50 TTFT | Throughput | Uptime | Errors | Config excluded | Samples |
|---|---|---|---|---|---|---|---|---|
| 1 | deepinfra | 8 | 1538 ms | — | 100.00% | — | — | 25 |
| 2 | friendli | 8 | 1916 ms | — | 91.67% | provider_error 8% | — | 24 |
| 3 | parasail | 13 | 1953 ms | — | 95.00% | provider_error 5% | — | 20 |
| 4 | venice | 10 | 2179 ms | — | 96.30% | ReadTimeout 4% | — | 27 |
| 5 | kimi | 2 | 2787 ms | — | 100.00% | — | — | 32 |
| 6 | gemini | 6 | 2834 ms | — | 100.00% | — | — | 26 |
| 7 | minimax | 6 | 2904 ms | — | 96.97% | ReadTimeout 3% | — | 33 |
| 8 | siliconflow | 7 | 2935 ms | — | 91.30% | provider_error 9% | — | 23 |
| 9 | phala | 12 | 3367 ms | — | 84.21% | provider_error 16% | — | 19 |
| 10 | zai | 11 | 3562 ms | — | 100.00% | — | — | 31 |
| 11 | together | 4 | 3744 ms | — | 100.00% | — | — | 22 |
| 12 | baseten | 10 | 4099 ms | — | 100.00% | — | — | 25 |
| 13 | tinfoil | 6 | 5446 ms | — | 96.77% | ReadTimeout 3% | — | 31 |
| 14 | grok | 2 | 5728 ms | — | 100.00% | — | — | 23 |
| 15 | openai | 12 | 5830 ms | — | 100.00% | — | — | 31 |
| 16 | mistral | 8 | 5929 ms | — | 100.00% | — | — | 35 |
| 17 | nebius | 11 | 5992 ms | — | 100.00% | — | — | 18 |
| 18 | crusoe | 12 | 6129 ms | — | 83.33% | provider_error 17% | 2 probe_config_error | 24 |
| 19 | cerebras | 4 | 6282 ms | — | 100.00% | — | — | 24 |
| 20 | gmi | 4 | 7279 ms | — | 100.00% | — | — | 24 |
| 21 | fireworks | 6 | 7606 ms | — | 100.00% | — | — | 28 |
| 22 | deepseek | 2 | 9205 ms | — | 100.00% | — | — | 25 |
| 23 | xiaomi | 4 | 9871 ms | — | 42.86% | provider_error 57% | — | 21 |
| 24 | lightning | 1 | 9965 ms | — | 100.00% | — | — | 30 |
| 25 | novita | 23 | 10034 ms | — | 89.66% | provider_error 10% | — | 29 |
| 26 | wafer | 8 | 10662 ms | — | 52.00% | provider_error 48% | — | 25 |
| 27 | anthropic | 10 | 10678 ms | 69 tok/s | 67.65% | provider_error 32% | — | 34 |
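As a rough illustration of the provider ranking, the sketch below pools every successful TTFT sample for a provider across its models, takes the median as the p50, and sorts ascending. Pooling samples (rather than, say, averaging per-model medians) is an assumption, since the page only says the p50 is taken "across all of a provider's models". The function name, tuple layout, and provider names are hypothetical.

```python
from statistics import median


def rank_providers(samples):
    """Rank providers by pooled median TTFT.

    samples: iterable of (provider, model, ttft_ms), where ttft_ms is None
    for a request that failed before producing a first token.
    """
    pooled = {}
    for provider, _model, ttft_ms in samples:
        bucket = pooled.setdefault(provider, [])
        if ttft_ms is not None:
            bucket.append(ttft_ms)
    # Providers with no successful sample have no p50 and are left unranked.
    ranked = [(name, median(values)) for name, values in pooled.items() if values]
    return sorted(ranked, key=lambda item: item[1])


# Hypothetical samples for two made-up providers.
samples = [
    ("alpha", "m1", 900),
    ("alpha", "m2", 1500),
    ("alpha", "m2", 2100),
    ("beta", "m1", 700),
    ("beta", "m1", 3000),
    ("beta", "m3", None),
]
print(rank_providers(samples))
```

Here `alpha` ranks first with a median of 1500 ms over three samples, ahead of `beta` at 1850 ms over its two successful samples. The failed `beta` request affects uptime but not the latency ranking in this sketch.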
Models
Models sampled in the 5,000-sample benchmark set, fastest measured p50 TTFT first; rows with no successful TTFT measurement are listed last. Rows with few samples are marked "limited data", and their numbers will sharpen as more data arrives.
| # | Model | Provider | p50 TTFT | p95 TTFT | p50 TTFB | Throughput | Uptime | Config excluded | Samples |
|---|---|---|---|---|---|---|---|---|---|
| 1 | nvidia/nemotron-3-nano-omni-reasoning-30b-a3b limited data | crusoe | 715 ms | 10150 ms | 714 ms | — | 66.67% | — | 3 |
| 2 | mistralai/mistral-small-3.2-24b-instruct limited data | mistral | 717 ms | 3558 ms | 716 ms | — | 100.00% | — | 2 |
| 3 | google/gemma-4-31b-it limited data | deepinfra | 803 ms | 17006 ms | 803 ms | — | 100.00% | — | 2 |
| 4 | NousResearch/Hermes-4-405B limited data | nebius | 857 ms | 857 ms | 856 ms | — | 100.00% | — | 1 |
| 5 | google/gemma-4-26b-a4b-it limited data | parasail | 860 ms | 21233 ms | 860 ms | — | 100.00% | — | 2 |
| 6 | NousResearch/Hermes-4-70B limited data | nebius | 864 ms | 864 ms | 863 ms | — | 100.00% | — | 1 |
| 7 | deepseek/deepseek-v4-flash limited data | crusoe | 997 ms | 997 ms | 997 ms | — | 100.00% | — | 1 |
| 8 | qwen/qwen3-vl-235b-a22b-instruct limited data | novita | 998 ms | 998 ms | 998 ms | — | 100.00% | — | 1 |
| 9 | qwen/qwen3-vl-8b-instruct limited data | parasail | 1006 ms | 7963 ms | 1005 ms | — | 100.00% | — | 2 |
| 10 | google/gemma-3-4b-it limited data | deepinfra | 1060 ms | 17469 ms | 1059 ms | — | 100.00% | — | 6 |
| 11 | lgai-exaone/k-exaone-236b-a23b limited data | friendli | 1069 ms | 9267 ms | 1069 ms | — | 100.00% | — | 3 |
| 12 | qwen/qwen3-235b-a22b-fp8 limited data | novita | 1070 ms | 1070 ms | 1069 ms | — | 50.00% | — | 2 |
| 13 | z-ai/glm-4.7-flash limited data | venice | 1124 ms | 8940 ms | 1123 ms | — | 100.00% | — | 2 |
| 14 | z-ai/glm-4.5v limited data | zai | 1139 ms | 17360 ms | 1138 ms | — | 100.00% | — | 2 |
| 15 | meta-llama/llama-3.1-8b-instruct limited data | friendli | 1146 ms | 1146 ms | 1145 ms | — | 100.00% | — | 1 |
| 16 | deepseek/deepseek-v3.2 limited data | friendli | 1268 ms | 16387 ms | 1268 ms | — | 100.00% | — | 6 |
| 17 | inclusionai/ring-2.6-1t limited data | novita | 1283 ms | 1283 ms | 1282 ms | — | 100.00% | — | 1 |
| 18 | google/gemma-4-26b-a4b-it limited data | deepinfra | 1322 ms | 4207 ms | 1322 ms | — | 100.00% | — | 3 |
| 19 | kwaipilot/kat-coder-pro limited data | novita | 1465 ms | 1465 ms | 1464 ms | — | 100.00% | — | 1 |
| 20 | z-ai/glm-5.1 limited data | baseten | 1468 ms | 3226 ms | 1467 ms | — | 100.00% | — | 3 |
| 21 | meta-llama/llama-3.1-70b-instruct limited data | deepinfra | 1538 ms | 11846 ms | 1537 ms | — | 100.00% | — | 2 |
| 22 | z-ai/glm-4.7 limited data | baseten | 1587 ms | 14789 ms | 1587 ms | — | 100.00% | — | 4 |
| 23 | moonshotai/kimi-k2.6 limited data | fireworks | 1622 ms | 7386 ms | 1621 ms | — | 100.00% | — | 5 |
| 24 | z-ai/glm-5.1 limited data | venice | 1696 ms | 8620 ms | 1696 ms | — | 100.00% | — | 2 |
| 25 | qwen/qwen2.5-vl-72b-instruct limited data | parasail | 1714 ms | 1714 ms | 1713 ms | — | 100.00% | — | 1 |
| 26 | nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 limited data | nebius | 1783 ms | 1783 ms | 1782 ms | — | 100.00% | — | 1 |
| 27 | tencent/hunyuan-a13b-instruct limited data | siliconflow | 1825 ms | 4892 ms | 1824 ms | — | 100.00% | — | 3 |
| 28 | z-ai/glm-5.2 limited data | baseten | 1841 ms | 4255 ms | 1841 ms | — | 100.00% | — | 2 |
| 29 | nvidia/nemotron-3-super-120b-a12b limited data | crusoe | 1841 ms | 1841 ms | 1841 ms | — | 100.00% | 2 probe_config_error | 1 |
| 30 | bytedance/ui-tars-1.5-7b limited data | parasail | 1857 ms | 8426 ms | 1857 ms | — | 100.00% | — | 4 |
| 31 | openai/gpt-4.1-nano limited data | openai | 1890 ms | 17591 ms | 1889 ms | — | 100.00% | — | 4 |
| 32 | google/gemini-2.5-flash limited data | gemini | 1911 ms | 13973 ms | 1910 ms | — | 100.00% | — | 5 |
| 33 | minimax/minimax-m2.5 limited data | friendli | 1916 ms | 7643 ms | 1915 ms | — | 100.00% | — | 3 |
| 34 | arcee-ai/trinity-large-thinking limited data | parasail | 1953 ms | 1953 ms | 1952 ms | — | 100.00% | — | 1 |
| 35 | qwen/qwen3.6-27b limited data | venice | 1965 ms | 12049 ms | 1964 ms | — | 100.00% | — | 4 |
| 36 | minimax/minimax-m2.1-highspeed limited data | minimax | 1998 ms | 3621 ms | 1997 ms | — | 100.00% | — | 7 |
| 37 | deepseek/deepseek-v3.1-terminus limited data | novita | 2013 ms | 2013 ms | 2012 ms | — | 100.00% | — | 1 |
| 38 | qwen/qwen-2.5-7b-instruct limited data | together | 2021 ms | 18068 ms | 2020 ms | — | 100.00% | — | 6 |
| 39 | minimax/minimax-m2.7-highspeed limited data | minimax | 2037 ms | 14266 ms | 2036 ms | — | 100.00% | — | 5 |
| 40 | qwen/qwen3-235b-a22b-instruct-2507 limited data | novita | 2074 ms | 2074 ms | 2073 ms | — | 100.00% | — | 1 |
| 41 | z-ai/glm-5v-turbo limited data | venice | 2141 ms | 5015 ms | 2140 ms | — | 100.00% | — | 2 |
| 42 | z-ai/glm-5 limited data | venice | 2179 ms | 17012 ms | 2179 ms | — | 100.00% | — | 6 |
| 43 | z-ai/glm-5 limited data | siliconflow | 2197 ms | 13315 ms | 2197 ms | — | 100.00% | — | 5 |
| 44 | mistralai/ministral-3b-2512 limited data | mistral | 2204 ms | 13677 ms | 2203 ms | — | 100.00% | — | 4 |
| 45 | z-ai/glm-5 limited data | zai | 2206 ms | 3256 ms | 2205 ms | — | 100.00% | — | 2 |
| 46 | google/gemini-3.1-flash-lite limited data | gemini | 2207 ms | 5862 ms | 2206 ms | — | 100.00% | — | 4 |
| 47 | z-ai/glm-5.2 limited data | zai | 2240 ms | 2240 ms | 2239 ms | — | 100.00% | — | 1 |
| 48 | openai/gpt-oss-120b limited data | novita | 2291 ms | 2291 ms | 2290 ms | — | 100.00% | — | 1 |
| 49 | moonshotai/kimi-k2.6 limited data | phala | 2293 ms | 2293 ms | 2292 ms | — | 100.00% | — | 1 |
| 50 | openai/gpt-oss-120b limited data | phala | 2321 ms | 5819 ms | 2320 ms | — | 100.00% | — | 2 |
| 51 | openai/gpt-oss-120b limited data | fireworks | 2335 ms | 16454 ms | 2334 ms | — | 100.00% | — | 4 |
| 52 | z-ai/glm-4.6 limited data | zai | 2369 ms | 3962 ms | 2369 ms | — | 100.00% | — | 2 |
| 53 | z-ai/glm-5 limited data | baseten | 2388 ms | 6602 ms | 2388 ms | — | 100.00% | — | 3 |
| 54 | meta-llama/llama-3.3-70b-instruct limited data | tinfoil | 2431 ms | 10351 ms | 2430 ms | — | 100.00% | — | 4 |
| 55 | openai/gpt-4.1-mini limited data | openai | 2431 ms | 8760 ms | 2430 ms | — | 100.00% | — | 2 |
| 56 | qwen/qwen3-omni-30b-a3b-instruct limited data | novita | 2476 ms | 2476 ms | 2475 ms | — | 100.00% | — | 1 |
| 57 | z-ai/glm-4.7 limited data | venice | 2659 ms | 2659 ms | 2659 ms | — | 100.00% | — | 1 |
| 58 | z-ai/glm-5.2 limited data | phala | 2681 ms | 2681 ms | 2680 ms | — | 100.00% | — | 1 |
| 59 | moonshotai/kimi-k2.6 limited data | parasail | 2779 ms | 2779 ms | 2778 ms | — | 100.00% | — | 1 |
| 60 | moonshotai/kimi-k2.5 limited data | kimi | 2787 ms | 19478 ms | 2786 ms | — | 100.00% | — | 19 |
| 61 | mistralai/mistral-large limited data | mistral | 2810 ms | 13945 ms | 2810 ms | — | 100.00% | — | 2 |
| 62 | z-ai/glm-4.6v limited data | zai | 2830 ms | 16667 ms | 2829 ms | — | 100.00% | — | 5 |
| 63 | google/gemini-2.5-flash-lite limited data | gemini | 2834 ms | 10319 ms | 2833 ms | — | 100.00% | — | 5 |
| 64 | deepseek/deepseek-v3-0324 limited data | crusoe | 2880 ms | 2880 ms | 2879 ms | — | 100.00% | — | 1 |
| 65 | minimax/minimax-m2 limited data | minimax | 2904 ms | 11202 ms | 2903 ms | — | 100.00% | — | 6 |
| 66 | deepseek/deepseek-v4-flash limited data | siliconflow | 2935 ms | 11913 ms | 2934 ms | — | 100.00% | — | 5 |
| 67 | minimax/minimax-m2.5-highspeed limited data | minimax | 2957 ms | 10231 ms | 2957 ms | — | 100.00% | — | 2 |
| 68 | openai/gpt-oss-120b limited data | cerebras | 3063 ms | 16691 ms | 3063 ms | — | 100.00% | — | 6 |
| 69 | mistralai/mistral-small-2603 limited data | mistral | 3078 ms | 8816 ms | 3077 ms | — | 100.00% | — | 5 |
| 70 | minimax/minimax-m2.5 limited data | phala | 3101 ms | 3101 ms | 3101 ms | — | 100.00% | — | 1 |
| 71 | openai/o4-mini limited data | openai | 3154 ms | 13986 ms | 3154 ms | — | 100.00% | — | 2 |
| 72 | moonshotai/kimi-k2.7-code limited data | wafer | 3190 ms | 14425 ms | 3188 ms | — | 100.00% | — | 3 |
| 73 | anthropic/claude-opus-4.8 limited data | anthropic | 3207 ms | 11204 ms | 3206 ms | — | 100.00% | — | 2 |
| 74 | Qwen/Qwen3-Next-80B-A3B-Thinking limited data | nebius | 3346 ms | 3346 ms | 3346 ms | — | 100.00% | — | 1 |
| 75 | moonshotai/kimi-k2.5 limited data | phala | 3367 ms | 11252 ms | 3367 ms | — | 80.00% | — | 5 |
| 76 | mistralai/ministral-8b-2512 limited data | mistral | 3559 ms | 10757 ms | 3559 ms | — | 100.00% | — | 3 |
| 77 | z-ai/glm-4.7 limited data | zai | 3562 ms | 15811 ms | 3562 ms | — | 100.00% | — | 4 |
| 78 | deepseek/deepseek-v3.2 limited data | novita | 3646 ms | 3646 ms | 3646 ms | — | 100.00% | — | 1 |
| 79 | z-ai/glm-5.2 limited data | together | 3744 ms | 12489 ms | 3743 ms | — | 100.00% | — | 6 |
| 80 | meta-llama/llama-3.3-70b-instruct limited data | crusoe | 3763 ms | 7789 ms | 3763 ms | — | 100.00% | — | 4 |
| 81 | qwen/qwen3-coder-next limited data | parasail | 3788 ms | 3788 ms | 3787 ms | — | 100.00% | — | 1 |
| 82 | anthropic/claude-sonnet-4.5 limited data | anthropic | 3938 ms | 8344 ms | 3937 ms | — | 100.00% | — | 2 |
| 83 | z-ai/glm-5v-turbo limited data | zai | 4088 ms | 5448 ms | 4088 ms | — | 100.00% | — | 3 |
| 84 | Sao10K/L3-8B-Stheno-v3.2 limited data | novita | 4089 ms | 4089 ms | 4088 ms | — | 100.00% | — | 1 |
| 85 | nvidia/nvidia-nemotron-3-ultra-550b-a55b limited data | baseten | 4099 ms | 15071 ms | 4098 ms | — | 100.00% | — | 2 |
| 86 | anthropic/claude-opus-4.6 limited data | anthropic | 4100 ms | 18298 ms | 4100 ms | — | 100.00% | — | 2 |
| 87 | z-ai/glm-4.5 limited data | zai | 4142 ms | 7569 ms | 4141 ms | — | 100.00% | — | 3 |
| 88 | qwen/qwen3.7-max limited data | wafer | 4155 ms | 4155 ms | 4154 ms | — | 100.00% | — | 1 |
| 89 | meta-llama/llama-3.3-70b-instruct limited data | together | 4318 ms | 13880 ms | 4318 ms | — | 100.00% | — | 5 |
| 90 | z-ai/glm-4.6 limited data | venice | 4444 ms | 4444 ms | 4443 ms | — | 100.00% | — | 1 |
| 91 | google/gemma-4-31b-it limited data | tinfoil | 4606 ms | 12712 ms | 4606 ms | — | 85.71% | — | 7 |
| 92 | google/gemini-3-flash-preview limited data | gemini | 4633 ms | 12792 ms | 4632 ms | — | 100.00% | — | 2 |
| 93 | anthropic/claude-opus-4.5 limited data | anthropic | 4910 ms | 9412 ms | 4910 ms | — | 100.00% | — | 3 |
| 94 | openai/gpt-4o-mini limited data | openai | 5053 ms | 10008 ms | 5052 ms | — | 100.00% | — | 2 |
| 95 | deepseek-ai/DeepSeek-V4-Pro limited data | nebius | 5150 ms | 11021 ms | 5149 ms | — | 100.00% | — | 2 |
| 96 | deepseek/deepseek-v4-pro limited data | siliconflow | 5282 ms | 5282 ms | 5282 ms | — | 100.00% | — | 1 |
| 97 | openai/o3 limited data | openai | 5361 ms | 15040 ms | 5361 ms | — | 100.00% | — | 4 |
| 98 | moonshotai/kimi-k2.7-code limited data | baseten | 5384 ms | 20537 ms | 5384 ms | — | 100.00% | — | 4 |
| 99 | deepseek/deepseek-v4-pro limited data | tinfoil | 5446 ms | 11481 ms | 5446 ms | — | 100.00% | — | 5 |
| 100 | x-ai/grok-4.3 limited data | grok | 5641 ms | 14745 ms | 5639 ms | — | 100.00% | — | 11 |
| 101 | x-ai/grok-4.20 limited data | grok | 5728 ms | 16233 ms | 5727 ms | — | 100.00% | — | 12 |
| 102 | Qwen/Qwen3-30B-A3B-Instruct-2507 limited data | nebius | 5728 ms | 5728 ms | 5727 ms | — | 100.00% | — | 1 |
| 103 | openai/gpt-oss-120b limited data | parasail | 5760 ms | 6901 ms | 5759 ms | — | 100.00% | — | 3 |
| 104 | openai/o1 limited data | openai | 5830 ms | 22327 ms | 5830 ms | — | 100.00% | — | 4 |
| 105 | deepseek/deepseek-v4-flash limited data | deepseek | 5900 ms | 30253 ms | 5900 ms | — | 100.00% | — | 12 |
| 106 | z-ai/glm-5.2 limited data | gmi | 5909 ms | 16130 ms | 5909 ms | — | 100.00% | — | 8 |
| 107 | mistralai/ministral-14b-2512 limited data | mistral | 5929 ms | 14323 ms | 5928 ms | — | 100.00% | — | 10 |
| 108 | z-ai/glm-4.5-air:free limited data | zai | 5937 ms | 14366 ms | 5936 ms | — | 100.00% | — | 3 |
| 109 | openai/gpt-oss-120b limited data | nebius | 5992 ms | 9396 ms | 5992 ms | — | 100.00% | — | 4 |
| 110 | z-ai/glm-5.1 limited data | crusoe | 6129 ms | 16883 ms | 6129 ms | — | 100.00% | — | 2 |
| 111 | cerebras/zai-glm-4.7 limited data | cerebras | 6282 ms | 16249 ms | 6282 ms | — | 100.00% | — | 10 |
| 112 | google/gemma-3-12b-it limited data | deepinfra | 6404 ms | 14358 ms | 6403 ms | — | 100.00% | — | 5 |
| 113 | z-ai/glm-5.2 limited data | tinfoil | 6470 ms | 25624 ms | 6470 ms | — | 100.00% | — | 5 |
| 114 | z-ai/glm-5v-turbo limited data | siliconflow | 6740 ms | 14240 ms | 6740 ms | — | 100.00% | — | 2 |
| 115 | google/gemma-4-31b-it limited data | crusoe | 6752 ms | 10092 ms | 6752 ms | — | 100.00% | — | 2 |
| 116 | zai-org/glm-4.6v limited data | novita | 6915 ms | 6915 ms | 6914 ms | — | 100.00% | — | 1 |
| 117 | moonshotai/kimi-k2.6 limited data | together | 7079 ms | 8632 ms | 7079 ms | — | 100.00% | — | 5 |
| 118 | deepseek/deepseek-v4-pro limited data | gmi | 7279 ms | 27243 ms | 7279 ms | — | 100.00% | — | 7 |
| 119 | minimax/minimax-m3 limited data | siliconflow | 7311 ms | 16744 ms | 7310 ms | — | 60.00% | — | 5 |
| 120 | qwen/qwen3.5-397b-a17b limited data | phala | 7325 ms | 7325 ms | 7325 ms | — | 100.00% | — | 1 |
| 121 | moonshotai/kimi-k2.6 limited data | kimi | 7369 ms | 24692 ms | 7368 ms | — | 100.00% | — | 13 |
| 122 | z-ai/glm-4.7 limited data | cerebras | 7408 ms | 13328 ms | 7407 ms | — | 100.00% | — | 5 |
| 123 | google/gemma-3-27b-it limited data | nebius | 7432 ms | 21741 ms | 7431 ms | — | 100.00% | — | 3 |
| 124 | google/gemini-3.5-flash limited data | gemini | 7446 ms | 14372 ms | 7445 ms | — | 100.00% | — | 4 |
| 125 | google/gemini-3.1-flash-lite-preview limited data | gemini | 7567 ms | 17544 ms | 7567 ms | — | 100.00% | — | 6 |
| 126 | yutori/n1.5 limited data | crusoe | 7590 ms | 7590 ms | 7589 ms | — | 100.00% | — | 1 |
| 127 | z-ai/glm-5.2 limited data | fireworks | 7606 ms | 19536 ms | 7605 ms | — | 100.00% | — | 6 |
| 128 | openai/gpt-oss-120b limited data | tinfoil | 7700 ms | 16717 ms | 7699 ms | — | 100.00% | — | 6 |
| 129 | qwen/qwen3.5-27b limited data | phala | 8213 ms | 8213 ms | 8212 ms | — | 100.00% | — | 1 |
| 130 | z-ai/glm-5.2 limited data | parasail | 8376 ms | 8376 ms | 8376 ms | — | 100.00% | — | 1 |
| 131 | mistralai/mistral-nemo limited data | mistral | 8707 ms | 18132 ms | 8706 ms | — | 100.00% | — | 6 |
| 132 | minimax/minimax-m2.7 limited data | minimax | 8732 ms | 15619 ms | 8731 ms | — | 85.71% | — | 7 |
| 133 | nvidia/nemotron-120b-a12b limited data | baseten | 8751 ms | 27227 ms | 8751 ms | — | 100.00% | — | 2 |
| 134 | z-ai/glm-5.1 limited data | friendli | 8914 ms | 14707 ms | 8912 ms | — | 100.00% | — | 5 |
| 135 | moonshotai/kimi-k2.6 limited data | tinfoil | 9038 ms | 28949 ms | 9037 ms | — | 100.00% | — | 4 |
| 136 | deepseek/deepseek-v4-pro limited data | deepseek | 9205 ms | 18144 ms | 9205 ms | — | 100.00% | — | 13 |
| 137 | nvidia/nemotron-3-ultra-550b limited data | crusoe | 9222 ms | 24256 ms | 9221 ms | — | 100.00% | — | 2 |
| 138 | z-ai/glm-5.2 limited data | friendli | 9268 ms | 9268 ms | 9267 ms | — | 100.00% | — | 1 |
| 139 | xiaomi/mimo-v2.5 limited data | xiaomi | 9353 ms | 16113 ms | 9352 ms | — | 100.00% | — | 4 |
| 140 | z-ai/glm-5.1 limited data | fireworks | 9442 ms | 16683 ms | 9433 ms | — | 100.00% | — | 8 |
| 141 | cerebras/gpt-oss-120b limited data | cerebras | 9503 ms | 11306 ms | 9502 ms | — | 100.00% | — | 3 |
| 142 | deepseek/deepseek-v4-pro limited data | baseten | 9542 ms | 12429 ms | 9541 ms | — | 100.00% | — | 2 |
| 143 | zai-org/glm-4.7 limited data | novita | 9763 ms | 9763 ms | 9762 ms | — | 100.00% | — | 1 |
| 144 | google/gemma-3-27b-it limited data | deepinfra | 9797 ms | 10445 ms | 9797 ms | — | 100.00% | — | 3 |
| 145 | z-ai/glm-5.2 limited data | deepinfra | 9828 ms | 9828 ms | 9828 ms | — | 100.00% | — | 1 |
| 146 | xiaomi/mimo-v2.5-pro-ultraspeed limited data | xiaomi | 9871 ms | 27967 ms | 9871 ms | — | 100.00% | — | 5 |
| 147 | google/gemma-4-31b-it | lightning | 9965 ms | 20757 ms | 9964 ms | — | 100.00% | — | 30 |
| 148 | google/gemma-4-26b-a4b-it limited data | novita | 10034 ms | 17816 ms | 10033 ms | — | 100.00% | — | 2 |
| 149 | minimax/minimax-m2.1 limited data | novita | 10130 ms | 10130 ms | 10129 ms | — | 100.00% | — | 1 |
| 150 | moonshotai/kimi-k2.5 limited data | fireworks | 10338 ms | 10338 ms | 10338 ms | — | 100.00% | — | 1 |
| 151 | minimax/minimax-m3 limited data | wafer | 10662 ms | 16853 ms | 10661 ms | — | 100.00% | — | 3 |
| 152 | anthropic/claude-haiku-4.5 limited data | anthropic | 10678 ms | 13807 ms | 10677 ms | — | 100.00% | — | 5 |
| 153 | qwen/qwen3.6-35b-a3b limited data | wafer | 10770 ms | 15516 ms | 10769 ms | — | 100.00% | — | 6 |
| 154 | openai/gpt-5.4-mini limited data | openai | 10790 ms | 17874 ms | 10789 ms | — | 100.00% | — | 4 |
| 155 | qwen/qwen3-235b-a22b-2507 limited data | crusoe | 10810 ms | 17013 ms | 10810 ms | — | 100.00% | — | 3 |
| 156 | qwen/qwen3-235b-a22b-2507 limited data | friendli | 11105 ms | 11982 ms | 11105 ms | — | 100.00% | — | 3 |
| 157 | nvidia/nemotron-3-super-120b-a12b limited data | nebius | 11109 ms | 11109 ms | 11109 ms | — | 100.00% | — | 1 |
| 158 | minimax/minimax-m3 limited data | minimax | 11156 ms | 14917 ms | 11155 ms | — | 100.00% | — | 6 |
| 159 | Qwen/Qwen3-235B-A22B-Instruct-2507 limited data | nebius | 11391 ms | 18231 ms | 11390 ms | — | 100.00% | — | 2 |
| 160 | minimax/minimax-m2 limited data | novita | 11566 ms | 11566 ms | 11565 ms | — | 100.00% | — | 1 |
| 161 | baidu/ernie-4.5-vl-424b-a47b limited data | novita | 11748 ms | 21939 ms | 11747 ms | — | 100.00% | — | 3 |
| 162 | z-ai/glm-5-turbo limited data | venice | 12048 ms | 26285 ms | 12048 ms | — | 100.00% | — | 2 |
| 163 | z-ai/glm-4.5-air limited data | zai | 12453 ms | 22696 ms | 12453 ms | — | 100.00% | — | 5 |
| 164 | qwen/qwen3-30b-a3b-instruct-2507 limited data | phala | 12574 ms | 15671 ms | 12573 ms | — | 100.00% | — | 2 |
| 165 | qwen/qwen3.5-9b limited data | venice | 12627 ms | 12627 ms | 12626 ms | — | 100.00% | — | 1 |
| 166 | openai/gpt-5.5 limited data | openai | 12718 ms | 13556 ms | 12718 ms | — | 100.00% | — | 3 |
| 167 | thedrummer/skyfall-36b-v2 limited data | parasail | 12806 ms | 12806 ms | 12805 ms | — | 100.00% | — | 1 |
| 168 | openai/gpt-4o limited data | openai | 13035 ms | 13035 ms | 13034 ms | — | 100.00% | — | 1 |
| 169 | z-ai/glm-5.2 limited data | venice | 13061 ms | 16141 ms | 13060 ms | — | 83.33% | — | 6 |
| 170 | anthropic/claude-sonnet-4.6 limited data | anthropic | 13080 ms | 17113 ms | 13079 ms | — | 100.00% | — | 2 |
| 171 | meta-llama/llama-4-scout-17b-16e-instruct limited data | novita | 13160 ms | 13703 ms | 13159 ms | — | 100.00% | — | 3 |
| 172 | deepseek/deepseek-v4-pro limited data | fireworks | 13273 ms | 15099 ms | 13272 ms | — | 100.00% | — | 4 |
| 173 | mistralai/mistral-medium-3-5 limited data | mistral | 13450 ms | 17254 ms | 13450 ms | — | 100.00% | — | 3 |
| 174 | z-ai/glm-5.2 limited data | siliconflow | 13577 ms | 24924 ms | 13577 ms | — | 100.00% | — | 2 |
| 175 | openai/o3-mini limited data | openai | 13761 ms | 14415 ms | 13761 ms | — | 100.00% | — | 3 |
| 176 | moonshotai/kimi-k2.5 limited data | baseten | 13881 ms | 14537 ms | 13881 ms | — | 100.00% | — | 2 |
| 177 | qwen/qwen-mt-plus limited data | novita | 13900 ms | 13900 ms | 13900 ms | — | 100.00% | — | 1 |
| 178 | qwen/qwen3.5-27b limited data | deepinfra | 13981 ms | 15164 ms | 13980 ms | — | 100.00% | — | 3 |
| 179 | z-ai/glm-5.1 limited data | gmi | 14167 ms | 16611 ms | 14167 ms | — | 100.00% | — | 7 |
| 180 | anthropic/claude-opus-4.7 limited data | anthropic | 14599 ms | 18186 ms | 14598 ms | 69 tok/s | 100.00% | — | 7 |
| 181 | openai/gpt-oss-120b limited data | crusoe | 14896 ms | 14896 ms | 14895 ms | — | 100.00% | — | 1 |
| 182 | z-ai/glm-5.1 limited data | zai | 15113 ms | 15113 ms | 15113 ms | — | 100.00% | — | 1 |
| 183 | openai/gpt-oss-20b limited data | parasail | 15700 ms | 15700 ms | 15700 ms | — | 100.00% | — | 1 |
| 184 | openai/gpt-oss-20b limited data | phala | 15711 ms | 15711 ms | 15711 ms | — | 100.00% | — | 1 |
| 185 | zai-org/GLM-5.1 limited data | nebius | 16398 ms | 16398 ms | 16397 ms | — | 100.00% | — | 1 |
| 186 | mistralai/mistral-small-3.2-24b-instruct limited data | parasail | 16761 ms | 16761 ms | 16760 ms | — | 100.00% | — | 1 |
| 187 | qwen/qwen3-vl-235b-a22b-thinking limited data | novita | 16772 ms | 16772 ms | 16772 ms | — | 100.00% | — | 1 |
| 188 | z-ai/glm-5 limited data | gmi | 17286 ms | 35532 ms | 17285 ms | — | 100.00% | — | 2 |
| 189 | openai/gpt-oss-120b limited data | baseten | 18011 ms | 18011 ms | 18011 ms | — | 100.00% | — | 1 |
| 190 | qwen/qwen-2.5-7b-instruct limited data | phala | 18016 ms | 18016 ms | 18015 ms | — | 100.00% | — | 1 |
| 191 | openai/gpt-4.1 limited data | openai | 19886 ms | 19886 ms | 19886 ms | — | 100.00% | — | 1 |
| 192 | qwen/qwen3.5-122b-a10b limited data | novita | 20610 ms | 20610 ms | 20609 ms | — | 100.00% | — | 1 |
| 193 | deepseek/deepseek-v3.2 limited data | phala | 22315 ms | 22315 ms | 22315 ms | — | 100.00% | — | 1 |
| 194 | deepseek/deepseek-r1-turbo limited data | novita | 23584 ms | 23584 ms | 23583 ms | — | 100.00% | — | 1 |
| 195 | z-ai/glm-4.7-flash limited data | phala | — | — | — | — | 0.00% | — | 2 |
| 196 | anthropic/claude-opus-4 limited data | anthropic | — | — | — | — | 0.00% | — | 4 |
| 197 | meta-llama/llama-3.3-70b-instruct limited data | friendli | — | — | — | — | 0.00% | — | 2 |
| 198 | qwen/qwen3.5-397b-a17b limited data | wafer | — | — | — | — | 0.00% | — | 2 |
| 199 | openai/text-embedding-3-large limited data | openai | — | — | — | — | 100.00% | — | 1 |
| 200 | xiaomi/mimo-v2-flash limited data | xiaomi | — | — | — | — | 0.00% | — | 6 |
| 201 | anthropic/claude-opus-4.1 limited data | anthropic | — | — | — | — | 0.00% | — | 4 |
| 202 | xiaomi/mimo-v2-pro limited data | xiaomi | — | — | — | — | 0.00% | — | 6 |
| 203 | deepseek/deepseek-v4-pro limited data | wafer | — | — | — | — | 0.00% | — | 5 |
| 204 | zai-org/glm-4.5 limited data | novita | — | — | — | — | 0.00% | — | 1 |
| 205 | anthropic/claude-sonnet-4 limited data | anthropic | — | — | — | — | 0.00% | — | 3 |
| 206 | deepseek/deepseek-v4-flash limited data | wafer | — | — | — | — | 0.00% | — | 2 |
| 207 | moonshotai/kimi-k2.6 limited data | wafer | — | — | — | — | 0.00% | — | 3 |
| 208 | moonshotai/kimi-k2.6 limited data | crusoe | — | — | — | — | 0.00% | — | 3 |
| 209 | google/gemma-4-31b-it limited data | parasail | — | — | — | — | 0.00% | — | 1 |
| 210 | baidu/ernie-4.5-21B-a3b limited data | novita | — | — | — | — | 0.00% | — | 1 |
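To show why rows with very few samples deserve caution, here is a minimal sketch of per-model percentiles using the nearest-rank method, plus a low-sample flag. The percentile method and the `MIN_SAMPLES` cutoff of 20 are assumptions. The page does not state either, and the table only shows that a 19-sample row is marked while a 30-sample row is not.

```python
import math

# Hypothetical cutoff below which a row would be flagged as limited data.
MIN_SAMPLES = 20


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_model(ttft_ms, min_samples=MIN_SAMPLES):
    """Return p50, p95, and a limited-data flag for one model's TTFT samples."""
    return {
        "p50": percentile(ttft_ms, 50),
        "p95": percentile(ttft_ms, 95),
        "limited_data": len(ttft_ms) < min_samples,
    }


# A single hypothetical sample: p50 and p95 collapse to the same value.
print(summarize_model([857]))
# Two hypothetical samples: p95 is simply the slower of the two.
print(summarize_model([700, 3500]))
```

With one sample, every percentile is that sample, which is why many single-sample rows in the table show identical p50 and p95 values. With two or three samples, p95 is effectively the slowest request observed, so a large gap between p50 and p95 in a low-sample row may reflect one slow request rather than a stable tail.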