OpenAI compatible API. Attested gateway. Public status.

Venice

Venice models on TrustedRouter with prices, routes, policy notes, and source links.

Verify gateway

1 URLbase_url migration

100smodels and routes

0prompt logs by default

`venice`

Confidential

All providers

Provider	Venice
Models	12 public models
Prepaid routes	12
BYOK routes	12
Zero data retention	yes
Confidential compute	yes
Provider E2EE	yes
Policy note	Tracked as confidential — Venice documents no logging or storage of prompts/responses plus TEE-isolated, end-to-end-encrypted inference. (Caveat: requests Venice proxies to external frontier models inherit those providers' policies; TR routes Venice-native open models here.) Policy source

Measured performance

259 samples

Continuously sampled across Venice's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.

p50 TTFT	6521 ms
Throughput	—
Uptime	98.07%

Model	p50 TTFT	p50 TTFB	Throughput	Uptime	Config excluded	Samples
qwen/qwen3-235b-a22b-thinking-2507	1870 ms	1869 ms	—	100.00%	—	15
qwen/qwen3.6-27b	1965 ms	1964 ms	—	100.00%	—	24
z-ai/glm-4.7	3258 ms	3257 ms	—	100.00%	—	23
qwen/qwen3.5-9b	5669 ms	5669 ms	—	94.74%	—	19
qwen/qwen3.5-397b-a17b	5933 ms	5932 ms	—	95.00%	—	20
z-ai/glm-5	6521 ms	6520 ms	—	100.00%	—	29
z-ai/glm-4.6	6525 ms	6524 ms	—	100.00%	—	17
z-ai/glm-5.1	6838 ms	6837 ms	—	100.00%	—	21
z-ai/glm-5.2	7477 ms	7476 ms	—	96.30%	—	27
z-ai/glm-4.7-flash	7606 ms	7605 ms	—	100.00%	—	20
z-ai/glm-5v-turbo	7649 ms	7648 ms	—	100.00%	—	22
z-ai/glm-5-turbo	10239 ms	10239 ms	—	90.91%	—	22

Full provider & model leaderboard.

Provider models

Models served by Venice.

Each row links to pricing, provider, benchmark, and API pages for the model.

Model	AI IQ	Context	Endpoints	Prompt	Completion	Routes
`qwen/qwen3-235b-a22b-thinking-2507` Qwen: Qwen3 235B A22B Thinking 2507 benchmarks performance api	—	262,144	2	$0.495/1M	$3.85/1M	prepaid BYOK
`qwen/qwen3.5-397b-a17b` Qwen: Qwen3.5 397B A17B benchmarks performance api	—	262,144	2	$0.825/1M	$4.95/1M	prepaid BYOK
`qwen/qwen3.5-9b` Qwen: Qwen3.5-9B benchmarks performance api	IQ 95#61	262,144	2	$0.11/1M	$0.165/1M	prepaid BYOK
`qwen/qwen3.6-27b` Qwen: Qwen3.6 27B benchmarks performance api	IQ 104#41	262,144	2	$0.363/1M	$3.575/1M	prepaid BYOK
`z-ai/glm-4.6` Z.ai: GLM 4.6 benchmarks performance api	—	202,752	2	$0.935/1M	$3.025/1M	prepaid BYOK
`z-ai/glm-4.7` Z.ai: GLM 4.7 benchmarks performance api	IQ 102#46	202,752	2	$0.605/1M	$2.915/1M	prepaid BYOK
`z-ai/glm-4.7-flash` Z.ai: GLM 4.7 Flash benchmarks performance api	—	202,752	2	$0.143/1M	$0.55/1M	prepaid BYOK
`z-ai/glm-5` Z.ai: GLM 5 benchmarks performance api	IQ 107#34	204,800	2	$1.1/1M	$3.52/1M	prepaid BYOK
`z-ai/glm-5-turbo` Z.ai: GLM 5 Turbo benchmarks performance api	—	202,752	2	$1.32/1M	$4.4/1M	prepaid BYOK
`z-ai/glm-5.1` Z.ai: GLM 5.1 benchmarks performance api	IQ 113#19	202,752	2	$1.925/1M	$6.05/1M	prepaid BYOK
`z-ai/glm-5.2` GLM 5.2 benchmarks performance api	IQ 117#10	1,048,576	2	$1.54/1M	$4.84/1M	prepaid BYOK
`z-ai/glm-5v-turbo` Z.ai: GLM 5V Turbo benchmarks performance api	—	202,752	2	$1.65/1M	$5.5/1M	prepaid BYOK