Synth beats Fable 5: introducing Iris, Zeus, and Prometheus

2026-06-24 · TrustedRouter-Fusion-Draco on GitHub

Synth — TrustedRouter's multi-model fusion, where a panel of models each answers a question, a judge weighs the answers, and a synthesizer writes the final one — now ships as three named presets. They share one fusion engine: a Kimi K2.6 judge and a GLM 5.2 synthesizer, the pairing our judge-and-synthesizer tests put on top. What changes between them is the panel. One model id each, the whole thing inside the attested gateway.

preset	model id	panel	DRACO	est. $ / 100 tasks
Iris 1.0	trustedrouter/iris	budget	62.6	~$20
Prometheus 1.0	trustedrouter/prometheus	all open-weights	69.2	~$34
Zeus 1.0	trustedrouter/zeus	commercial frontier	73.4	~$180

The three presets are the efficient frontier. Plot DRACO deep-research score against what it costs to run the whole 100-task benchmark and the three trace the upper-left edge — every standalone model, open or frontier, sits below them. Fable 5, the model OpenRouter built its best fusion on, scores 65.3 for an estimated $250 a run; Prometheus scores 69.2 for about $34. Synth beats Fable 5 by four points at roughly a seventh of the cost.

Prometheus is the one most people should reach for. Its panel is all open-weights — MiniMax M3, Kimi K2.6, GLM 5.2, Gemma 4, DeepSeek V4 Pro — so nothing in it is closed or priced like the frontier, and it still lands within four points of the best score we have ever measured while clearing every frontier solo: Opus 4.8 at 60.7, GPT-5.5 at 63.0, Fable 5 at 65.3. Near-frontier deep research at open-model cost.

Zeus is the ceiling. Put the commercial frontier on the panel, keep the same open-model judge and synthesizer, and Synth reaches 73.4 — the state of the art on DRACO, above OpenRouter's published best. It runs about five times the cost of Prometheus, so it is the preset for when the answer matters more than the bill. Iris is the cheapest way in — its panel is three open-weight models, MiniMax M3, Kimi K2.6, and DeepSeek V4 Pro, fused by the same Kimi judge and GLM synthesizer. 62.6 for about $20, above any single budget model.

Pick by id — trustedrouter/iris, trustedrouter/prometheus, or trustedrouter/zeus — on the same OpenAI-compatible API, with the panel, judge, and synthesis all running inside the attested gateway. Send the cheap prompts to Iris, the everyday hard ones to Prometheus, and the few that have to be right to Zeus.

Coding and agents get their own preset. trustedrouter/synth-code is the same fusion, tuned end to end for code. The whole harness is code-shaped: the panel runs a code-specific prompt, the judge is Kimi K2.7-code in place of the general Kimi K2.6, and the synthesizer works from a code-specific synthesis prompt — so how the drafts get written, weighed, and stitched into the final answer is all built around code instead of prose. A judge that knows code is better at telling a patch that compiles and passes from one that only reads well. Drop it into a coding agent the same way as the rest: trustedrouter/synth-code on the OpenAI-compatible API, panel, judge, and synthesis all inside the attested gateway. The presets above are graded on DRACO deep research; synth-code is the build for code and the agents that write it — for when you want several models to agree on the diff before it ships.

A note on the numbers. Scores are DRACO, graded the way the rest of this series is. The cost figures are estimates for a full 100-task run, derived from measured token usage and public per-token pricing and anchored to two we have published (an open model around $9 a run, Fable 5 around $250) — read them as order-of-magnitude, not invoices. The eval harness and per-task scores are public. Try Synth →