We're hiring We're looking for PhD researchers to join the team and work on exciting frontier problems. Get in touch →

TrustedRouter blog

Engineering notes on attested AI routing, Synth evals, provider privacy, and open source model routing.

Combo models are model containers visual summary
2026-06-28

Combo models are model containers

TrustedRouter now lets one model id package a graph of models: Synth panels, advisor models, selectors, and mapreduce flows. The API call still looks like one model. Inside, the attested gateway can route work across specialized models and return one answer.

Read →
Synth beats Fable 5: introducing Iris, Zeus, and Prometheus visual summary
2026-06-24

Synth beats Fable 5: introducing Iris, Zeus, and Prometheus

Synth now ships as three named presets — Iris 1.0 (trustedrouter/iris), Prometheus 1.0 (trustedrouter/prometheus), and Zeus 1.0 (trustedrouter/zeus) — one fusion engine, three panels. On a score-vs-cost chart of DRACO deep research they trace the efficient frontier: Prometheus scores 69.2 at open-model cost, beating Fable 5 (65.3) for roughly a seventh of the price; Zeus tops out at 73.4, the state of the art; Iris is the cheapest way in at 62.6. And for code and the agents that write it, trustedrouter/synth-code is the same fusion tuned end to end — code-specific panel and synthesis prompts and a code-tuned judge.

Read →

Source: TrustedRouter-Fusion-Draco on GitHub

Self-fusion's gain lives in the synthesizer, not the judge visual summary
2026-06-24

Self-fusion's gain lives in the synthesizer, not the judge

Self-fusion gives Sonnet 4.6 +8.0 on DRACO. We took it apart: hold the ten Sonnet drafts fixed and swap only the fuser to Haiku and the gain collapses to +2.2 — the fuser is the lever, not the drafts. Split the fuser into judge and synthesizer and run the 2×2: the synthesizer carries everything, the judge is nearly free. A cheap Haiku judge feeding a Sonnet synthesizer (+9.2) matches the all-Sonnet fuser. Spend on the one synthesis call; route the rest cheap.

Read →

Source: TrustedRouter-Fusion-Draco on GitHub

Fusion works now, even with the same model: self-fusion visual summary
2026-06-23

Fusion works now, even with the same model: self-fusion

Self-fusion — running one model several times and fusing its own answers — finally pays off: Sonnet 4.6 self-fuses +8.0 on DRACO deep research (significant), while Claude Haiku 4.5 barely moves (+2.6). Fusion stayed marginal for years because the synthesizer was the bottleneck; only now are cheap models good enough to keep the one right answer. And the parallel fan-out is exactly what a multi-provider router is for.

Read →

Source: TrustedRouter-Fusion-Draco on GitHub

Synth is two jobs, and no model wins both visual summary
2026-06-19

Synth is two jobs, and no model wins both

Synthesizing a model panel into one answer is two jobs — a judge that reads the panel and a synthesizer that writes the final answer — and the best model for each is a different one. Across the strongest open models, GLM-5.2 writes the best synthesized answer but judges its own writing worst; the best open synthesizer pairs a Kimi-k2.6 judge with a GLM-5.2 synthesizer, 73.4 on DRACO, beating any single model that does both jobs.

Read →

Source: TrustedRouter Synth Draco on GitHub

Four copies of a cheap model beat Fable at 1/7 the price visual summary
2026-06-18

Four copies of a cheap model beat Fable at 1/7 the price

Run MiniMax-M3 four times on a research task and synthesize the four reports, and the answer scores 68.1 on DRACO deep research — above Anthropic's frontier Fable 5 at 65.3, for about $37 against a modeled ~$250 for one Fable 5 run. Two runs gain nothing, four clear the frontier model, ten is the ceiling at 69.4: enough independent tries manufacture the diverse error synth needs, and a cheap model is cheap to run many times.

Read →

Source: TrustedRouter Synth Draco on GitHub

The most censored Chinese model is censored at the host, not the model visual summary
2026-06-18

The most censored Chinese model is censored at the host, not the model

The most-censored model on FreedomBench is GLM served by Z.ai, which goes silent on the plain facts Beijing censors. Run the identical open weights on Cerebras — or inside Tinfoil's sealed confidential enclave, where the host provably can't touch the prompt — and the blanks come back answered: GLM-5.2 goes from 30 of 60 to a clean sweep. The censorship lives in the API endpoint, not the model.

Read →

Source: FreedomBench on GitHub

Surpassing Frontier Performance with Open Source Synth at 1/3 the price visual summary
2026-06-17

Surpassing Frontier Performance with Open Source Synth at 1/3 the price

The best open-weights synthesizer — a Kimi-k2.6 judge feeding a GLM-5.2 synthesizer — scores 73.4 on DRACO, eight points over Anthropic's closed Fable 5 at 65.3; a fully-open five-model committee still beats Fable at 69.9 — for around $80 per hundred tasks against Fable's modeled ~$250, about a third the price. No frontier API touches the stack, and the whole thing runs on your own hardware.

Read →

Source: TrustedRouter Synth Draco on GitHub

Sign in

Choose a sign in method.