For neoclouds, AI clouds, and hardware providers

TrustedOS: the OS for AI clouds.

Dynamo and vLLM schedule your GPUs. TrustedOS runs your inference business — attested capacity, objective routing, metering, and high-margin models your customers can't get anywhere else.

30+ providers routed today
3 attested regions, multi-cloud
0 prompt logs, verifiable
AMD SEV-SNP: Attested confidential VMs on GCP Confidential Space.
Intel TDX: Trust domains in our live gateway regions.
NVIDIA CC: Confidential-computing path for H100/H200-class GPUs.
AWS Nitro Enclaves: Same attested code path on a second cloud.
The business problem

Sell products, not GPU-hours.

Raw compute rental is a commodity business racing to the cost floor. Industry modeling puts a 1,024-GPU cluster at a loss of roughly $330k/month at 55% utilization and a profit of roughly $340k/month at 85%. That spread, the difference between renting and selling well, is the whole margin.
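The two utilization figures above imply a break-even point somewhere between them. A minimal sketch, assuming monthly margin scales linearly with utilization between the two quoted points (that linearity is an assumption of this example, not something the modeling states):

```python
# Quoted data points: (utilization, monthly margin in USD) for a 1,024-GPU cluster.
LOW = (0.55, -330_000)
HIGH = (0.85, 340_000)


def break_even_utilization(low, high):
    """Utilization at which margin crosses zero, by linear interpolation.

    Assumes margin is a straight line between the two points (illustrative only).
    """
    (u0, m0), (u1, m1) = low, high
    slope = (m1 - m0) / (u1 - u0)  # USD of monthly margin per unit of utilization
    return u0 - m0 / slope


if __name__ == "__main__":
    u = break_even_utilization(LOW, HIGH)
    print(f"break-even utilization ~ {u:.1%}")
```

Under that assumption, break-even lands near 70% utilization, and each additional point of utilization is worth roughly $22k per month.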

Serving tokens beats renting the same GPU. Serving models nobody else has beats serving commodity open weights. TrustedOS is the layer that does both on your hardware.

Composite models are token multipliers: one customer request fans out inside the attested gateway to a panel of models, a judge, and a synthesizer — every inner call is billable inference that can land on your capacity first.


One request, many inner calls (Python)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.trustedrouter.com/v1",
    api_key="sk-tr-v1-...",
)

# Placeholder prompt; substitute your own task.
task = "Summarize the trade-offs between renting GPUs and serving tokens."

# A composite model id packages a whole graph:
# panel of open models -> judge -> synthesizer.
# Inner calls route to partner capacity first.
resp = client.chat.completions.create(
    model="trustedrouter/prometheus-1.0",
    messages=[{"role": "user", "content": task}],
)
print(resp.choices[0].message.content)

# 1 customer request above
# = N model calls on your accelerators.
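
The fan-out behind a composite model id can be pictured roughly as follows. This is a sketch only: `call_model`, the model names, the panel size, and the single judge and synthesizer steps are hypothetical stand-ins, not the gateway's actual graph or API.

```python
from dataclasses import dataclass, field


@dataclass
class CallLog:
    """Records every inner call so each one can be metered."""
    calls: list = field(default_factory=list)

    def call_model(self, model: str, prompt: str) -> str:
        # Hypothetical stub: a real gateway would run inference here.
        self.calls.append(model)
        return f"[{model}] answer to: {prompt}"


def run_composite(log: CallLog, task: str, panel: list,
                  judge: str, synthesizer: str) -> str:
    # 1. Fan the task out to every panel model.
    drafts = [log.call_model(m, task) for m in panel]
    # 2. One judge call reviews all drafts.
    verdict = log.call_model(judge, "Rank these drafts:\n" + "\n".join(drafts))
    # 3. One synthesizer call writes the final answer.
    return log.call_model(synthesizer, f"Write a final answer using:\n{verdict}")


log = CallLog()
run_composite(
    log,
    "Explain KV caching.",
    panel=["open-model-a", "open-model-b", "open-model-c"],
    judge="judge-model",
    synthesizer="synth-model",
)
print(len(log.calls))  # panel size + 2 inner calls for one customer request
```

With a three-model panel, the single customer request becomes five billable inner calls in this sketch.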
Today

Host our models.

Offer TrustedRouter composite models — Iris, Prometheus, Zeus, and custom models — to your own customers, under your brand. Prometheus 1.0 scores 69.2 on DRACO at ~$34 per run; the best single frontier model we tested scored 65.3 at ~$250.
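
One way to read the two results quoted above is as cost per score point. The sketch below uses only those figures; dollars per DRACO point is a crude normalization chosen for this example, not a metric the published evals report.

```python
# Figures quoted above: (DRACO score, approximate USD per run).
RESULTS = {
    "Prometheus 1.0": (69.2, 34.0),
    "best single frontier model tested": (65.3, 250.0),
}


def usd_per_point(score: float, cost: float) -> float:
    """Approximate dollars spent per DRACO point for one run."""
    return cost / score


for name, (score, cost) in RESULTS.items():
    print(f"{name}: ${usd_per_point(score, cost):.2f} per DRACO point")

# How many times more the frontier run costs than the composite run.
cost_ratio = (RESULTS["best single frontier model tested"][1]
              / RESULTS["Prometheus 1.0"][1])
print(f"cost ratio: {cost_ratio:.1f}x")
```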

You keep the compute margin on inner calls; we take a model royalty. Published evals, reproducible harness.

Today

Become an attested provider.

Providers are tiered by trust posture: confidential compute, zero-retention, standard. Attested capacity qualifies for trustedrouter/e2e and /zdr traffic that standard providers can never receive.

Routing is earned, not bought: we probe uptime, latency, and throughput continuously and publish what we measure.
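
Tiered eligibility plus measured-performance ranking can be sketched as below. Everything specific here is an assumption for illustration: the tier ordering, the minimum tier per route, the `Provider` fields, and the 99% uptime bar are hypothetical, not the gateway's actual policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    STANDARD = 0
    ZERO_RETENTION = 1
    CONFIDENTIAL = 2


# Assumed minimum tier per privacy route (illustrative mapping).
REQUIRED_TIER = {
    "default": Tier.STANDARD,
    "zdr": Tier.ZERO_RETENTION,
    "e2e": Tier.CONFIDENTIAL,
}


@dataclass(frozen=True)
class Provider:
    name: str
    tier: Tier
    usd_per_mtok: float  # price per million tokens
    tokens_per_s: float  # measured throughput
    latency_ms: float    # measured latency
    uptime: float        # measured availability, 0..1


# Sort keys where lower is better, so throughput is negated.
OBJECTIVES = {
    "price": lambda p: p.usd_per_mtok,
    "throughput": lambda p: -p.tokens_per_s,
    "latency": lambda p: p.latency_ms,
}


def rank(providers, route="default", objective="price", min_uptime=0.99):
    """Eligible providers, best first; later entries are the fallback order."""
    # Tier is a hard filter: ineligible providers never receive the traffic.
    eligible = [p for p in providers if p.tier >= REQUIRED_TIER[route]]
    key = OBJECTIVES[objective]
    # Providers below the uptime bar sort after healthy ones, as fallbacks.
    return sorted(eligible, key=lambda p: (p.uptime < min_uptime, key(p)))


fleet = [
    Provider("a", Tier.STANDARD, 0.20, 900, 300, 0.999),
    Provider("b", Tier.CONFIDENTIAL, 0.60, 700, 250, 0.999),
    Provider("c", Tier.CONFIDENTIAL, 0.50, 500, 400, 0.95),
    Provider("d", Tier.ZERO_RETENTION, 0.30, 800, 350, 0.995),
]
print([p.name for p in rank(fleet, route="e2e", objective="price")])
```

In this sketch the cheapest standard provider never appears on the `e2e` route regardless of price, which is the point of tiering by trust posture.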

Roadmap

Run TrustedOS.

The full attested stack on your infrastructure: self-hosted gateway and control plane, a model marketplace with owner payouts, and per-model kernel optimization (private beta).

In design with launch partners now. Labeled roadmap because it is one — nothing on this page pretends to ship before it ships.

| | Dynamo · vLLM · llm-d (keep them) | TrustedOS (the layer above) |
|---|---|---|
| Job | Schedules GPUs: batching, KV cache, disaggregation | Runs the business: routing objectives, models, metering, trust |
| Routing | KV-aware, inside one cluster | Objective-based across capacity: price, throughput, latency, privacy tier |
| Monetization | None; bring your own billing | Prepaid metering, per-key budgets, spend alerts, usage broadcast |
| Demand | None | Routed traffic plus composite models your customers can't get elsewhere |
| Trust | Not their job | Hardware attestation, fail-closed gateways, public evidence |
| Hardware | NVIDIA-first | Neutral: GPU fleets today; custom silicon via routing, onboarding, and demand |
| Relationship | You run it | Runs alongside it; TrustedOS never replaces your serving engine |
Verify the gateway (curl + jq)
# Nonce-bound attestation from the live gateway
NONCE=$(openssl rand -hex 16)
curl -s "https://api.trustedrouter.com/attestation?nonce=$NONCE" \
  | jq .

# Response includes:
#   eat_nonce     — your nonce, replay-protected
#   image_digest  — SHA-256 of the running container
#   pcrs          — platform measurements at boot
#
# Match the digest against the published build:
#   https://trust.trustedrouter.com
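
The same check can be scripted once the response is parsed. A minimal sketch, assuming the response is a flat JSON object whose `eat_nonce` and `image_digest` fields are plain strings; `check_attestation` and the placeholder digest are hypothetical, and the sketch does not validate the hardware signature chain, which a real verifier must also do.

```python
import hmac
import secrets


def check_attestation(report: dict, nonce: str, expected_digest: str) -> None:
    """Raise ValueError unless the report echoes our nonce and the published digest.

    Assumes string fields named `eat_nonce` and `image_digest` (illustrative).
    """
    got_nonce = str(report.get("eat_nonce", ""))
    got_digest = str(report.get("image_digest", ""))
    # Constant-time comparisons; a missing or mismatched field fails closed.
    if not hmac.compare_digest(got_nonce.encode(), nonce.encode()):
        raise ValueError("nonce mismatch: possible replay")
    if not hmac.compare_digest(got_digest.lower().encode(),
                               expected_digest.lower().encode()):
        raise ValueError("image digest does not match the published build")


nonce = secrets.token_hex(16)       # same shape as `openssl rand -hex 16`
published = "sha256:" + "ab" * 32   # placeholder, not a real digest
report = {"eat_nonce": nonce, "image_digest": published, "pcrs": {}}
check_attestation(report, nonce, published)  # returns None when both match
```

Raising on any mismatch mirrors the fail-closed posture: the caller has to handle the error explicitly before sending traffic.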
Why “Trusted” OS

The name is a claim we can prove.

In the security world, the “trusted OS” is the operating system inside a trusted execution environment. We picked the name on purpose: TLS terminates inside attested hardware — AMD SEV-SNP and Intel TDX today, with a confidential-computing path to H100/H200-class GPUs — and the gateway fails closed if the measurement doesn't match.

Your customers' compliance teams don't have to take your word for the stack you run. Don't trust the policy — verify the code.


Shipped vs roadmap

No vaporware.

Shipped and verifiable in source today: objective routing (sort by price, throughput, or latency, with :nitro/:floor shortcuts and uptime-aware fallbacks) · privacy-tier routing (zdr, e2e, eu) · composite and custom models · prepaid microdollar metering with per-key budget modes · BYOK with envelope encryption · usage broadcast to PostHog/OTLP · multi-region attested gateways on two clouds.
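
Prepaid microdollar metering with a per-key budget can be sketched with integer arithmetic. The `Account` class, the hard-cap behavior, and rounding costs up are assumptions for illustration; the actual budget modes are not described on this page.

```python
from dataclasses import dataclass, field

MICROS_PER_USD = 1_000_000  # 1 USD = 1,000,000 microdollars


def cost_micros(tokens: int, micros_per_mtok: int) -> int:
    """Token cost in whole microdollars, rounded up (an assumed policy)."""
    return -(-tokens * micros_per_mtok // 1_000_000)


@dataclass
class Account:
    balance_micros: int                              # prepaid credit remaining
    key_budget: dict = field(default_factory=dict)   # API key -> cap in microdollars
    key_spent: dict = field(default_factory=dict)    # API key -> spend so far

    def charge(self, key: str, amount: int) -> bool:
        """Debit one request; refuse if it would overdraw the balance or key cap."""
        spent = self.key_spent.get(key, 0)
        budget = self.key_budget.get(key)  # None means no per-key cap
        if amount > self.balance_micros:
            return False
        if budget is not None and spent + amount > budget:
            return False
        self.balance_micros -= amount
        self.key_spent[key] = spent + amount
        return True


acct = Account(balance_micros=5 * MICROS_PER_USD, key_budget={"key-a": 1_000})
request_cost = cost_micros(1_500, 600_000)   # 1,500 tokens at $0.60 per million
print(request_cost)                          # 900 microdollars
print(acct.charge("key-a", request_cost))    # True: within balance and budget
print(acct.charge("key-a", request_cost))    # False: would exceed the key budget
```

Integer microdollars avoid floating-point drift when many sub-cent charges accumulate against one balance.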

Roadmap, in design with launch partners: self-hosted TrustedOS on your infrastructure · marketplace with model-owner payouts · per-model kernel optimization (private beta — benchmarks will publish before claims do).

The control plane and gateway are source-available (BUSL-1.1): read, build, and verify the exact code — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license.

Working with us

The boring details that make this real.

How do we start? A design-partner pilot: pick one rung — host composite models, or qualify capacity for the attested tier — and we scope a proof-of-value in weeks, not quarters.

Commercial shape? A platform license on active accelerators plus a small share on marketplace-originated traffic only. Your direct traffic is yours — we take nothing on it. Model royalties apply when you resell our composite models.

White-label? Yes. Your brand, your customers, your billing relationship. We never own your customer, and there's no exclusivity in either direction.

Custom silicon? We won't pretend to write kernels for wafer-scale or dataflow architectures. For non-GPU fleets the pitch is routing, fast model onboarding, and demand — not kernels.

Who's behind this? The team building TrustedRouter — the attested LLM gateway with public status, live attestation, and published evals. Start at trust.trustedrouter.com and read the code.

Talk to us: licensing@trustedrouter.com

Questions

Isn't NVIDIA Dynamo already the 'inference OS'?

Keep Dynamo — and vLLM, SGLang, llm-d. They schedule GPUs inside your cluster: batching, KV cache, disaggregation. TrustedOS is the layer above: objective routing across capacity, composite models, metering, trust tiers, and demand. They compose; they don't compete.

We run custom silicon, not GPUs. Does this apply?

Yes — but differently. Wafer-scale and dataflow architectures have no CUDA-style kernels, so we don't pitch kernel optimization there. For non-GPU fleets TrustedOS brings objective routing, fast model onboarding, and composite-model demand that fans inner calls onto your capacity.

What's real today versus roadmap?

Shipped: objective routing (price/throughput/latency with fallbacks), privacy-tier routing (zdr/e2e/eu), composite and custom models, prepaid metering with per-key budgets, BYOK, and multi-region attested gateways on two clouds. Roadmap: self-hosted TrustedOS, marketplace payouts, and per-model kernels (private beta). The page labels each — we don't ship claims before code.

Is the code open?

Source-available under BUSL-1.1: anyone can read, build, and verify the exact code behind the attestation claims — the hash you compute is the hash the enclave reports. Production deployment runs under a commercial license from Lore Hex Corp.

How do we start?

Email licensing@trustedrouter.com. A design-partner pilot picks one rung — host composite models under your brand, or qualify capacity for the attested trust tier — and scopes a proof-of-value in weeks.
