Seracade: cut AI agent inference cost, one model per step

[ seracade ] login »

read the FAQ » or login via email »

BYOK · ONE URL CHANGE · 15% OF THE GAP · UP OR DOWN · FREE UNTIL $500 SAVED

Pricing, concretely

$20/M → $2/M

Example: an expensive step routed down to a cheaper model that still clears your quality bar.

Price difference: $18 / M
Our fee · 15% of it: $2.70 / M
You pay · model + fee: $4.70 / M
You save vs $20 / M: $15.30 / M76% lower

Same 15% either direction, including routing a weak step up. Free until your routed savings reach $500 lifetime.

FAQ↓

WHY OPTIMIZE

Your agent calls one model for every step. Most steps do not need it.

Planning, extraction, summarization, code: each step has its own quality requirement, and most clear your bar on a model cheaper than your default. That gap is the spend, and it is widening. At the top of the range the frontier models your agents lean on are not getting cheaper, a global crunch in compute and power is lifting the floor under every token, and agents multiply one task into many calls. Seracade routes every call to the lowest-cost model that still clears your quality bar, measured on your own traffic. At the top, prices stopped falling. Your bill does not have to.

FAQ

>Do I keep my own API keys?

Yes. BYOK. You bring and keep your provider keys; Seracade never holds them.

>How is quality measured?

On your own traffic, against the quality bar you choose (90% by default). The lowest cost model that clears your bar gets the call.

>What does it cost?

15% of the price difference per routed call, in either direction. Free until your cumulative routed savings cross $500 lifetime.

>Will it swap a model on my agent without asking?

No. You see each substitution before it routes, switch routing on per task type, and pause any task in one click. Nothing changes silently.

>Does it add latency?

A few milliseconds at the edge to classify the step and look up the route. Measurement and scoring run off the critical path, not on your live call.

more at /faq