FAQ

Seracade is a drop-in routing proxy for AI agent teams. These are the questions teams ask before they point a base URL at us.

What is Seracade?

A drop-in routing proxy. Seracade classifies each call by task type, measures candidate models on your own traffic, and routes every call to the lowest cost model that clears your quality bar. Down when a lower-cost model passes, up when your default is underpowered. You change one base URL and keep your own keys.

How is this different from a plain routing layer?

A routing layer sends a call to whatever model you or your config names. Seracade adds the routing decision on top: it measures candidate models on your own traffic, per task type, and picks the one that clears your quality bar. The measurement is the difference.

What if a routed model is worse than my default?

Seracade routes a task down only where your own traffic shows a candidate clears your quality bar for that task type. Until your data is ready it uses internal benchmarks at the same bar; once your traffic produces a valid sample, your measured decision takes over. If nothing clears the bar, the task stays on your default. Every decision shows its evidence, and any task pauses in one click.

Do I have to change my code?

One base URL change. Point your existing SDK (OpenAI, Anthropic, Google, or xAI) at Seracade with your own key. Seracade classifies the call, substitutes the model when a cheaper one clears your bar, and translates the response back to your SDK's shape so your parsers, tool-call handlers, and retries keep working.

Which frameworks and providers work?

CrewAI, LangGraph, AutoGen, and custom frameworks all work, whatever native client they use. Native request shapes: OpenAI, Anthropic, Google, xAI. Substitution backend: any model on OpenRouter, which spans OpenAI, Anthropic, Google, xAI, Mistral, Meta, DeepSeek, Cohere, and others. For MCP environments like Claude Code, Seracade ships a native MCP server via npx seracade.

How does pricing work?

15% of the absolute price difference on each routed call, whether it routed down for cost or up for quality, because the value is the routing decision itself. You keep 85%. Free until your cumulative routing value reaches $500, then billing begins. Bring your own keys throughout.

Will my prompts or keys train your models?

No. Your keys stay in your environment. Your prompts and completions are used only to build your own routing table. Seracade does not resell tokens and does not pool customer traffic into any shared training or scoring set.

Can I hold a minimum quality bar on sensitive tasks?

Yes, with Quality Gates. You set a rule per task type and minimum score; Seracade reports each task type's measured pass rate against the Gate with a 95% confidence interval and routes cheaper only where the evidence supports clearing it. Hold the line on legal, healthcare, or financial tasks while cost routing on summarization, extraction, and classification.

$ npx seracade setup