Not for you, if.
Five clean disqualifications · No discovery-call ceremony required
Tessera is a small practice. The Performance Fee model only works when the engagement is sized correctly on both sides. Honest disqualification up front saves your time and ours. If any of the following describes your operation, the most useful thing we can do is point you somewhere sharper.
Your monthly LLM inference spend is below five hundred dollars.
With pricing v3.4 (prepaid balance, $100 minimum top-up, no floor), there is technically no spend threshold — anyone can fund a balance and route traffic through Tessera. The 25% × zero-savings = zero-fee math means there's no downside for either side.
But at very low spend the absolute dollars we can extract are small, and you'll move the cost line faster yourself with vendor-side prompt caching plus a couple of structured-output migrations than by adding a routing layer. If spending is climbing fast, apply anyway — the universal 7-day free trial covers your first week regardless of spend bracket.
Tools to look at first: provider documentation on prompt caching (Anthropic, OpenAI, Google), the free tier of an observability gateway (Helicone, Langfuse, Portkey), and a one-day internal audit of which prompts are unnecessarily long.
You're already inside a vendor enterprise team.
If you have an active named-account relationship with the OpenAI or Anthropic enterprise team — quarterly business reviews, dedicated SE, custom pricing — that team is already doing the structural work of an external advisor. Their incentives are not perfectly aligned with yours, but the data access and engineering depth they bring is something we cannot replicate. Use them first.
Where Tessera adds value alongside an enterprise team is on the second-order questions: model selection across multiple providers in the same workload, gateway architecture, prompt-cost decomposition by feature. If those are your unsolved problems and the vendor team has plateaued, we can talk.
You want a one-time audit and then to be left alone.
Tessera's economic model is the Optimize Layer running continuously in your request path, measuring savings month over month. There is no one-time-audit deliverable. The proxy only earns its keep when you keep traffic flowing through it.
For a one-time AI-spend audit you want an independent consultancy on a fixed-fee engagement. The output of that engagement is a deck and a list of recommendations; what happens after is your problem. That is a legitimate product shape — it is just not ours.
All your workloads carry regulated data (HIPAA, PCI-DSS, SOC 2 in-scope).
The Tessera compliance gate is code-level: workloads tagged regulated never get auto-routed, auto-cached, auto-compressed, or auto-batched. We still measure and report on them, and the proxy still produces a Monthly Reading — but the four cost moves stay dormant. If one hundred percent of your traffic is regulated, the Optimize Layer's economic surface area is zero.
The hybrid case — some regulated, some not — usually works. The non-regulated portion carries the savings; the regulated portion stays passive. If you're not sure whether your stack falls inside that boundary, sign up anyway and route a small slice through Tessera. If the proxy measures zero savings on the non-regulated portion, no fee accrues against your balance.
Your AI workloads are stable and already heavily optimised.
If your team has already shipped prompt caching, routing across model tiers, request batching, and a cost-attribution dashboard on every workload — and the latest gain you measured was below five per cent — Tessera is unlikely to find structural savings large enough to clear the operational overhead of routing through a proxy.
The Optimize Layer is best suited to operations that have grown AI inference inside the product faster than the team has had time to engineer the cost side. If yours is already a well-engineered cost surface, an internal quarterly review will outperform an external substrate proxy.
If none of the above applies
You're probably in the cohort Tessera is built for. Two concrete shapes:
- · Series A-B AI-native SaaS, $20k-$200k/mo on LLM APIs, gross margin under pressure. CTO or Head of AI can change a base-URL in 30 minutes. No procurement, no security questionnaire, decision in 72 hours.
- · Series B-D scale-up adding AI features, $50k-$500k/mo, AI Platform Lead owns the budget. SOC 2 commitment statement unblocks light security review. Decision in 2-4 weeks.
Start with the apply form on the landing — eight short questions, response within five business days. If we are not the right fit, we will say so directly.