# Tessera

> The Optimize Layer for LLM cost. Tessera is a thin proxy that sits in your application's LLM request path. It routes to a cheaper model when quality holds, caches identical responses, compresses prompts via LLMLingua-2 where safe, and batches where eligible. Every saved dollar is measured directly from proxy logs. Performance-aligned billing: you pay only on measured savings.

## About

Tessera is an AI cost economy practice based in Tallinn, Estonia, founded in 2026 by Yevheny Panin (banker, trader). We work with B2B teams whose LLM inference costs grow faster than their revenue. Vendor-neutral: no affiliate revenue from any LLM provider, gateway, or observability platform. Legal entity details are on the [Imprint](https://tesseraai.io/imprint) page.

## Architecture (the Optimize Layer)

Tessera is added to your stack via two HTTP headers and one config-line change: point your OpenAI / Anthropic / Google client at the Tessera proxy base URL and add an API key. Before forwarding each request upstream, Tessera applies four moves:

1. **Auto-route**: when the request asks for a top-tier model and your golden-set eval confirms a cheaper model passes for that workload, Tessera routes to the cheaper model. Five percent of routed requests are canary-sampled against the original model so regressions are caught before the customer sees them.
2. **Auto-cache**: if the same combination of system prompt + user prompt + parameters has been seen within your cache TTL, Tessera returns the cached response without calling the provider. Sub-10 ms latency and 100% savings on that request.
3. **Auto-compress**: when input is large and LLMLingua-2 compression preserves quality on your eval, Tessera sends a tighter prompt upstream. 2-3× compression on retrieval-heavy workloads with negligible quality loss.
4. **Auto-batch**: when a workload is tagged batch-eligible, Tessera queues requests for up to 60 seconds, submits them as a batch, and returns the results when ready. OpenAI and Anthropic both offer a 50% batch discount.
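The exact-match cache lookup and the 5% canary sample described above can be sketched as follows. This is a minimal illustration, not Tessera's implementation: the names `cache_key`, `TTLCache`, and `is_canary`, the use of SHA-256, and the hash-bucket sampling scheme are all assumptions made for the example.

```python
import hashlib
import json
import time


def cache_key(system_prompt, user_prompt, params):
    """Exact-match key over system prompt, user prompt, and parameters.

    Assumption: params is a JSON-serialisable dict; keys are sorted so that
    parameter order does not change the key.
    """
    payload = json.dumps(
        {"system": system_prompt, "user": user_prompt, "params": params},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds (hypothetical)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: treat as a miss
            return None
        return response

    def put(self, key, response):
        self._store[key] = (self.clock(), response)


def is_canary(request_id, rate=0.05):
    """Deterministically pick about `rate` of routed requests for canary checks.

    Assumption: sampling is done by hashing a request id into [0, 1).
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate
```

Hashing the request id rather than calling a random generator keeps the canary decision reproducible, which can help when a reading has to be audited later.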

## Engagement structure — v3.5 LOCKED 2026-05-14

Two tiers + prepaid balance billing (Claude API style) + universal 7-day free trial. Pricing v3.5 retires v3.4's Founding Pilot $5,000 buffer in favour of two cleaner mechanics: a universal trial that benefits every Annual signup, and a permanent 25% rate lock reserved for the first 5 Founding Pilots regardless of future pricing changes.

- [**Annual**](https://tesseraai.io/#economics): **25% of measured savings**. Billed in real time against a **prepaid account balance** funded via Stripe Checkout (minimum $100) or invoice on request. No contract review for activation, no floor, no retainer. If balance reaches zero, the proxy auto-pauses to passthrough mode (no fees accrue while paused) — top up to resume. Quality preservation guaranteed at 0.90 by canary; three-day breach triggers auto-disable plus 10% fee credit applied to your balance.
- [**Enterprise**](https://tesseraai.io/#economics): For workloads above **$500,000 per month** in measured savings. **15% of measured savings**. Invoice billing (NET-30/45/60). Dedicated infrastructure, custom SLO, senior partner contact, custom contract.
- [**Free trial (universal)**](https://tesseraai.io/#apply): Every Annual signup gets **7 days fee-free** starting on the first proxy request. Tessera shadow-tracks savings normally during the window but accrues zero fee — you see exactly what Tessera would have charged in the dashboard before deciding to continue. After day seven the standard Annual rate (25%) applies on measured savings. Empty balance auto-pauses optimisation; traffic still passes through to your provider.
- [**Founding Pilot rate lock**](https://tesseraai.io/#apply): The first five Annual activations get their **25% rate locked in permanently**. If Tessera raises Annual pricing in the future (e.g., to 30% or 35%), the Founding Pilots stay at 25% across plan changes, cohort closure, and contract renewals. The lock survives ownership and pricing-policy revisions; it is recorded in the contract addendum at signup. Real economic value that compounds with engagement length, with no upfront credit theater.
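The tier rates and the trial window above reduce to a small piece of arithmetic. The sketch below is illustrative only: `accrue_fee` and its arguments are hypothetical names, and it assumes the trial runs for exactly seven 24-hour days from the first proxy request and that per-request savings are passed in already measured.

```python
from datetime import datetime, timedelta, timezone

ANNUAL_RATE = 0.25       # 25% of measured savings
ENTERPRISE_RATE = 0.15   # 15% of measured savings
TRIAL_LENGTH = timedelta(days=7)


def accrue_fee(savings_usd, request_time, first_request_time, tier="annual"):
    """Return (charged_fee, shadow_fee) for one request.

    shadow_fee is what the standard rate would charge; charged_fee is zero
    while an Annual account is still inside its trial window.
    """
    rate = ENTERPRISE_RATE if tier == "enterprise" else ANNUAL_RATE
    shadow_fee = savings_usd * rate
    in_trial = (
        tier == "annual"
        and request_time < first_request_time + TRIAL_LENGTH
    )
    charged_fee = 0.0 if in_trial else shadow_fee
    return charged_fee, shadow_fee


first = datetime(2026, 6, 1, tzinfo=timezone.utc)
print(accrue_fee(10.0, first + timedelta(days=3), first))  # (0.0, 2.5)
print(accrue_fee(10.0, first + timedelta(days=8), first))  # (2.5, 2.5)
```

Returning the shadow fee alongside the charged fee mirrors the trial description: the dashboard can show what would have been charged while nothing is debited.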

### Billing cadence (Claude API style)

- **Real-time fee accrual** per request via proxy logs
- **Daily roll-up debit** from balance (one transaction per day per client)
- **Email alerts** at 80% / 50% / 10% / 0% of last top-up amount — informational, not nagging
- **Monthly invoice PDF** auto-generated for tax/accounting records
- **Auto-pause at zero balance** — proxy continues passthrough; no fees accrue
- **Auto-resume on top-up** via Stripe webhook — no manual intervention
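A rough model of this cadence, assuming a single in-memory account, might look like the following. `PrepaidAccount` and its methods are hypothetical; in particular, clamping the balance at zero when a day's fees exceed what remains, and resetting alert thresholds against the newest top-up amount, are assumptions rather than documented behaviour.

```python
MIN_TOP_UP_USD = 100.0
ALERT_LEVELS = (0.80, 0.50, 0.10, 0.0)  # fractions of the last top-up amount


class PrepaidAccount:
    def __init__(self, top_up_usd):
        if top_up_usd < MIN_TOP_UP_USD:
            raise ValueError("minimum top-up is $100")
        self.last_top_up = top_up_usd
        self.balance = top_up_usd
        self.paused = False
        self.alerts_sent = set()

    def daily_debit(self, accrued_fee_usd):
        """Apply one day's rolled-up fees; return alert levels newly crossed."""
        # Assumption: the balance never goes negative.
        self.balance = max(self.balance - accrued_fee_usd, 0.0)
        fraction = self.balance / self.last_top_up
        crossed = [
            level for level in ALERT_LEVELS
            if fraction <= level and level not in self.alerts_sent
        ]
        self.alerts_sent.update(crossed)
        if self.balance == 0.0:
            self.paused = True  # proxy falls back to passthrough, no fees accrue
        return crossed

    def top_up(self, amount_usd):
        """Handle a successful payment (e.g. signalled by a webhook)."""
        if amount_usd < MIN_TOP_UP_USD:
            raise ValueError("minimum top-up is $100")
        self.balance += amount_usd
        self.last_top_up = amount_usd
        self.alerts_sent.clear()
        self.paused = False  # auto-resume
```

For example, an account funded with $100 that is debited $25, then $30, then $45 would cross the 80% level, then the 50% level, then the 10% and 0% levels together, and end up paused until the next top-up.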

## Measurement

- [**Proxy-log measurement**](https://tesseraai.io/#mechanics): Savings are measured at request granularity from Tessera's own logs, not inferred from your billing CSV after the fact. For each request that gets routed, cached, compressed, or batched, Tessera records the counterfactual provider cost and the actual incurred cost. Aggregate Ongoing Savings = sum of (counterfactual − actual) across in-scope workloads.
- [**Monthly Joint Reading**](https://tesseraai.io/#evidence): Issued at the close of each month. Per-workload baseline. Measured Ongoing Savings. Performance Fee computation trace. Compounding savings chart. Same typeset register as the contract.
- [**Sample Acme Reading**](https://ledger.tesseraai.io/r/D6H56ncusgZEbXZ4kCyyeGmLCEHs3zrY): Anonymised reading in full. The artefact a Tessera client shares with their CFO each month.
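The Aggregate Ongoing Savings formula from the proxy-log measurement item can be written out directly. The record layout below (`RequestRecord` and its fields) is an assumption for illustration; only the formula itself, the sum of counterfactual minus actual cost over in-scope workloads, comes from the description above. The dollar figures are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    workload: str
    counterfactual_usd: float  # what the original request would have cost
    actual_usd: float          # what was actually paid to the provider
    in_scope: bool = True


def ongoing_savings(records):
    """Sum of (counterfactual - actual) across in-scope records."""
    return sum(
        r.counterfactual_usd - r.actual_usd for r in records if r.in_scope
    )


def savings_by_workload(records):
    """Same sum, broken down per workload."""
    totals = defaultdict(float)
    for r in records:
        if r.in_scope:
            totals[r.workload] += r.counterfactual_usd - r.actual_usd
    return dict(totals)


records = [
    RequestRecord("support-chat", 0.030, 0.006),        # routed to a cheaper model
    RequestRecord("support-chat", 0.020, 0.000),        # served from cache
    RequestRecord("claims", 0.050, 0.050, in_scope=False),  # excluded workload
]
total = ongoing_savings(records)
print(round(total, 3))         # 0.044
print(round(total * 0.25, 3))  # 0.011, the fee at the Annual rate
```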

## Compliance + safety

- **Client pause control**: every client has an always-available kill-switch in the operator dashboard (account-wide and per-workload). When engaged, the proxy bypasses route + cache + compress + batch and forwards requests as pure passthrough. Performance Fee does not accrue on paused traffic. Pause is reversible at any time without notice. Tessera never runs in your stack outside your control.
- **Compliance gate**: workloads tagged regulated (HIPAA, PCI-DSS, SOC 2 in-scope) NEVER auto-route. Code-level gate, not policy.
- **Quality SLA**: quality_preservation ≥ 0.90 by canary. Three-day breach → auto-disable of routing + 10% fee credit.
- **Audit immutability**: every $ figure references the pricing_catalog snapshot version active when it was computed. Price changes mid-contract do not retroactively change math.
- **Data residency**: Confidential data is stored primarily inside the European Economic Area under Estonian jurisdiction. No prompt content is stored, only token counts and cost metadata. Subprocessors (OpenAI/Anthropic/Google) are governed by a GDPR Article 28 standard-form DPA.
- **SOC 2 commitment**: Tessera commits to a SOC 2 Type II audit within twelve months of the first Annual client completing a full billing cycle, covering the Optimize Layer (Worker proxy, dashboard, billing). Target attestation: Q1 2027. Tessera operates under Type II preparation controls today. Interim controls evidence is available under NDA; write to contact@tesseraai.io.
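The pause control, compliance gate, and quality SLA above amount to a few guard checks. The sketch below is one possible reading, not the actual gate: the tag spellings, the function names, and the interpretation of "three-day breach" as three consecutive daily canary scores below 0.90 are assumptions.

```python
REGULATED_TAGS = {"hipaa", "pci-dss", "soc2"}  # assumed tag spellings
QUALITY_FLOOR = 0.90
BREACH_DAYS = 3


def may_auto_route(workload_tags, paused=False):
    """Regulated or paused workloads are never auto-routed."""
    if paused:
        return False
    tags = {tag.lower() for tag in workload_tags}
    return not (tags & REGULATED_TAGS)


def sla_breached(daily_quality_scores):
    """True when the most recent BREACH_DAYS daily scores are all below the floor.

    Assumption: scores are ordered oldest to newest, one per day.
    """
    recent = daily_quality_scores[-BREACH_DAYS:]
    return len(recent) == BREACH_DAYS and all(
        score < QUALITY_FLOOR for score in recent
    )


def routing_enabled(workload_tags, daily_quality_scores, paused=False):
    """Combine the gates: routing is on only if every check passes."""
    return may_auto_route(workload_tags, paused) and not sla_breached(
        daily_quality_scores
    )
```

Keeping the regulated-tag check inside the routing function itself, rather than in configuration, is one way to make the gate a property of the code path instead of a policy that can be switched off.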

## Differentiation

- [**Why not Helicone / Portkey / Langfuse**](https://tesseraai.io/#mechanics): Helicone, Portkey, and Langfuse are observability platforms: they trace requests and show dashboards. Tessera is an optimize layer: we sit in the request path and actively route, cache, compress, and batch on every call. We bill on measured savings from our own logs, not on a SaaS seat. You can keep your existing observability tool; Tessera imports its traces.
- [**Why not a generic AI consultancy**](https://tesseraai.io/#economics): Generic advisors charge a fixed retainer regardless of outcome. Tessera bills only on measured savings, recorded by software we operate. There is no retainer.

## Trust + compliance

- [Security posture](https://tesseraai.io/security): Subprocessors, EEA-primary data residency, audit immutability, SECURITY DEFINER RLS isolation
- [Data Processing Agreement](https://tesseraai.io/legal/dpa): GDPR Article 28 standard form
- [Privacy Policy](https://tesseraai.io/privacy)
- [Terms of Service](https://tesseraai.io/terms)
- [Imprint](https://tesseraai.io/imprint): Fintechagency OÜ corporate details

## Company

- [About the practice](https://tesseraai.io/about): Founder, structure, what we don't do (no code modification, no affiliate revenue, no per-seat pricing)
- [Not for you if](https://tesseraai.io/not-for-you): Honest disqualification: when Tessera is the wrong fit
- Contact: contact@tesseraai.io
- [Live dashboard demo](https://tesseraai.io/demo): Synthetic data preview of what the Tessera dashboard shows on Day 14 of a Founding Pilot — three of seven tabs (Overview, Optimize, Audit)

## Optional

- [Changelog](https://tesseraai.io/changelog): What shipped, when
- [Status](https://tesseraai.io/status): Practice operational signals
