OpenAI to Apertus Migration

Move a production OpenAI or Anthropic integration onto Apertus, the Swiss open-weights LLM, without losing quality. Integration audit, eval-set rebuild, prompt rewrite, benchmark and staged cutover — by a Swiss team.

What an Apertus migration delivers

Audit of the existing stack

We walk through the production OpenAI or Anthropic integration end to end — prompts, system messages, tool-use schemas, retry behaviour, rate-limit handling, eval sets, the calling code path and the cost line. The audit produces a portability map for every prompt and a documented baseline that the migration is later measured against.
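A portability map can be as small as one record per prompt. The sketch below is illustrative only: the category names, the fields of `PromptAudit` and the rule inside `classify` are assumptions made for the example, not the taxonomy an actual audit would use.

```python
from dataclasses import dataclass
from enum import Enum


class Portability(Enum):
    # Hypothetical categories; a real audit would define its own.
    DIRECT = "direct"      # expected to port with little or no change
    REWRITE = "rewrite"    # needs a prompt or tool-schema rewrite
    AT_RISK = "at_risk"    # leans on behaviour specific to the incumbent API


@dataclass(frozen=True)
class PromptAudit:
    prompt_id: str
    uses_tools: bool             # prompt calls tool-use schemas
    uses_vendor_features: bool   # prompt relies on incumbent-only features
    baseline_score: float        # documented quality on the incumbent, 0..1


def classify(audit: PromptAudit) -> Portability:
    """Toy rule of thumb, for illustration only."""
    if audit.uses_vendor_features:
        return Portability.AT_RISK
    if audit.uses_tools:
        return Portability.REWRITE
    return Portability.DIRECT


def portability_map(audits):
    """Map each prompt id to its portability class."""
    return {a.prompt_id: classify(a) for a in audits}


# Made-up audit rows, not real findings.
audits = [
    PromptAudit("summarise_ticket", False, False, 0.94),
    PromptAudit("book_meeting", True, False, 0.91),
    PromptAudit("parse_invoice", True, True, 0.88),
]
pmap = portability_map(audits)
```

The `baseline_score` field is where the documented baseline would travel with each prompt, so later stages can compare against it.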

Eval set as migration contract

Before a single model is swapped, the eval set is rebuilt and frozen. Real production inputs, gold-standard outputs, edge cases and regression items become the contract. A migration is two-thirds engineering and one-third evaluation, and the evaluation decides when cutover is allowed to happen.
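One way to make "frozen" checkable is to record a content digest at sign-off and verify it before every scoring run. This is a minimal sketch under assumptions: the item fields (`id`, `input`, `gold`, `kind`) and the helper names `fingerprint` and `assert_frozen` are hypothetical.

```python
import hashlib
import json


def fingerprint(eval_items):
    """Return a SHA-256 digest over a canonical JSON encoding of the eval set."""
    canonical = json.dumps(
        sorted(eval_items, key=lambda item: item["id"]),  # order-independent
        sort_keys=True,
        ensure_ascii=False,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_frozen(eval_items, signed_off_digest):
    """Raise if the eval set no longer matches the digest recorded at sign-off."""
    actual = fingerprint(eval_items)
    if actual != signed_off_digest:
        raise ValueError(f"eval set changed after sign-off: {actual}")


# Made-up items for illustration.
eval_set = [
    {"id": "case-001", "input": "Reset my password", "gold": "account_access", "kind": "production"},
    {"id": "case-002", "input": "", "gold": "reject_empty", "kind": "edge_case"},
]
frozen_digest = fingerprint(eval_set)
assert_frozen(eval_set, frozen_digest)  # passes while nothing has changed
```

Any edit to an input or a gold output changes the digest, so a silent change to the contract shows up as a failed check rather than a shifted score.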

Prompt rewrite for Apertus

Prompts written against a specific frontier model do not always survive a swap. We rewrite system messages, few-shot examples and tool schemas for Apertus, prompt by prompt, against the frozen eval set. Where a prompt cannot reach the bar by rewrite alone, we flag it for a domain fine-tune behind the new prompt.
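The triage step can be sketched as a comparison of each rewritten prompt's eval scores against the bar. The scores, the bar of 0.9 and the function name `triage` are assumptions chosen for the example; the real rubric and threshold come from the frozen eval set.

```python
def triage(scores_by_prompt, bar):
    """Split prompts into those that clear the bar and those flagged for a fine-tune.

    scores_by_prompt maps a prompt id to its per-item eval scores in [0, 1].
    """
    cleared, flagged = [], []
    for prompt_id, scores in sorted(scores_by_prompt.items()):
        if not scores:
            raise ValueError(f"no eval scores for {prompt_id}")
        mean_score = sum(scores) / len(scores)
        target = cleared if mean_score >= bar else flagged
        target.append((prompt_id, round(mean_score, 3)))
    return cleared, flagged


# Made-up scores after a rewrite, not measured results.
rewrite_scores = {
    "classify_intent": [1.0, 1.0, 0.9, 1.0],
    "extract_invoice": [0.6, 0.7, 0.8, 0.5],
}
cleared, flagged = triage(rewrite_scores, bar=0.9)
```

Here `flagged` would hold the prompts that go forward as fine-tune candidates, while `cleared` prompts are done.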

Side-by-side quality benchmark

Apertus and the incumbent run the same inputs in parallel. Scoring covers task quality against the frozen rubric, p50 and p95 latency, and cost per task on Swiss-resident hosting. The result is a published cost-per-task table that lets your team decide on real numbers, not slide-deck math or vendor brochure claims.
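As a rough illustration of how one row of such a table could be computed, the sketch below summarises a list of benchmark runs with Python's standard `statistics` module. The run fields (`score`, `latency_ms`, `cost`) and the synthetic numbers are assumptions; they are not measured results for any model.

```python
from statistics import mean, quantiles


def summarise(runs):
    """Summarise one model's benchmark runs.

    Each run is a dict with 'score' (0..1), 'latency_ms' and 'cost' (any currency).
    Needs at least two runs for the percentile calculation.
    """
    latencies = [r["latency_ms"] for r in runs]
    cuts = quantiles(latencies, n=100, method="inclusive")  # 99 percentile cut points
    return {
        "quality": mean(r["score"] for r in runs),
        "p50_ms": cuts[49],   # 50th percentile
        "p95_ms": cuts[94],   # 95th percentile
        "cost_per_task": sum(r["cost"] for r in runs) / len(runs),
    }


# Synthetic runs for illustration only.
example_runs = [
    {"score": 1.0, "latency_ms": 100 + i, "cost": 0.002} for i in range(101)
]
summary = summarise(example_runs)
```

Running `summarise` once per model on the same inputs gives two rows that can be compared directly, provided both sides were scored against the same frozen rubric.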

Shadow-traffic phase pre-launch

Once benchmarks clear the bar, Apertus runs alongside the incumbent on real production requests, with its responses scored but not served. The shadow phase exposes drift the eval set could not catch and lets us tune routing, caching and tool fallbacks against live load before a single end-user sees the new model in their UI.
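A shadow handler might look roughly like the sketch below: the incumbent's answer is returned to the caller, and the candidate is called and scored on a background thread. The callables `incumbent`, `candidate` and `score`, and the in-memory `shadow_log`, are hypothetical stand-ins for real model clients and a real metrics store.

```python
from concurrent.futures import ThreadPoolExecutor

shadow_pool = ThreadPoolExecutor(max_workers=4)
shadow_log = []  # stand-in for a metrics store


def handle_request(request, incumbent, candidate, score):
    """Serve the incumbent's answer; run and score the candidate off the response path."""
    served = incumbent(request)

    def shadow():
        try:
            shadow_answer = candidate(request)
            shadow_log.append(
                {"request": request, "score": score(request, served, shadow_answer)}
            )
        except Exception as exc:  # a shadow failure must never reach the user
            shadow_log.append({"request": request, "error": repr(exc)})

    future = shadow_pool.submit(shadow)
    return served, future  # the future is returned only so callers can wait in tests
```

The caller never sees the candidate's output, so a slow or failing candidate costs a log entry and nothing else.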

Controlled cutover with rollback

Cutover is a defined event, not a flag flip. A canary slice takes the first live traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green, and rollback to the incumbent stays one configuration step away the entire time. The freeze window and gate criteria are documented.
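The canary slice and the one-step rollback can be sketched with a single configuration value. Everything named here is an assumption for illustration: `CutoverConfig`, hash-based bucketing by request id, and the choice to fall back to the incumbent when a canary response fails the gate.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CutoverConfig:
    # 0 routes everything to the incumbent, which is the rollback state.
    canary_percent: int = 0


def in_canary(request_id: str, percent: int) -> bool:
    """Deterministically place a request in the canary slice by hashing its id."""
    bucket = int(hashlib.sha256(request_id.encode("utf-8")).hexdigest(), 16) % 100
    return bucket < percent


def route(request_id, request, config, incumbent, candidate, gate):
    """Return (answer, served_by). The gate is checked on every canary response."""
    if in_canary(request_id, config.canary_percent):
        answer = candidate(request)
        if gate(request, answer):
            return answer, "candidate"
        # Gate failed: this request falls back to the incumbent (an assumption).
    return incumbent(request), "incumbent"
```

Setting `canary_percent` back to 0 is the single configuration step that returns all traffic to the incumbent; hashing the request id keeps a given request in the same slice across retries.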

How a migration runs

We map the production OpenAI or Anthropic integration — prompts, tool schemas, retries, eval sets, regression tests and the cost line — and deliver a portability classification per prompt before any swap is scheduled.

The eval set is rebuilt as the migration contract: real production inputs, gold outputs, edge cases and regression items. It is signed off and frozen before any prompt is touched, and it decides when cutover is allowed.

System messages, few-shot examples and tool schemas are rewritten prompt by prompt against the frozen eval set. Prompts that cannot clear the bar by rewrite alone are marked for a domain fine-tune behind the new prompt.

Apertus and the incumbent score the same inputs side by side on quality, p50 and p95 latency and cost per task. The published table is the engagement deliverable; your CTO and CFO decide on real numbers from it.

Apertus runs alongside the incumbent on real requests, scored but not served. The shadow phase catches drift the eval set missed and lets us tune routing, caching and tool fallbacks before any end-user touches the new model.

A canary slice takes the first live traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green; rollback to the incumbent stays one step away through the freeze window.

Why the eval set is the contract

The eval set is the contract, not the model

Most migrations fail because the team treats the model swap as the deliverable and the evaluation as paperwork. We flip it. The frozen eval set — real production inputs, gold-standard outputs, edge cases and regression items — is the only thing that decides when cutover is allowed. Apertus, the incumbent and any future model are measured against the same rubric, signed off before a prompt is touched. The paid Apertus evaluation POC builds this contract before a full migration.

Two-thirds engineering, one-third evaluation

A model migration is two-thirds engineering and one-third evaluation. We rebuild the eval set first, freeze it as the contract, and only swap models once the regression suite says the new system meets the old one's quality bar. The split protects both sides — engineering has a real target, and evaluation has the budget to catch regression instead of rubber-stamping the swap.

Shadow, canary and full switch

Cutover is staged, not flipped. Shadow traffic runs Apertus on real requests with responses scored but not served, so we catch drift under live load. A canary slice then takes a percentage of traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green, and rollback to the incumbent is one step away the entire time.
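The staging itself can be pictured as a small state machine. In this sketch the stage names and the rule that a red gate returns the rollout to the shadow stage (where the incumbent serves all traffic) are assumptions, chosen to mirror the description above.

```python
from enum import Enum


class Stage(Enum):
    SHADOW = 1   # incumbent serves all traffic; Apertus is scored only
    CANARY = 2   # Apertus serves a slice of live traffic
    FULL = 3     # Apertus serves all traffic


def next_stage(current: Stage, gate_green: bool) -> Stage:
    """Advance one stage while the gate is green; otherwise roll back to shadow."""
    if not gate_green:
        return Stage.SHADOW
    return Stage(min(current.value + 1, Stage.FULL.value))
```

For example, `next_stage(Stage.CANARY, gate_green=False)` returns `Stage.SHADOW`, which puts the incumbent back in front of every user in one transition.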

Where this connects across the Apertus track

The target inference side is handled by on-prem Apertus deployment or Swiss sovereign hosting. When prompts cannot clear the bar by rewrite, we add a domain fine-tune behind the new prompt. Discovery opens via the Apertus hub or AI consulting.

Frequently Asked Questions

  • The driver is sovereignty, supplier risk and cost, not quality envy. Apertus ships open weights under Apache 2.0 and runs on Swiss-resident infrastructure, so inference, prompts and logs stay in the country. That satisfies a board or regulator mandate that a US API cannot.

Interested in a solution?

We are glad to show you various options without any obligation.

Roland Kurmann

CEO, SAPIENTROQ
