OpenAI to Apertus Migration
What an Apertus migration delivers
Audit of the existing stack
We walk through the production OpenAI or Anthropic integration end to end — prompts, system messages, tool-use schemas, retry behaviour, rate-limit handling, eval sets, the calling code path and the cost line. The audit produces a portability map for every prompt and a documented baseline that the migration is later measured against.
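As an illustration of what a per-prompt portability map might look like, here is a minimal Python sketch. The three classes mirror the outcomes described on this page (light tuning, full rewrite, fine-tune behind the prompt). The record fields, the `classify` heuristic and the 1500-token threshold are assumptions made for the example, not the audit's actual criteria. A first-pass label like this would still need to be confirmed against the eval set.

```python
from dataclasses import dataclass
from enum import Enum


class Portability(Enum):
    LIGHT_TUNING = "ports with light tuning"
    FULL_REWRITE = "needs a full rewrite"
    FINE_TUNE = "needs a fine-tune behind the prompt"


@dataclass(frozen=True)
class PromptRecord:
    # Hypothetical audit fields, chosen for illustration only.
    prompt_id: str
    uses_tools: bool
    token_length: int
    chain_of_thought: bool
    domain_heavy: bool


def classify(p: PromptRecord, long_prompt_tokens: int = 1500) -> Portability:
    """First-pass guess; the threshold is an assumed placeholder."""
    if p.domain_heavy:
        return Portability.FINE_TUNE
    if p.chain_of_thought and p.token_length >= long_prompt_tokens:
        return Portability.FULL_REWRITE
    return Portability.LIGHT_TUNING


def portability_map(prompts):
    """Map every prompt id to its first-pass portability class."""
    return {p.prompt_id: classify(p) for p in prompts}
```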
Eval set as migration contract
Before any model is swapped, the eval set is rebuilt and frozen. Real production inputs, gold-standard outputs, edge cases and regression items become the contract. A migration is two-thirds engineering and one-third evaluation, and the evaluation decides when cutover is allowed.
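One possible way to make "frozen" checkable is to fingerprint the eval set at sign-off and verify that fingerprint before every run. The sketch below assumes a simple item shape (`EvalItem`) and uses a SHA-256 hash over a canonical JSON form. Both the item fields and the helper names are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvalItem:
    item_id: str
    kind: str         # e.g. "production", "edge_case" or "regression"
    input_text: str
    gold_output: str


def fingerprint(items):
    """SHA-256 over a canonical JSON form, independent of item order."""
    canonical = json.dumps(
        sorted((asdict(i) for i in items), key=lambda d: d["item_id"]),
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_frozen(items, signed_off_fingerprint):
    """Refuse to run if the eval set no longer matches the signed-off version."""
    if fingerprint(items) != signed_off_fingerprint:
        raise ValueError("eval set differs from the signed-off version")
```

Storing the fingerprint alongside the sign-off record means any later edit to an input or a gold output is detected before scores are compared.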
Prompt rewrite for Apertus
Prompts written against a specific frontier model do not always survive a swap. We rewrite system messages, few-shot examples and tool schemas for Apertus, prompt by prompt, against the frozen eval set. Where a prompt cannot reach the bar by rewrite alone, we flag it for a domain fine-tune behind the new prompt.
Side-by-side quality benchmark
Apertus and the incumbent run the same inputs in parallel. Scoring covers task quality against the frozen rubric, p50 and p95 latency, and cost per task on Swiss-resident hosting. The result is a published cost-per-task table that lets your team decide on real numbers, not slide-deck math or vendor brochure claims.
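To show how one row of such a table could be computed, here is a small sketch that summarizes a run into mean rubric score, p50 and p95 latency, and cost per task. The `RunResult` fields and the nearest-rank percentile method are assumptions for the example. Real benchmarks may use a different percentile definition.

```python
import math
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunResult:
    quality: float     # rubric score, assumed to lie in [0, 1]
    latency_ms: float
    cost: float        # cost of this single task, in one currency


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(results):
    """Collapse one system's results into a single table row."""
    latencies = [r.latency_ms for r in results]
    return {
        "quality": mean(r.quality for r in results),
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "cost_per_task": sum(r.cost for r in results) / len(results),
    }
```

Running `summarize` once per system on the same frozen inputs gives two rows that can be compared directly.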
Shadow-traffic phase pre-launch
Once benchmarks clear the bar, Apertus runs alongside the incumbent on real production requests, with its responses scored but not served. The shadow phase exposes drift the eval set could not catch and lets us tune routing, caching and tool fallbacks against live load before a single end-user sees the new model in their UI.
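The "scored but not served" pattern can be sketched as a thin wrapper around the request handler. In this illustrative version, `incumbent`, `candidate` and `score` are hypothetical callables supplied by the caller, and `log` is a plain list. The sketch is synchronous for readability; a production setup would more likely run the shadow call off the request path so it adds no user-facing latency.

```python
def make_shadow_handler(incumbent, candidate, score, log):
    """Return a handler that serves the incumbent and only scores the candidate."""

    def handle(request):
        served = incumbent(request)
        try:
            shadow = candidate(request)
            log.append(
                {"request": request, "score": score(request, shadow), "error": None}
            )
        except Exception as exc:
            # A failing shadow call must never affect the served response.
            log.append({"request": request, "score": None, "error": repr(exc)})
        return served

    return handle
```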
Controlled cutover with rollback
Cutover is a defined event, not a flag flip. A canary slice takes the first live traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green, and rollback to the incumbent remains one configuration step away throughout. The freeze window and gate criteria are documented.
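A rough sketch of canary routing with a per-response gate and a one-step rollback follows. The names (`CutoverConfig`, `serve`, `passes_gate`) are hypothetical, and falling back to the incumbent when a response fails the gate is an assumption of this example rather than a documented behaviour.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CutoverConfig:
    canary_percent: int = 0   # 0 keeps all traffic on the incumbent


def in_canary(request_id, config):
    """Deterministic bucketing: the same request id always lands in the same slice."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < config.canary_percent


def serve(request_id, request, config, incumbent, candidate, passes_gate):
    """Serve the candidate only inside the canary slice and only if the gate passes."""
    if in_canary(request_id, config):
        response = candidate(request)
        if passes_gate(request, response):
            return response
        # Assumed policy: a gate failure falls through to the incumbent.
    return incumbent(request)


def rollback(config):
    """The single configuration step: send all traffic back to the incumbent."""
    config.canary_percent = 0
```

Hashing the request id rather than sampling at random keeps routing stable across retries, which makes canary incidents easier to reproduce.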
How a migration runs
Audit the integration
We map the production OpenAI or Anthropic integration — prompts, tool schemas, retries, eval sets, regression tests and the cost line — and deliver a portability classification per prompt before any swap is scheduled.
Freeze the eval set
The eval set is rebuilt as the migration contract: real production inputs, gold outputs, edge cases and regression items. It is signed off and frozen before any prompt is touched, and it decides when cutover is allowed.
Rewrite prompts
System messages, few-shot examples and tool schemas are rewritten prompt by prompt against the frozen eval set. Prompts that cannot clear the bar by rewrite alone are marked for a domain fine-tune behind the new prompt.
Run the benchmark
Apertus and the incumbent score the same inputs side by side on quality, p50 and p95 latency and cost per task. The published table is the engagement deliverable; your CTO and CFO decide on real numbers from it.
Shadow live traffic
Apertus runs alongside the incumbent on real requests, scored but not served. The shadow phase catches drift the eval set missed and lets us tune routing, caching and tool fallbacks before any end-user touches the new model.
Canary and rollback
A canary slice takes the first live traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green; rollback to the incumbent stays one step away through the freeze window.
Why the eval set is the contract
The eval set is the contract, not the model
Most migrations fail because the team treats the model swap as the deliverable and the evaluation as paperwork. We flip it. The frozen eval set — real production inputs, gold-standard outputs, edge cases and regression items — is the only thing that decides when cutover is allowed. Apertus, the incumbent and any future model are measured against the same rubric, signed off before a prompt is touched. The paid Apertus evaluation POC builds this contract before a full migration.
Two-thirds engineering, one-third evaluation
A model migration is two-thirds engineering and one-third evaluation. We rebuild the eval set first, freeze it as the contract, and only swap models once the regression suite says the new system meets the old one's quality bar. The split protects both sides — engineering has a real target, and evaluation has the budget to catch regression instead of rubber-stamping the swap.
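What "meets the old one's quality bar" could mean in code is sketched below. The rule used here (candidate mean at least the baseline mean, and no regression item scoring below the baseline) is one plausible reading, assumed for illustration. The function name and the `tolerance` parameter are hypothetical.

```python
def cutover_allowed(baseline_scores, candidate_scores, regression_ids, tolerance=0.0):
    """Compare two systems scored on the same frozen items.

    Both score arguments map item_id -> rubric score.
    Returns (allowed, regressed_item_ids).
    """
    if not baseline_scores:
        raise ValueError("eval set is empty")
    if baseline_scores.keys() != candidate_scores.keys():
        raise ValueError("both systems must be scored on the same frozen items")

    n = len(baseline_scores)
    base_mean = sum(baseline_scores.values()) / n
    cand_mean = sum(candidate_scores.values()) / n

    # Regression items are checked one by one, not only on average.
    regressed = [
        i for i in regression_ids
        if candidate_scores[i] < baseline_scores[i] - tolerance
    ]
    allowed = cand_mean >= base_mean - tolerance and not regressed
    return allowed, regressed
```

Checking regression items individually keeps a strong average from hiding a known failure that has come back.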
Shadow, canary and full switch
Cutover is staged, not flipped. Shadow traffic runs Apertus on real requests with responses scored but not served, so we catch drift under live load. A canary slice then takes a percentage of traffic with the eval gate enforced on every response. Full switch follows only when the canary stays green, and rollback to the incumbent is one step away the entire time.
Where this connects across the Apertus track
The target inference side is handled by on-prem Apertus deployment or Swiss sovereign hosting. When prompts cannot clear the bar by rewrite, we add a domain fine-tune behind the new prompt. Discovery opens via the Apertus hub or AI consulting.
Frequently Asked Questions
The driver is sovereignty, supplier risk and cost, not quality envy. Apertus is an open-weights model released under Apache 2.0 and runs on Swiss-resident infrastructure, so inference, prompts and logs stay in the country. That satisfies a board or regulator mandate that a US-hosted API cannot.
The audit walks the integration end to end: prompts, system messages, tool-use schemas, retry behaviour, eval sets, regression tests and the calling code path. We rebuild the eval set as the migration contract before any model swap is scheduled.
Not every prompt migrates cleanly. Short structured prompts with tool calls usually port with light tuning. Long chain-of-thought prompts written for a specific frontier model often need a full rewrite for Apertus, and domain-heavy tasks may need a fine-tune behind the prompt.
We freeze the eval set as the contract before any model swap. Side-by-side runs compare Apertus and the incumbent on the same inputs, scored on quality, latency and cost. The regression suite must clear the old system's quality bar before the cutover gate opens.
We do not publish a headline number — the delta depends on prompt length, tool-use depth, throughput pattern and hosting. The audit produces a cost-per-task table comparing the incumbent API line against Apertus on Swiss-resident hosting at your real workload.
Default path is shadow traffic first: Apertus runs alongside the incumbent on real requests, responses scored but not served. Once the regression suite holds, a canary slice takes live traffic. Full switch follows only after the canary stays green.
About SAPIENTROQ
Interested in a solution?
We are glad to show you various options without any obligation.

Roland Kurmann
CEO, SAPIENTROQ