WEITA AG
Weita Supplier Onboarding Workflow: Wholesale Digital Transformation
About the Project
How SAPIENTROQ built a Laravel and Next.js workflow that turns PDFs, emails, images and spreadsheets into compliant PIM products for Weita AG, with Mistral OCR, OpenAI JSON-mode extraction and a database-backed prompt registry.
Country: Switzerland
Industry: Wholesale & Distribution
Development Hours: 1500+
Team Size: 5-7
The challenge: heterogeneous supplier input meets regulated labels
Weita AG is a Swiss wholesaler in healthcare, cleaning, safety, hygiene and non-food. Supplier input arrives as PDFs, pasted email threads, product images and Excel spreadsheets, and one product often spans all four formats at once. The output has to be a fully attributed, compliant PIM product whose category-required labels are demonstrably satisfied. That makes it a real Swiss wholesale PIM implementation problem, not a content-tagging exercise.
The previous flow leaned on email and manual entry across Content Management, Marketing and Support. Provenance was thin, compliance evidence lived in a spreadsheet next to the system of record, and a missed label only surfaced downstream. The brief was to replace that with a deterministic master data management workflow that an auditor can read.
Why a state machine, not an ad-hoc job queue
A job queue tells you what ran. A state machine tells you where a product is, who owns the next move and which transitions are legal from here. For a regulated wholesale digital transformation, that distinction is the audit trail.
Every product walks an explicit state list: Initiation, CategorizationProcessing, CategorizationReady, ExtractionProcessing, CM, Marketing, Support, Finalized — with dedicated Failed states for both AI phases. Each state knows its legal next transitions and the role that owns it. Nothing slips sideways because the workflow refuses any transition that is not on the list.
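The transition rule above can be sketched in a few lines of TypeScript. This is an illustration only, not the production Laravel code: the names of the two Failed states, the `System` role, the owners of the non-human states and the retry edges out of the Failed states are all assumptions made for the example.

```typescript
// States named in the case study. "CategorizationFailed" and
// "ExtractionFailed" are assumed names for the two Failed states.
type WorkflowState =
  | "Initiation"
  | "CategorizationProcessing"
  | "CategorizationFailed"
  | "CategorizationReady"
  | "ExtractionProcessing"
  | "ExtractionFailed"
  | "CM"
  | "Marketing"
  | "Support"
  | "Finalized";

// "System" is a hypothetical owner for states driven by async jobs.
type Role = "System" | "ContentManager" | "Marketing" | "Support";

interface StateRule {
  owner: Role;           // the role allowed to move a product out of this state
  next: WorkflowState[]; // the only legal target states
}

// Owners of CM, Marketing and Support follow the text; the rest are assumptions.
const RULES: Record<WorkflowState, StateRule> = {
  Initiation: { owner: "ContentManager", next: ["CategorizationProcessing"] },
  CategorizationProcessing: { owner: "System", next: ["CategorizationReady", "CategorizationFailed"] },
  CategorizationFailed: { owner: "System", next: ["CategorizationProcessing"] },
  CategorizationReady: { owner: "ContentManager", next: ["ExtractionProcessing"] },
  ExtractionProcessing: { owner: "System", next: ["CM", "ExtractionFailed"] },
  ExtractionFailed: { owner: "System", next: ["ExtractionProcessing"] },
  CM: { owner: "ContentManager", next: ["Marketing"] },
  Marketing: { owner: "Marketing", next: ["Support"] },
  Support: { owner: "Support", next: ["Finalized"] },
  Finalized: { owner: "System", next: [] }, // terminal: no way out
};

// Returns the new state, or throws if the move is not on the list
// or the actor does not own the current state.
function transition(from: WorkflowState, to: WorkflowState, actor: Role): WorkflowState {
  const rule = RULES[from];
  if (rule.next.indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  if (rule.owner !== actor) {
    throw new Error(`${actor} does not own state ${from}`);
  }
  return to;
}
```

With this table, `transition("CM", "Marketing", "ContentManager")` succeeds, while `transition("CM", "Finalized", "ContentManager")` throws because Finalized is not in the legal set for CM.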
Architecture: Laravel 12 modular monolith, Next.js 13, Mistral OCR + OpenAI JSON-mode
Two deployables. The backend is a Laravel 12 modular monolith with nine business modules — Auth, User, Category, Attribute, Label, DocumentType, Workflow, AI and Products — on PostgreSQL 18, Redis 7 and Horizon for queue supervisors. The frontend is Next.js 13 App Router with PrimeReact and i18next.
AI never blocks an HTTP request. Each supplier file runs through async jobs (ProcessWorkflowFileWithMistralJob, RunWorkflowCategorizationJob, RunWorkflowExtractionJob), with Mistral OCR called in two passes per file (raw plus structured) and OpenAI in JSON mode for both categorization and extraction. Prompts live in a database-backed prompt registry keyed by a PromptKey enum, so a prompt change ships without a redeploy. Sanctum handles auth, and Flysystem talks to three S3 disks: supplier docs, product docs and product images. Local development runs on Laravel Sail.
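To make the prompt registry concrete, here is a minimal TypeScript sketch of the lookup pattern. It is an assumption-laden illustration, not the actual AI module: the enum cases, the `PromptRow` fields and the `PromptRegistry` class are hypothetical, and an in-memory array stands in for the database table.

```typescript
// Hypothetical enum cases; the case study only says prompts are keyed by a PromptKey enum.
enum PromptKey {
  Categorization = "categorization",
  Extraction = "extraction",
}

// One row per prompt version. Field names are assumptions.
interface PromptRow {
  key: PromptKey;
  version: number;
  body: string;
  active: boolean;
}

class PromptRegistry {
  // Stands in for the database table that holds the prompts.
  private rows: PromptRow[] = [];

  // Publishing a new body deactivates older versions but keeps them,
  // so the history stays available for an audit.
  publish(key: PromptKey, body: string): PromptRow {
    let latest = 0;
    for (const row of this.rows) {
      if (row.key === key) {
        row.active = false;
        if (row.version > latest) latest = row.version;
      }
    }
    const created: PromptRow = { key, version: latest + 1, body, active: true };
    this.rows.push(created);
    return created;
  }

  // Called at job run time, so the next job picks up the latest active prompt.
  resolve(key: PromptKey): PromptRow {
    for (const row of this.rows) {
      if (row.key === key && row.active) return row;
    }
    throw new Error(`No active prompt for key "${key}"`);
  }

  // Every version ever published for a key, oldest first.
  history(key: PromptKey): PromptRow[] {
    return this.rows.filter((row) => row.key === key);
  }
}
```

Because `resolve` runs when the job runs rather than when the application boots, editing a row changes behaviour on the next job without shipping code. A job could also store the resolved `version` next to its result, which is one way to let an auditor see which prompt was active for a given call.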
Role handoffs and the audit trail
Compliant document automation is not a matter of one good model call — it is a chain of human decisions logged against the same record. The workflow enforces who can do what at each step.
- Content Manager owns the CM state: confirms category, fills attributes the AI did not resolve.
- Marketing owns the Marketing state: tone, multilingual descriptions, channel-ready copy.
- Support owns the Support state: final checks before publication.
- Finalized is terminal: the product is live in the PIM and the workflow row is the audit record.
Both AI phases have their own Failed states, each with a retry-ready details column, so a failed model call never leaves a corrupt record behind. An auditor reading the workflow row sees the full path, the role on every transition and the AI evidence pack that informed each call.
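One way to picture the workflow row as an audit record is an append-only history where each move stores the role and, for a failed AI phase, the retry details. The TypeScript below is a hedged sketch: `WorkflowRow`, `AuditEntry`, `FailureDetails`, the `System` role and the state names built from the phase name are hypothetical, not the real schema.

```typescript
type AiPhase = "Categorization" | "Extraction";

// Assumed shape of the retry-ready details stored with a Failed state.
interface FailureDetails {
  phase: AiPhase;
  reason: string;
  attempt: number;
}

interface AuditEntry {
  from: string;
  to: string;
  role: string;
  at: string; // ISO timestamp supplied by the caller
  details: FailureDetails | null;
}

interface WorkflowRow {
  productId: string;
  state: string;
  history: AuditEntry[];
}

// Every move returns a new row with one more audit entry; nothing is overwritten.
function move(row: WorkflowRow, to: string, role: string, at: string,
              details: FailureDetails | null = null): WorkflowRow {
  const entry: AuditEntry = { from: row.state, to, role, at, details };
  return { productId: row.productId, state: to, history: row.history.concat([entry]) };
}

// A failed AI call lands in "<Phase>Failed" with details, instead of a half-written product.
function failAiPhase(row: WorkflowRow, phase: AiPhase, reason: string, at: string): WorkflowRow {
  let attempt = 1;
  for (const entry of row.history) {
    if (entry.details !== null && entry.details.phase === phase) attempt++;
  }
  return move(row, phase + "Failed", "System", at, { phase, reason, attempt });
}

// A retry goes back to "<Phase>Processing"; the failure stays in the history.
function retryAiPhase(row: WorkflowRow, role: string, at: string): WorkflowRow {
  const last = row.history[row.history.length - 1];
  if (last === undefined || last.details === null || row.state !== last.details.phase + "Failed") {
    throw new Error("Row is not in a Failed state");
  }
  return move(row, last.details.phase + "Processing", role, at);
}
```

Reading `history` top to bottom gives the full path of the product, including every failed attempt and who triggered each retry.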
Live compliance engine for product labels
Swiss wholesale categories carry required labels — safety pictograms, hazard statements, hygiene markings, regulatory references. The Label module declares which labels each category requires; the compliance engine derives live satisfaction from the actual product attributes for that category.
If a category demands a label and the supplier pack does not evidence it, the product cannot advance to Finalized. The AI context builder also assembles the evidence pack for each model call, so categorization and extraction always see the same attribute and label context a reviewer would. This is what makes the compliance automation defensible rather than decorative.
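The derivation can be sketched as a pure function over the category's required labels and the product's current attributes. The TypeScript below is illustrative only: the category slug, label names, attribute names and the rule that a label counts as evidenced when all of its listed attributes are non-empty are assumptions, not the Label module's real schema.

```typescript
// A required label and the attributes that count as evidence for it (assumed model).
interface LabelRequirement {
  label: string;
  evidenceAttributes: string[];
}

// Hypothetical declaration of what each category requires.
const REQUIRED_LABELS: Record<string, LabelRequirement[]> = {
  "cleaning-chemicals": [
    { label: "hazard-statement", evidenceAttributes: ["hazard_statements"] },
    { label: "safety-pictogram", evidenceAttributes: ["pictogram_codes"] },
  ],
};

interface ComplianceResult {
  satisfied: string[];
  missing: string[];
  canFinalize: boolean;
}

// Derived on demand from the live attributes, never stored as a flag,
// so editing an attribute immediately changes the result.
function deriveCompliance(category: string, attributes: Record<string, string>): ComplianceResult {
  const requirements = REQUIRED_LABELS[category] || [];
  const satisfied: string[] = [];
  const missing: string[] = [];
  for (const requirement of requirements) {
    let evidenced = requirement.evidenceAttributes.length > 0;
    for (const name of requirement.evidenceAttributes) {
      const value = attributes[name];
      if (value === undefined || value.trim() === "") evidenced = false;
    }
    if (evidenced) {
      satisfied.push(requirement.label);
    } else {
      missing.push(requirement.label);
    }
  }
  // Assumption: a category with no declared requirements does not block finalization.
  return { satisfied, missing, canFinalize: missing.length === 0 };
}
```

In this sketch the transition into Finalized would check `canFinalize` first, and `missing` tells the reviewer exactly which label still lacks evidence.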
Delivered value, and what this means for your team
What changed: ad-hoc email and manual entry replaced by a deterministic, auditable workflow; AI-assisted categorization and attribute extraction with prompt-registry governance; enforced role ownership at every step; live compliance evidence tracked against category-required labels; AI failures contained in dedicated Failed states instead of polluting good records.
This is the second SAPIENTROQ engagement for Weita. We previously delivered the API-gateway Interim-CIO engagement for the same client; this workflow project is a separate, later scope. If you are a Head of Digitalisation or Head of Master Data at a Swiss wholesaler and you recognise the supplier-onboarding pain, read the FAQ below or book a discovery call to plan your supplier onboarding workflow.
Solutions in this engagement
- Workflow state machine with explicit states and legal transitions.
- Mistral OCR two-pass plus OpenAI JSON-mode extraction in async jobs.
- Database-backed prompt registry keyed by a PromptKey enum.
- Role-gated CM, Marketing and Support states with Failed branches.
- AI supplier onboarding wired into a deterministic state machine.
Delivered Value
- Deterministic, auditable supplier onboarding end-to-end.
- Live compliance derivation against category-required labels.
- Prompt changes ship without a redeploy.
- AI failures contained in retry-ready Failed states.
Frequently asked about this engagement
Each supplier file lands on an S3 disk and triggers an async pipeline: a Mistral OCR pass extracts raw text, a second pass returns a structured view, then OpenAI in JSON mode runs categorization and attribute extraction. The result flows into the workflow record, which the Content Manager, Marketing and Support roles then complete inside the state machine.
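The pipeline order described above can be sketched with an in-memory queue. This TypeScript is a simplified stand-in for the queued Laravel jobs, not their real code: the `AiClients` interface replaces the actual Mistral and OpenAI calls, the record fields and JSON shapes are hypothetical, and the sketch moves straight from categorization to extraction without pausing in CategorizationReady.

```typescript
// Hypothetical client interface standing in for the OCR and LLM providers.
interface AiClients {
  ocrRaw(fileKey: string): string;
  ocrStructured(fileKey: string): string;
  categorize(documentText: string): string;                // returns JSON text
  extract(documentText: string, category: string): string; // returns JSON text
}

interface WorkflowRecord {
  fileKey: string;
  state: string;
  rawText: string;
  structured: string;
  category: string;
  attributes: Record<string, string>;
  failure: string;
}

type Job = () => void;
const jobQueue: Job[] = []; // stands in for the Redis-backed queue

// The request handler only enqueues work and returns; no AI call happens here.
function handleUpload(fileKey: string, clients: AiClients): WorkflowRecord {
  const record: WorkflowRecord = {
    fileKey, state: "Initiation", rawText: "", structured: "",
    category: "", attributes: {}, failure: "",
  };
  jobQueue.push(() => processFileWithOcr(record, clients));
  return record;
}

// Two OCR passes per file: raw text first, then the structured view.
function processFileWithOcr(record: WorkflowRecord, clients: AiClients): void {
  record.rawText = clients.ocrRaw(record.fileKey);
  record.structured = clients.ocrStructured(record.fileKey);
  record.state = "CategorizationProcessing";
  jobQueue.push(() => runCategorization(record, clients));
}

function runCategorization(record: WorkflowRecord, clients: AiClients): void {
  try {
    const parsed = JSON.parse(clients.categorize(record.structured));
    if (typeof parsed.category !== "string") throw new Error("category missing from JSON");
    record.category = parsed.category;
    record.state = "ExtractionProcessing";
    jobQueue.push(() => runExtraction(record, clients));
  } catch (err) {
    record.state = "CategorizationFailed"; // assumed state name
    record.failure = String(err);
  }
}

function runExtraction(record: WorkflowRecord, clients: AiClients): void {
  try {
    const parsed = JSON.parse(clients.extract(record.structured, record.category));
    if (typeof parsed.attributes !== "object" || parsed.attributes === null) {
      throw new Error("attributes missing from JSON");
    }
    record.attributes = parsed.attributes;
    record.state = "CM"; // hand over to the Content Manager
  } catch (err) {
    record.state = "ExtractionFailed"; // assumed state name
    record.failure = String(err);
  }
}

// A worker loop; in production a queue supervisor would play this role.
function drainQueue(): void {
  while (jobQueue.length > 0) {
    const job = jobQueue.shift();
    if (job !== undefined) job();
  }
}
```

Directly after `handleUpload` returns, the record is still in Initiation with one job waiting; only `drainQueue` moves it forward. Malformed JSON from either model call parks the record in the matching Failed state rather than writing partial data.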
A queue answers what ran; a state machine answers where the product is, who owns the next move and which transitions are legal. With explicit Initiation, Categorization, Extraction, CM, Marketing, Support and Finalized states, plus dedicated Failed branches for both AI phases, the workflow row is the audit trail. Nothing advances by accident — only by a legal transition.
Prompts live in a database-backed prompt registry keyed by a PromptKey enum. The AI module resolves the active prompt by key at call time, so a change in the registry takes effect on the next job run. Versioning is preserved on the row, which lets an auditor trace which prompt was active when a given categorization or extraction happened.
Each state declares the role that owns it and the next legal states. Content Manager confirms category and missing attributes, Marketing completes channel-ready copy, Support runs final checks before Finalized. The workflow refuses any transition outside the legal set, and every move is logged against the row — so the audit trail is the system of record, not a spreadsheet next to it.
The Label module declares required labels per category. The compliance engine derives live satisfaction from the product attributes attached during extraction and review. If a required label is unevidenced, the product is blocked from Finalized, and the AI context builder makes sure each model call sees the same label and attribute context a human reviewer would.
About SAPIENTROQ
Roland Kurmann
CEO, SAPIENTROQ


