HR & Payroll Document Automation
HR document families we automate
Payslip ingestion and archive
Monthly payslip PDFs and scanned legacy archives are normalized into one extraction schema — gross, net, AHV/ALV/NBU deductions, pension contributions, bonus lines, year-to-date totals. Swiss-German payroll-provider templates and bilingual EN/DE/FR payslips run through the same pipeline. Output writes back into the HRIS payroll module or a structured archive.
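As an illustration of what one normalized payslip record might look like, here is a minimal Python sketch. The class name, field names and the sample amounts used with it are hypothetical, and the consistency rule (net equals gross minus the listed deductions) is a simplifying assumption; real payslips often carry further lines such as allowances or expense reimbursements.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class Payslip:
    """Hypothetical normalized payslip record; all field names are illustrative."""
    gross: Decimal
    net: Decimal
    ahv: Decimal       # AHV deduction
    alv: Decimal       # ALV deduction
    nbu: Decimal       # NBU deduction
    pension: Decimal   # pension contribution
    bonus_lines: List[Decimal] = field(default_factory=list)
    ytd_gross: Optional[Decimal] = None  # year-to-date total, if printed

    def total_deductions(self) -> Decimal:
        return self.ahv + self.alv + self.nbu + self.pension

    def is_consistent(self, tolerance: Decimal = Decimal("0.05")) -> bool:
        # Assumption: net = gross - listed deductions, within a rounding tolerance.
        return abs(self.gross - self.total_deductions() - self.net) <= tolerance
```

A check like `is_consistent()` is one possible way to flag a payslip for human review when the extracted figures do not add up.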
Employment contract clause extraction
Contracts arrive as signed scans, digital PDFs and Word exports. The pipeline classifies the contract type — permanent, fixed-term, apprenticeship, internship, freelance — then extracts the clauses HR actually queries: start date, probation, notice period, salary band, working hours, non-compete, holiday entitlement. Field-level provenance points back to the contract page.
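Field-level provenance can be pictured as each extracted value carrying the page it was read from. The sketch below is only an assumption about how such a record could be shaped; `ExtractedField` and `missing_provenance` are hypothetical names, not the product's API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical extracted clause value with a pointer to its source page."""
    value: str
    page: Optional[int]  # 1-based page in the contract; None if not located


def missing_provenance(fields: Dict[str, ExtractedField]) -> List[str]:
    # Names of fields that cannot be traced back to a contract page.
    return sorted(name for name, f in fields.items() if f.page is None)
```

Under this assumption, any field returned by `missing_provenance` would be a candidate for manual review before the data is written to the HRIS.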
Work certificate (Arbeitszeugnis) structuring
Swiss work certificates carry weight long after employees leave. The pipeline extracts employer details, role and responsibilities, employment period and the structured evaluation passages — and stores them against the personnel record so HR can answer reference requests without re-reading the PDF.
Sick notes and AHV/IV correspondence
Sick notes (Arztzeugnis) and AHV/IV statements arrive on paper from doctors, cantonal offices and insurers. The pipeline classifies the document, extracts the absence window or benefit reference, and routes it to the right HR or payroll role for action — without forcing the HR team to retype dates and case numbers.
Expense receipts and travel claims
Expense receipts (Spesenbelege), such as restaurant, taxi, hotel, fuel and foreign-currency bills, run through the two-pass Mistral OCR plus a typed extraction step: merchant, date, amount, VAT, category. Submissions land in the approver queue with the original image attached for a human-in-the-loop (HITL) spot-check before posting to payroll.
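A typed extraction step could, for example, turn the model's JSON output into a validated record. The following Python sketch assumes a JSON payload with the keys shown; the `Receipt` class, the `ExpenseCategory` values, the `currency` key and the VAT sanity check are illustrative assumptions rather than the actual schema.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum


class ExpenseCategory(Enum):
    RESTAURANT = "restaurant"
    TAXI = "taxi"
    HOTEL = "hotel"
    FUEL = "fuel"


@dataclass(frozen=True)
class Receipt:
    merchant: str
    issued_on: date
    amount: Decimal
    vat: Decimal
    currency: str
    category: ExpenseCategory


def parse_receipt(raw_json: str) -> Receipt:
    """Parse a JSON extraction result into a typed Receipt, or raise ValueError."""
    data = json.loads(raw_json)
    receipt = Receipt(
        merchant=str(data["merchant"]).strip(),
        issued_on=date.fromisoformat(data["date"]),
        amount=Decimal(str(data["amount"])),
        vat=Decimal(str(data["vat"])),
        currency=str(data["currency"]).upper(),
        category=ExpenseCategory(data["category"]),  # unknown category -> ValueError
    )
    if not receipt.merchant:
        raise ValueError("merchant is empty")
    if receipt.vat < 0 or receipt.vat > receipt.amount:
        raise ValueError("VAT must lie between 0 and the total amount")
    return receipt
```

Rejecting malformed values at this point means the approver only sees submissions whose fields already have the expected types.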
New HR document families as configuration
Work permits, training records, performance reviews, parental-leave forms, bonus letters — each new family is a configuration sprint, not a code release. HR operations defines the Category, FieldBlock and reviewer role; the pipeline picks the new family up at the next inbound document.
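To make "configuration, not a code release" concrete, the sketch below models a new family purely as data. The text names Category, FieldBlock and Field; their attributes here, the `FamilyRegistry` class and the work-permit example fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Field:
    name: str
    required: bool = True


@dataclass(frozen=True)
class FieldBlock:
    name: str
    reviewer_role: str
    fields: Tuple[Field, ...]


@dataclass(frozen=True)
class Category:
    key: str
    blocks: Tuple[FieldBlock, ...]


class FamilyRegistry:
    """Hypothetical in-memory registry of configured document families."""

    def __init__(self) -> None:
        self._categories: Dict[str, Category] = {}

    def register(self, category: Category) -> None:
        if category.key in self._categories:
            raise ValueError(f"family already registered: {category.key}")
        self._categories[category.key] = category

    def required_fields(self, key: str) -> List[str]:
        category = self._categories[key]
        return [f.name for b in category.blocks for f in b.fields if f.required]


# Adding a family is a data change: no new pipeline code is written.
work_permit = Category(
    key="work_permit",
    blocks=(
        FieldBlock(
            name="permit_details",
            reviewer_role="hr",
            fields=(
                Field("permit_type"),
                Field("valid_until"),
                Field("issuing_canton", required=False),
            ),
        ),
    ),
)
registry = FamilyRegistry()
registry.register(work_permit)
```

In this sketch, the pipeline would look up `registry.required_fields("work_permit")` when the next document of that family arrives.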
How we deliver HR document automation
HR document inventory and target HRIS mapping
We walk through the inbound HR mailbox and the scanned archive: which families arrive, in what volume, from which channels (provider portals, postal scans, email PDFs, employee uploads). We map each family to its target field in your HRIS or payroll system and lock the extraction schema before any code ships.
Pilot family in production
One HR family goes live end-to-end — usually payslip ingestion or work certificates, whichever has the highest manual cost today. Ingestion, two-pass OCR, classification, JSON-mode extraction, HITL review surface, HRIS write-out. The pilot runs on your real documents, not a sandbox sample.
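The stage sequence above can be sketched as a simple plan per inbound document. The stage names and the `has_text_layer` flag are hypothetical; the sketch assumes that digital PDFs with a text layer skip OCR (as the FAQ states) and that the stages otherwise run in the order listed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InboundDocument:
    filename: str
    has_text_layer: bool  # assumed to be detected upstream at ingestion


def plan_stages(doc: InboundDocument) -> List[str]:
    """Return the ordered stage names a document would pass through."""
    stages = ["ingest"]
    if not doc.has_text_layer:
        # Scans and images need both OCR passes: raw first, then structured.
        stages += ["ocr_raw", "ocr_structured"]
    stages += ["classify", "extract_json", "hitl_review", "hris_write"]
    return stages
```

Because the branch depends only on the document itself, a mixed inbox of digital and scanned files needs no pre-sorting in this model.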
Role-scoped HITL for HR and payroll
HR-sensitive fields and payroll-sensitive fields rarely belong to the same reviewer. We bind the review surface to your roles: the HR generalist sees the contract and certificate fields, the payroll specialist sees the wage and deduction lines, and supervisors see only what they need to approve. Every edit is audited and tied to the source document.
Scale across the HR cabinet
Once the pilot family is stable, new families come online as configuration: training records, work permits, parental-leave forms, performance reviews. The pipeline inherits classification, extraction and HITL behaviour from the schema — no developer release for each new document type.
Why our engine fits Swiss HR
Swiss-German tolerance baked in
Swiss-German wording, cantonal provider templates and dialect-flavoured Belege (receipts) are the norm. The OCR layer runs two Mistral passes, raw and then structured, and the prompts treat Swiss-German as standard German (DE). French and Italian extend through the prompt registry.
Role-scoped HITL splits HR, payroll and supervisor
HR documents carry fields that should not share a queue. Wage and deduction lines belong to payroll. Contract clauses and certificate text belong to HR. Absences and sign-offs belong to the line manager. The review surface binds to your roles, so payroll never sees a sealed work certificate and HR never approves a wage correction. Every field-level edit is audited.
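One way to picture role-scoped review with a field-level audit trail is the sketch below. The block names, role names and function signatures are hypothetical; the sketch only assumes that each block maps to exactly one reviewer role and that every accepted edit is logged.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical binding of field blocks to reviewer roles.
BLOCK_ROLES: Dict[str, str] = {
    "wage_lines": "payroll",
    "deductions": "payroll",
    "contract_clauses": "hr",
    "certificate_text": "hr",
    "absence_signoff": "line_manager",
}

Document = Dict[str, Dict[str, str]]  # block name -> {field name -> value}


@dataclass(frozen=True)
class AuditEntry:
    user: str
    block: str
    field_name: str
    old_value: str
    new_value: str


def visible_blocks(document: Document, role: str) -> Document:
    # A reviewer only receives the blocks bound to their role.
    return {b: f for b, f in document.items() if BLOCK_ROLES.get(b) == role}


def apply_edit(document: Document, audit_log: List[AuditEntry], user: str,
               role: str, block: str, field_name: str, new_value: str) -> None:
    if BLOCK_ROLES.get(block) != role:
        raise PermissionError(f"role {role!r} may not edit block {block!r}")
    old_value = document[block][field_name]
    document[block][field_name] = new_value
    audit_log.append(AuditEntry(user, block, field_name, old_value, new_value))
```

Scoping at the block level rather than the document level is what lets one payslip or contract be split across several review queues.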
New HR families as configuration
The HR cabinet keeps growing — a parental-leave form one quarter, a training certificate the next. Category, FieldBlock and Field are admin-defined models in the same engine behind the S001 hub. HR operations adds the family, defines fields and reviewer role, and the pipeline picks it up at the next inbound document.
Swiss and EU data residency for HR content
HR data sits under the Swiss Datenschutzgesetz (DSG) for Swiss employees and GDPR for EU residents, often both at once. We deploy on Swiss, EU or on-premises infrastructure depending on the data classification. For HR teams that cannot send content to public model endpoints, the Apertus track keeps inference on Swiss servers.
Frequently Asked Questions
Swiss-German payslip terms are treated as DE at both the OCR and extraction layers. The prompt registry carries the cantonal payroll-provider variants — Abacus, custom in-house templates, third-party providers — as separate prompts that share one extraction schema. The HR team does not pre-sort documents by provider.
Start date, probation, notice period, salary band, working hours, holiday entitlement, non-compete, contract type and the parties to the contract are the standard schema. Custom clauses — bonus formulas, role-specific addenda — extend through configuration. Field-level provenance always points back to the contract page so HR can verify before the data posts to the HRIS.
The pipeline writes through a configurable connector layer. Abacus, SAP SuccessFactors, Workday and BambooHR are the common targets. For custom HRIS and in-house payroll engines, we map the extraction schema onto your API or database fields during discovery. The model and OCR layers remain swappable behind that connector.
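A configurable connector layer can be thought of as a small interface that the pipeline writes to, with one implementation per target system. The sketch below is an assumption about that shape: `HrisConnector`, `InMemoryConnector` and the field names are hypothetical, and no real Abacus, SAP SuccessFactors, Workday or BambooHR API is called.

```python
from typing import Dict, List, Protocol, Tuple


class HrisConnector(Protocol):
    """Hypothetical interface the pipeline writes to."""

    def write(self, employee_id: str, fields: Dict[str, str]) -> None: ...


class InMemoryConnector:
    """Stand-in target that renames schema fields to target-system fields."""

    def __init__(self, field_map: Dict[str, str]) -> None:
        self.field_map = field_map
        self.rows: List[Tuple[str, Dict[str, str]]] = []

    def write(self, employee_id: str, fields: Dict[str, str]) -> None:
        unmapped = sorted(set(fields) - set(self.field_map))
        if unmapped:
            raise KeyError(f"no target mapping for: {unmapped}")
        mapped = {self.field_map[k]: v for k, v in fields.items()}
        self.rows.append((employee_id, mapped))


def post_to_hris(connector: HrisConnector, employee_id: str,
                 fields: Dict[str, str]) -> None:
    # The pipeline depends only on the interface, not on a specific HRIS.
    connector.write(employee_id, fields)
```

Swapping the target system would then mean supplying a different connector and field map while the extraction schema stays unchanged.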
Swiss-resident infrastructure, EU-resident infrastructure or on-premises — chosen on the data classification, not as a single default. Swiss Datenschutzgesetz and GDPR posture sit in the deployment template. For HR teams that cannot send content to public model endpoints, the Apertus sovereign-LLM track keeps inference on Swiss-hosted servers.
Each FieldBlock is bound to a reviewer role. Wage and deduction blocks route to payroll. Contract and certificate text blocks route to HR. Supervisor sign-off blocks route to the line manager. A payroll specialist never sees a sealed work-certificate evaluation and an HR generalist never approves a wage correction. The audit trail is field-level and tied to the user.
Yes. Digital PDFs skip the OCR pass and go straight to extraction. Scanned PDFs and image attachments run through the two-pass Mistral OCR — raw first, then structured — and then through the same extraction schema. Mixed inboxes with both formats are the default assumption, not a special case.
HR-sensitive content is held at the residency tier you choose. Access is role-scoped at the FieldBlock level, not just at the document level. Every read and edit is logged. Deletion and retention follow your HR retention schedule and the legal bases under DSG and GDPR. We do not hold a Swiss labour-law or compliance certification — the architecture is built to fit your existing posture.
Configuration, not a developer release. HR operations defines the Category, FieldBlock, required fields and the reviewer role in the admin model. The pipeline picks the new family up at the next inbound document. The pattern is the same one that powers the S001 hub engine across other industries.
About SAPIENTROQ
Interested in a solution?
We are glad to show you various options without any obligation.

Roland Kurmann
CEO, SAPIENTROQ