SOC 2 · RBAC · self-hostable

Agentic document digitization

Hand it to a process, not an API.

alma reads any document with a multi-agent pipeline — typed, handwritten, tables, maps, stamps, signatures — then verifies every field against cited evidence before it ships. Nothing is fabricated.

Hand alma a document

Watch the pipeline run →

Trusted by archivists, registries and enterprise data teams

text track

“Ezekiel A. Holder”

est 0.71

vision track

“Ezekiel Braithwaite”

est 0.69

both reads below threshold → escalated to a reviewer

grantor = "Ezekiel Braithwaite" · reviewer · ¶2 L3 ✓

amber + ✓ = a human confirmed it — not the machine

Verified, not generated

Drag the seam. Watch ink become a dataset you can trust.

Left, a 1911 conveyance in fading copperplate. Right, the same page after alma — every field boxed, scored, and linked to the exact pixels it came from.

Read & scored — 47 fields · 44 auto-confirmed (machine) · 3 to review

grantor = "Ezekiel A. Holder" · est 0.96 · ¶2 L3
parish = "St. Michael" · est 0.98
parcel = Lot 7 · №1843/214 · est 0.94
consideration = £ 240 ? · est 0.71 · review

teal = machine-read & scored · amber = a human confirmed it

every value → score + citation + bounding box

Raw scan — 1911 conveyance, mixed type + handwriting

Know all men by these Presents that I, Ezekiel A. Holder of the parish of Saint Michael, planter, in consideration of the sum of…

blur · foxing · faded ink · no structure

The pipeline

Five agents. One orchestrated pass.

A document is handed to a process, not an API — detected, read on two tracks, reconciled, and shipped only with the evidence behind it.

Legend: text · vision · verified ✓ human · escalated

ship's manifest · 1907
detect & segment
text track: surname = "Kowalczyk" · est 0.68
vision track: surname = "Kowalczyk" · est 0.71
verify · reconcile: surname = “Kowalczyk” · est 0.95 · manifest p.3 · roll 1907-A
export: csv · json · API
escalate · ~3% → human ✓

01 · Detect & segment: Locate every field, table, stamp and margin note.
02 · Dual-track read: Text and vision models read each field on their own.
03 · Verify: Reconcile both reads into one scored, cited value.
04 · Export: Confirmed values ship as csv, json or API.
05 · Escalate: The doubtful ~3% goes to a human reviewer.
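
The export and escalate steps come down to a routing decision on each scored field. Here is a minimal sketch in Python, using the four conveyance fields from the 1911 example. The `Field` type, the `route` function and the 0.80 cut-off are illustrative assumptions; the page does not state alma's real threshold or internal types.

```python
from dataclasses import dataclass

# Hypothetical cut-off: the page does not state alma's real threshold.
REVIEW_THRESHOLD = 0.80


@dataclass(frozen=True)
class Field:
    key: str
    value: str
    est: float  # machine estimate in [0, 1]


def route(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into (export, escalate) by comparing est to the threshold."""
    export = [f for f in fields if f.est >= threshold]
    escalate = [f for f in fields if f.est < threshold]
    return export, escalate


fields = [
    Field("grantor", "Ezekiel A. Holder", 0.96),
    Field("parish", "St. Michael", 0.98),
    Field("parcel", "Lot 7 · №1843/214", 0.94),
    Field("consideration", "£ 240 ?", 0.71),
]

export, escalate = route(fields)
print([f.key for f in export])    # ['grantor', 'parish', 'parcel']
print([f.key for f in escalate])  # ['consideration']
```

With this assumed threshold, only the uncertain consideration amount would reach a reviewer, which matches the "review" flag on that field in the example.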

How it shows its work

Two readings. One value.

field · passenger.surname · ship's manifest, 1907

text track · OCR/HTR: surname = "Kowalczyk" · est 0.68

vision track · VLM: surname = "Kowalczyk" · est 0.71

machine-resolved

surname = “Kowalczyk” · est 0.95

cited · manifest p.3 · passenger roll 1907-A

Neither track is trusted alone.

est = alma's estimate — only a human sign-off turns it amber + ✓
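
The page does not say how two track scores become one resolved score. One simple possibility, shown purely as an assumption, is a noisy-OR combination when the reads agree, with disagreeing reads sent to a reviewer. The function name `reconcile` and its return shape are hypothetical.

```python
def reconcile(text_read, vision_read):
    """Combine two independent (value, est) reads into one result.

    Assumption: agreeing reads are merged with a noisy-OR rule,
    1 - (1 - a) * (1 - b); disagreeing reads are escalated.
    This is an illustration, not alma's documented scoring.
    """
    (text_value, text_est), (vision_value, vision_est) = text_read, vision_read
    if text_value != vision_value:
        return {"status": "escalated", "candidates": [text_value, vision_value]}
    combined = 1 - (1 - text_est) * (1 - vision_est)
    return {"status": "machine-resolved", "value": text_value, "est": round(combined, 2)}


agree = reconcile(("Kowalczyk", 0.68), ("Kowalczyk", 0.71))
print(agree)  # {'status': 'machine-resolved', 'value': 'Kowalczyk', 'est': 0.91}

differ = reconcile(("Ezekiel A. Holder", 0.71), ("Ezekiel Braithwaite", 0.69))
print(differ["status"])  # escalated
```

For the manifest pair this rule yields 0.91 rather than the 0.95 shown in the demo, so treat it as a way to see why two agreeing reads can outrank either read alone, not as the actual formula.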

What's under the hood

Built to read what OCR gives up on.

table · map · stamp · sign · hand · redact

Any artifact, not just clean type

Handwriting, tables, maps, stamps, signatures, redactions — the things plain OCR drops.

birth.date = "1923-04-07" · est 0.93

A score on every field

Each value carries an estimate and a citation to the source pixels.

text 0.68

vision 0.71

Two tracks, cross-checked

A text track and a vision track — never one model's lone guess.

“Kraków”

gazetteer

matched · place index

Grounded in your vocabulary

Readings checked against your gazetteers, indexes and code lists.
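
As a rough illustration of checking a reading against a place index, the sketch below uses fuzzy string matching from Python's standard `difflib` module. The gazetteer contents, the `ground` function and the 0.8 cut-off are assumptions made for the example; the page does not describe how alma performs the match.

```python
import difflib

# Hypothetical place index; in practice this would be the customer's own gazetteer.
GAZETTEER = ["Kraków", "Katowice", "Kielce", "Gdańsk"]


def ground(reading, vocabulary, cutoff=0.8):
    """Return the closest vocabulary entry for a reading, or None if nothing is close."""
    matches = difflib.get_close_matches(reading, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(ground("Krakow", GAZETTEER))  # Kraków
print(ground("Lwów", GAZETTEER))    # None
```

A reading that misses a diacritic still resolves to the indexed spelling, while a place absent from the index returns no match instead of a forced one.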

97% auto · 3% → human

Only the residue reaches a person

Reviewers see the doubtful 3% — candidates and evidence pre-assembled.

csv · xlsx · xml · json · API · MCP

Export anywhere, access anywhere

Excel, CSV, XML, JSON — plus an enterprise API and MCP, with RBAC on every key.

Any document, any century

Digitize ships' manifests

Centuries of formats, one process — built for the archive, not the demo.

Ship's manifest: surname = "Kowalczyk" · est 0.71
Estate ledger: amount = "£240 14s" · est 0.88
Birth register: born = "1923-04-07" · est 0.93
Cadastral map: parcel.id = "№1843/214" · est 0.90
Census return: occupation = "shipwright" · est 0.74
Medical intake: blood_type = "O+" · est 0.82
Conveyance deed: grantor = "Holder" · est 0.62

text track · vision track · sent to a reviewer

The product, not a pitch

See it in the app.

Real screens from the verifying app — every value scored, evidenced, and yours to check.

01 / 04

The Verifying Room

Reviewers see the scan and every reading side by side.

scan ↔ readings, locked together
confirm, edit or reject in a keystroke
every field scored as you go

alma.intergen.app/review

verifying room

02 / 04

The review docket

Only the doubtful residue reaches a person.

sorted by what needs you most
deeds, manifests, ledgers, forms
the easy 97% never lands here

alma.intergen.app/queue

review docket

03 / 04

Knowledge grounding

Every reading checked against your own vocabulary.

names, places, parcels, codes
matches shown inline with each read
your corrections compound over time

alma.intergen.app/knowledge

knowledge layer

04 / 04

Per-field confidence

Each value carries a score, evidence, and a reason.

value · score · evidence · reason
export csv · json · xml · xlsx
RBAC scoped on every key

alma.intergen.app/console

dev console

alma's developer console: per-field results where each value is returned with a confidence score, an evidence citation, and a reason.
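
The value · score · evidence · reason shape maps naturally onto a flat record. The sketch below models one such record and writes it as CSV using only the Python standard library. The `FieldResult` class, the column names and the sample reason text are illustrative assumptions, not alma's export schema.

```python
import csv
import io
from dataclasses import asdict, dataclass


@dataclass
class FieldResult:
    key: str
    value: str
    score: float
    evidence: str  # citation to the source, e.g. a page reference
    reason: str    # free-text justification (sample text below is made up)


def to_csv(results):
    """Serialise field results to CSV text with one row per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["key", "value", "score", "evidence", "reason"]
    )
    writer.writeheader()
    for result in results:
        writer.writerow(asdict(result))
    return buffer.getvalue()


rows = [
    FieldResult("passenger.surname", "Kowalczyk", 0.91,
                "manifest p.3", "text and vision reads agree"),
]
print(to_csv(rows))
```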

A model you own.

Every correction your reviewers make is captured as training data. alma fine-tunes and distills a model specific to your archive — one you own, host how you choose, and call via API.

The Flywheel

Each lap, fewer fields need a human — and the model gets cheaper to run.

reads better ↺

alma reads · machine estimate

you correct · a human confirms

corrections → training data · every fix captured

distilled model · yours to own

occupation = "shipwright" · est 0.71

Cycle 1 · 71% auto-confirmed · −38% / page

rising est = the machine reading better; only ✓ means a human confirmed it.
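
To make "every fix captured" concrete, the sketch below turns reviewer decisions into JSON Lines training records. The record layout, the `to_training_example` function and the second evidence string are assumptions made for illustration; the page does not describe alma's training data format.

```python
import json


def to_training_example(field_key, machine_value, confirmed_value, evidence):
    """Turn one reviewer decision into a supervised training record."""
    return {
        "field": field_key,
        "evidence": evidence,         # pointer to the source region
        "prediction": machine_value,  # what the machine read
        "label": confirmed_value,     # what the human confirmed
        "was_corrected": machine_value != confirmed_value,
    }


decisions = [
    # Text-track read vs. the reviewer-confirmed value from the grantor example.
    ("grantor", "Ezekiel A. Holder", "Ezekiel Braithwaite", "¶2 L3"),
    # Evidence string here is made up for illustration.
    ("occupation", "shipwright", "shipwright", "p.1 L4"),
]

jsonl = "\n".join(
    json.dumps(to_training_example(*d), ensure_ascii=False) for d in decisions
)
print(jsonl)
```

Confirmations are kept alongside corrections because a fine-tuning set generally needs examples of what the model already reads correctly as well as what it gets wrong.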

Call it from anywhere

One request. Cited fields back.

Invoke your own model over REST or MCP. Every field returns scored and evidence-linked.

alma.intergen.app/settings/keys

developer console

Issue scoped keys; watch every call land.

bash

$ curl -sX POST https://api.alma.intergen.app/v1/digitize \
    -H "Authorization: Bearer $ALMA_KEY" \
    -d '{ "document": "s3://intake/manifest-1907A.tif",
          "schema": "ship_manifest.v3" }'

# 200 OK
{
  "doc_id": "doc_3f9a2c",
  "status": "unreviewed",
  "fields": [
    {
      "key": "passenger.surname",
      "value": "Kowalczyk",
      "confidence": 0.91,
      "evidence": {
        "page": 3, "line": 12,
        "bbox": [0.18, 0.42, 0.39, 0.46]
      },
      "status": "unreviewed"
    }
  ]
}

passenger.surname = "Kowalczyk" · est 0.91 · manifest p.3

est = machine score; a ✓ appears only after a reviewer accepts.

webhooks on escalation · RBAC scopes on every key
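
For callers not using curl, the same request might look like this in Python with only the standard library. The endpoint, payload and response fields are taken from the example on this page. The `Content-Type` header, the helper names `build_request` and `summarize`, and the fallback key are assumptions. The sketch builds the request without sending it and parses the sample response instead of a live one.

```python
import json
import os
import urllib.request

API_URL = "https://api.alma.intergen.app/v1/digitize"  # endpoint from the curl example


def build_request(document, schema, api_key):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"document": document, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts JSON; the curl example sets no content type.
            "Content-Type": "application/json",
        },
    )


def summarize(response):
    """Render each returned field as 'key = "value" · est N · p.N'."""
    return [
        f'{f["key"]} = "{f["value"]}" · est {f["confidence"]:.2f} · p.{f["evidence"]["page"]}'
        for f in response["fields"]
    ]


# The sample response from this page, used here instead of a live call.
sample = json.loads("""
{"doc_id": "doc_3f9a2c", "status": "unreviewed", "fields": [
  {"key": "passenger.surname", "value": "Kowalczyk", "confidence": 0.91,
   "evidence": {"page": 3, "line": 12, "bbox": [0.18, 0.42, 0.39, 0.46]},
   "status": "unreviewed"}]}
""")

request = build_request("s3://intake/manifest-1907A.tif", "ship_manifest.v3",
                        os.environ.get("ALMA_KEY", "test-key"))
print(request.get_method(), request.full_url)
print(summarize(sample))
# To send it for real: urllib.request.urlopen(request)
```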

Built for archives that matter

Enterprise control, archival discipline.

RBAC at every step
Who can read, correct, and export — enforced on every action and every key.
Evidence-linked audit
A tamper-evident trail on every field: who, when, what changed, and why.
Self-host or VPC
Open-weight models you run in your own environment. Your data and model stay yours.
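
The page does not explain how the audit trail is made tamper-evident. One common technique is a hash chain, in which each entry's hash covers both its own content and the previous entry's hash. The sketch below assumes that approach. The field names, timestamp and reason text are invented for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry body deterministically (sorted keys, UTF-8)."""
    payload = json.dumps(body, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(trail, who, when, field, old, new, why):
    """Append an audit entry whose hash covers its content and the previous hash."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    entry = {"who": who, "when": when, "field": field,
             "old": old, "new": new, "why": why, "prev": prev_hash}
    entry["hash"] = _digest(entry)  # computed before the "hash" key is added
    trail.append(entry)
    return entry


def verify(trail):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


trail = []
append_entry(trail, "reviewer", "2024-01-01T00:00:00Z", "grantor",
             "Ezekiel A. Holder", "Ezekiel Braithwaite", "vision read matches the page")
print(verify(trail))  # True

trail[0]["new"] = "someone else"  # simulate tampering
print(verify(trail))  # False
```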

In production

Barbados Land Registry · historical land records
National archive · 40k handwritten pages
Title insurer · auditor-signed exports

“40,000 pages of handwriting we'd written off as un-digitizable — alma gave us a queue of 900 to actually check.”
Head of Records — National Archive
“The confidence scores are why our auditors signed off.”
VP Operations — Title Insurer
“We own the model now. That changed the procurement conversation.”
CIO — Ministry of Lands

Stop posting documents to an API. Hand them to a process.

Bring alma your hardest archive — the faded, the handwritten, the irregular. We'll verify it, field by field, and leave you a model you own.

Hand alma a document

Book a demo →
