Build on a pipeline that verifies, not just generates.
alma turns documents into structured, evidence-linked fields you can trust. Authenticate with a Bearer API key, call the Enterprise API v1 from any language, or wire the MCP tool surface straight into your agents. Every field comes back with confidence, review status, and page/bbox citations.
Overview
The alma API is organised around documents. A documentId identifies a document_instance, one logical record extracted from an uploaded file. Every endpoint returns JSON unless you explicitly request a binary export. All requests are made over HTTPS and authenticated with a Bearer API key.
Each field carries page + bbox citations and a calibrated confidence.
Field status tells you what a reviewer accepted, edited, or escalated.
Keys read only the documents your organisation can see.
Verified corrections become a dataset and, eventually, your own model.
Authentication
Machine-to-machine calls authenticate with a Bearer API key. Tokens look like alma_… and are shown to you exactly once at creation — alma stores only a salted hash, never the raw token. Send it in the Authorization header on every request.
Authorization: Bearer alma_7Qb3kP9wY2hC8tD0vN5xM1aZ6sR4fL2gJ
Mint, list, and revoke keys from the admin console (admin role required). Each key carries a role that scopes what it can do.
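The same header can be attached from code. The following is a minimal Python sketch rather than an official client: build_request is a hypothetical helper name, and the base URL is taken from the curl examples in this document. It only constructs the request, so nothing is sent until the result is passed to urllib.request.urlopen.

```python
import json
import urllib.request

# Host taken from the curl examples in this document.
BASE_URL = "https://api.alma.intergentech.ai"


def build_request(path, api_key, body=None):
    """Build (but do not send) an authenticated request for an API path.

    build_request is a hypothetical helper, not part of any alma SDK.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies are sent with an explicit content type, as in the curl calls.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    # urllib uses POST when data is present and GET otherwise.
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

Reading the key from an environment variable such as ALMA_API_KEY, as the curl examples do, keeps the raw token out of source control. urllib.request.urlopen raises urllib.error.HTTPError on a 401, so a caller can catch that to detect a missing, malformed, or revoked token.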
$ curl https://api.alma.intergentech.ai/api/v1/models \
-H "Authorization: Bearer $ALMA_API_KEY"
# A 401 means the token is missing, malformed, or revoked.
# { "error": "unauthorized" }
Digitize a document
Resolve a document into its structured fields. Pass the documentId in the body. Each field returns the best available value — the reviewer-accepted value where present, otherwise the highest-confidence track reading — alongside its review status and evidence.
$ curl -X POST https://api.alma.intergentech.ai/api/v1/digitize \
-H "Authorization: Bearer $ALMA_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "documentId": 4012 }'
{
"documentId": 4012,
"docType": "deed_of_conveyance",
"fields": [
{
"key": "grantor_name",
"value": "Josiah A. Greaves",
"confidence": 0.973,
"status": "accepted",
"evidence": {
"page": 1,
"bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 }
}
},
{
"key": "parcel_acreage",
"value": "12.5",
"confidence": 0.61,
"status": "escalated",
"evidence": {
"page": 2,
"bbox": { "x": 0.557, "y": 0.214, "w": 0.083, "h": 0.029 }
}
}
]
}
status is one of unreviewed, accepted, edited, rejected, escalated. Errors: 400 (bad documentId), 401 (auth), 404 (not visible / not found).
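One way to act on status and confidence together is to triage fields on the client. The sketch below is an illustration under stated assumptions: needs_review is a hypothetical helper, the 0.9 threshold is an arbitrary example value, and treating rejected as "still needs attention" is an interpretation rather than documented behaviour.

```python
# Statuses where a reviewer has confirmed or corrected the value.
SETTLED = {"accepted", "edited"}
# Statuses where a reviewer has pushed back (interpretation, not documented).
FLAGGED = {"rejected", "escalated"}


def needs_review(document, min_confidence=0.9):
    """Return the keys of fields that still need human attention.

    A field is skipped once a reviewer has settled it. Otherwise it is listed
    if a reviewer flagged it or its confidence is below min_confidence.
    """
    keys = []
    for field in document["fields"]:
        status = field["status"]
        if status in SETTLED:
            continue
        if status in FLAGGED or field["confidence"] < min_confidence:
            keys.append(field["key"])
    return keys
```

Applied to the sample response above, this would list parcel_acreage (escalated) and skip grantor_name (accepted).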
Get a document
Fetch the same structured projection by id with a plain GET — handy for polling a document after upload or re-reading it later. The response shape is identical to /api/v1/digitize.
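Polling can be sketched as a small retry loop. This is an illustration only: the document does not define a readiness signal, so the sketch assumes a document is ready once it has at least one field. fetch is a hypothetical callable that returns the parsed JSON body, or None when the API answers 404.

```python
import time


def poll_document(fetch, document_id, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(document_id) until it returns a document with fields.

    Assumption: "ready" means the response has a non-empty fields list.
    Raises TimeoutError if that never happens within the given attempts.
    """
    for attempt in range(attempts):
        document = fetch(document_id)
        if document is not None and document.get("fields"):
            return document
        # Wait between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"document {document_id} not ready after {attempts} attempts")
```

Passing fetch and sleep in as parameters keeps the loop independent of any HTTP library and easy to exercise without network access.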
$ curl https://api.alma.intergentech.ai/api/v1/documents/4012 \
-H "Authorization: Bearer $ALMA_API_KEY"
{
"documentId": 4012,
"docType": "deed_of_conveyance",
"fields": [
{
"key": "grantor_name",
"value": "Josiah A. Greaves",
"confidence": 0.973,
"status": "accepted",
"evidence": { "page": 1, "bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 } }
}
]
}
Export a dataset
Download a document's accepted fields as a file. Choose the format with the query parameter; the response sets Content-Disposition so it saves with a sensible filename. Use jsonl to pull verified training pairs for the flywheel (see Bring your own model).
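In the sample CSV shown in this section, the bbox column holds a JSON object inside a quoted cell, so it needs a second decoding step after the CSV is parsed. A minimal sketch using only the Python standard library follows. read_export_csv is a hypothetical helper, and the column names are taken from the sample header row rather than from a published schema.

```python
import csv
import io
import json


def read_export_csv(text):
    """Parse CSV export text into a list of dicts with typed values.

    Column names follow the sample header in this document:
    field_key, value, confidence, page_number, bbox, source.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["confidence"] = float(row["confidence"])
        row["page_number"] = int(row["page_number"])
        # The bbox cell is itself JSON, e.g. {"x":0.142,"y":0.331,...}.
        row["bbox"] = json.loads(row["bbox"])
        rows.append(row)
    return rows
```

The value column is left as a string, since the sample shows numeric-looking values such as 12.5 alongside free text.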
$ curl -L "https://api.alma.intergentech.ai/api/v1/export/4012?format=csv" \
-H "Authorization: Bearer $ALMA_API_KEY" \
-o document-4012.csv
field_key,value,confidence,page_number,bbox,source
grantor_name,Josiah A. Greaves,0.973,1,"{""x"":0.142,""y"":0.331,""w"":0.268,""h"":0.041}",accepted
parcel_acreage,12.5,0.61,2,"{""x"":0.557,""y"":0.214,""w"":0.083,""h"":0.029}",vision
List custom models
List the fine-tuned models registered to your organisation. Each model has a stable id you can route extraction through (next section). Models you have not registered are never returned.
$ curl https://api.alma.intergentech.ai/api/v1/models \
-H "Authorization: Bearer $ALMA_API_KEY"
{
"models": [
{
"id": "mdl_barbados_deeds_v3",
"name": "Barbados deeds — handwriting v3",
"baseModel": "alma-htr-base",
"status": "ready",
"trainedExamples": 4820,
"createdAt": "2026-05-14T09:21:00Z"
},
{
"id": "mdl_survey_plans_v1",
"name": "Survey plans — tables v1",
"baseModel": "alma-vlm-base",
"status": "training",
"trainedExamples": 1190,
"createdAt": "2026-06-20T16:02:00Z"
}
]
}
Digitize with your model
Run the full pipeline on a document but route the LLM extraction step through one of your fine-tuned models. The response shape matches /api/v1/digitize — same fields, same evidence — extracted by a model tuned on your verified corrections. Only ready models can be invoked.
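Because only ready models can be invoked, a client may want to filter the model listing before building the call. The sketch below assumes the /api/v1/models response shape shown earlier. ready_model_ids and model_digitize_path are hypothetical helper names.

```python
from urllib.parse import quote


def ready_model_ids(listing):
    """Return the ids of models whose status is "ready", in listing order."""
    return [model["id"] for model in listing["models"] if model["status"] == "ready"]


def model_digitize_path(model_id):
    """Build the path for digitizing through a specific model."""
    # quote() guards against ids containing characters that are unsafe in a path.
    return f"/api/v1/models/{quote(model_id, safe='')}/digitize"
```

Checking status on the client avoids sending a request for a model that is still training. How the server responds to such a request is not stated here.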
$ curl -X POST https://api.alma.intergentech.ai/api/v1/models/mdl_barbados_deeds_v3/digitize \
-H "Authorization: Bearer $ALMA_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "documentId": 4012 }'
{
"documentId": 4012,
"docType": "deed_of_conveyance",
"model": "mdl_barbados_deeds_v3",
"fields": [
{
"key": "grantor_name",
"value": "Josiah A. Greaves",
"confidence": 0.991,
"status": "unreviewed",
"evidence": { "page": 1, "bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 } }
}
]
}
MCP tool surface
alma speaks the Model Context Protocol over a JSON-RPC 2.0 HTTP endpoint (Streamable-HTTP transport, POST only). Point an MCP-capable agent at it to give the model first-class tools for reading your archive. Open the session with initialize, discover tools with tools/list, and invoke them with tools/call.
Browse visible document instances
Structured fields for one document
json · csv · xlsx (base64)
$ curl -X POST https://api.alma.intergentech.ai/api/mcp \
-H "Authorization: Bearer $ALMA_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "get_document_fields",
"arguments": { "documentId": 4012 }
}
}'
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"content": [
{
"type": "text",
"text": "{ \"documentId\": 4012, \"docType\": \"deed_of_conveyance\", \"fields\": [ ... ] }"
}
]
}
}
The MCP server reports protocol version 2025-06-18 and supports initialize, ping, tools/list, and tools/call.
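For clients that do not use an MCP library, the JSON-RPC messages can be built by hand. The sketch below is a rough illustration, not a full MCP client. The helper names are hypothetical, the initialize parameters follow the general MCP specification rather than anything alma-specific, and decoding the text content as JSON is an assumption based on the sample response above.

```python
import json

# Protocol version the server reports, per the text above.
PROTOCOL_VERSION = "2025-06-18"


def rpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def handshake(client_name, client_version):
    """Return the initialize and tools/list messages, in sending order."""
    initialize = rpc_request(1, "initialize", {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        "clientInfo": {"name": client_name, "version": client_version},
    })
    return [initialize, rpc_request(2, "tools/list")]


def tool_call(request_id, name, arguments):
    """Build a tools/call request for the named tool."""
    return rpc_request(request_id, "tools/call", {"name": name, "arguments": arguments})


def decode_tool_result(response):
    """Return the first text content item of a tools/call response, parsed as JSON."""
    if "error" in response:
        err = response["error"]
        raise RuntimeError(f"MCP error {err.get('code')}: {err.get('message')}")
    for item in response["result"]["content"]:
        if item.get("type") == "text":
            return json.loads(item["text"])
    raise ValueError("tool result contained no text content")
```

Each message would be sent as the body of a POST to the MCP endpoint, with the same Bearer header used for the REST calls.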
Bring your own model
Every correction a reviewer makes is a labelled example you own. alma turns that stream of verified fields into a model tuned to your documents — so accuracy compounds with use.
1. Verify
Reviewers accept or edit fields in the workspace. Each verified field becomes a (page region → correct value) pair.
2. Export the dataset
Pull verified pairs with /api/v1/export/{documentId}?format=jsonl. Each pair carries the image region, bbox, field key, doc type, and the human-verified text.
3. Register a fine-tuned model
Train on your dataset and register the result. It appears in /api/v1/models with a stable id once status is ready.
4. Call your model
Route extraction through POST /api/v1/models/{id}/digitize. The output is the same evidence-linked shape, now read by a model that knows your archive.
$ curl -L "https://api.alma.intergentech.ai/api/v1/export/4012?format=jsonl" \
-H "Authorization: Bearer $ALMA_API_KEY" \
-o deeds.jsonl
# Each line is one verified example:
# {"text":"Josiah A. Greaves","field_key":"grantor_name","doc_type":"deed_of_conveyance","page_number":1,"bbox":{"x":0.142,"y":0.331,"w":0.268,"h":0.041},"image_key":"pages/4012/p1.png","status":"accepted"}
The data, the corrections, and the resulting model are yours, and self-hostable in your own environment. alma is the process that gets you there.
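Turning the JSONL export into (page region → correct value) pairs can be sketched as follows. This assumes every line has the keys shown in the sample line above. load_examples and to_training_pair are hypothetical helpers, and the tuple layout for a region is an illustrative choice, not a format required by any training tool.

```python
import json


def load_examples(jsonl_text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    examples = []
    # Split on "\n" only, so unusual Unicode line separators inside a JSON
    # string cannot break a record apart.
    for line in jsonl_text.split("\n"):
        if line.strip():
            examples.append(json.loads(line))
    return examples


def to_training_pair(example):
    """Map one verified example to a (region, text) pair.

    The region is (image_key, x, y, w, h), with bbox values as exported.
    """
    bbox = example["bbox"]
    region = (example["image_key"], bbox["x"], bbox["y"], bbox["w"], bbox["h"])
    return region, example["text"]
```

A training script could then iterate over load_examples on the downloaded file's contents and feed each pair to whatever fine-tuning pipeline is in use.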