Developer portal

Build on a pipeline that verifies, not just generates.

alma turns documents into structured, evidence-linked fields you can trust. Authenticate with a Bearer API key, call the Enterprise API v1 from any language, or wire the MCP tool surface straight into your agents. Every field comes back with confidence, review status, and page/bbox citations.

Start here

Overview

The alma API is organised around documents. A documentId identifies a document_instance: one logical record extracted from an uploaded file. Every endpoint returns JSON unless you explicitly request a binary export. All requests are made over HTTPS and authenticated with a Bearer API key.

Evidence-linked

Each field carries page + bbox citations and a calibrated confidence.

Human-in-the-loop

Field status tells you what a reviewer accepted, edited, or escalated.

RLS everywhere

Keys read only the documents your organisation can see.

Yours to own

Verified corrections become a dataset and, eventually, your own model.

Security

Authentication

Machine-to-machine calls authenticate with a Bearer API key. Tokens look like alma_… and are shown to you exactly once at creation — alma stores only a salted hash, never the raw token. Send it in the Authorization header on every request.

Authorization header·http
Authorization: Bearer alma_7Qb3kP9wY2hC8tD0vN5xM1aZ6sR4fL2gJ

Mint, list, and revoke keys from the admin console (admin role required). Each key carries a role that scopes what it can do.

Quick check·bash
$ curl https://api.alma.intergentech.ai/api/v1/models \
    -H "Authorization: Bearer $ALMA_API_KEY"

# A 401 means the token is missing, malformed, or revoked.
# { "error": "unauthorized" }
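
The same header can be attached from any HTTP client. As a minimal sketch using only Python's standard library: the helper name build_request is hypothetical, the base URL is copied from the curl examples, and nothing is sent until you pass the request to urllib.request.urlopen, which raises HTTPError on a 401.

```python
import json
import urllib.request

# Host taken from the curl examples on this page.
BASE_URL = "https://api.alma.intergentech.ai"


def build_request(path, api_key, payload=None):
    """Return a urllib Request carrying the Bearer token (hypothetical helper).

    Nothing is sent here; pass the result to urllib.request.urlopen to call the API.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    # urllib issues a POST when a body is present and a GET otherwise.
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

Read the key from an environment variable such as ALMA_API_KEY rather than hard-coding it, as the curl examples do.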

Enterprise API v1

Digitize a document

POST /api/v1/digitize

Resolve a document into its structured fields. Pass the documentId in the body. Each field returns the best available value — the reviewer-accepted value where present, otherwise the highest-confidence track reading — alongside its review status and evidence.

Request·bash
$ curl -X POST https://api.alma.intergentech.ai/api/v1/digitize \
    -H "Authorization: Bearer $ALMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{ "documentId": 4012 }'
200 OK·json
{
  "documentId": 4012,
  "docType": "deed_of_conveyance",
  "fields": [
    {
      "key": "grantor_name",
      "value": "Josiah A. Greaves",
      "confidence": 0.973,
      "status": "accepted",
      "evidence": {
        "page": 1,
        "bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 }
      }
    },
    {
      "key": "parcel_acreage",
      "value": "12.5",
      "confidence": 0.61,
      "status": "escalated",
      "evidence": {
        "page": 2,
        "bbox": { "x": 0.557, "y": 0.214, "w": 0.083, "h": 0.029 }
      }
    }
  ]
}

status is one of unreviewed, accepted, edited, rejected, escalated. Errors: 400 (bad documentId), 401 (auth), 404 (not visible / not found).
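
A client will usually branch on status and confidence before trusting a value. The sketch below shows one possible policy, not alma's own: reviewer-accepted or edited fields are trusted, unreviewed fields are trusted only above a threshold, and everything else is queued for attention. The function name and the 0.9 threshold are assumptions made for illustration.

```python
# Statuses that mean a human has signed off on the value.
REVIEWED = {"accepted", "edited"}


def split_for_review(document, min_confidence=0.9):
    """Partition field keys into (trusted, needs_attention).

    The policy and the 0.9 default are illustrative assumptions.
    """
    trusted, needs_attention = [], []
    for field in document["fields"]:
        reviewed = field["status"] in REVIEWED
        confident = (
            field["status"] == "unreviewed"
            and field["confidence"] >= min_confidence
        )
        target = trusted if reviewed or confident else needs_attention
        target.append(field["key"])
    return trusted, needs_attention
```

Applied to the response above, grantor_name lands in the trusted list and parcel_acreage (escalated, confidence 0.61) in the attention list.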

Enterprise API v1

Get a document

GET /api/v1/documents/{id}

Fetch the same structured projection by id with a plain GET — handy for polling a document after upload or re-reading it later. The response shape is identical to /api/v1/digitize.

Request·bash
$ curl https://api.alma.intergentech.ai/api/v1/documents/4012 \
    -H "Authorization: Bearer $ALMA_API_KEY"
200 OK·json
{
  "documentId": 4012,
  "docType": "deed_of_conveyance",
  "fields": [
    {
      "key": "grantor_name",
      "value": "Josiah A. Greaves",
      "confidence": 0.973,
      "status": "accepted",
      "evidence": { "page": 1, "bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 } }
    }
  ]
}
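
To draw a citation on a page image you need pixel coordinates. The sketch below assumes the bbox values are fractions of the page width and height measured from the top-left corner. The sample values are consistent with that reading, but this page does not state it, so check it against your own documents first. The function name is hypothetical.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Convert a bbox to (left, top, right, bottom) in pixels.

    Assumes x, y, w, h are fractions of the page size with a top-left origin.
    """
    left = round(bbox["x"] * page_width)
    top = round(bbox["y"] * page_height)
    right = round((bbox["x"] + bbox["w"]) * page_width)
    bottom = round((bbox["y"] + bbox["h"]) * page_height)
    return left, top, right, bottom
```

The returned tuple matches the (left, top, right, bottom) box order used by common imaging libraries.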

Enterprise API v1

Export a dataset

GET /api/v1/export/{documentId}?format=csv|xlsx|xml|json|jsonl

Download a document's accepted fields as a file. Choose the format with the query parameter; the response sets Content-Disposition so it saves with a sensible filename. Use jsonl to pull verified training pairs for the flywheel (see Bring your own model).

csv     text/csv
xlsx    spreadsheet (binary)
xml     application/xml
json    application/json
jsonl   one example per line
Request·bash
$ curl -L "https://api.alma.intergentech.ai/api/v1/export/4012?format=csv" \
    -H "Authorization: Bearer $ALMA_API_KEY" \
    -o document-4012.csv
document-4012.csv·csv
field_key,value,confidence,page_number,bbox,source
grantor_name,Josiah A. Greaves,0.973,1,"{""x"":0.142,""y"":0.331,""w"":0.268,""h"":0.041}",accepted
parcel_acreage,12.5,0.61,2,"{""x"":0.557,""y"":0.214,""w"":0.083,""h"":0.029}",vision
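
The bbox column is a JSON object stored inside a quoted CSV cell, so it needs a second parse after the CSV reader has unescaped the doubled quotes. A minimal sketch, assuming the column names shown in the sample above (read_export_csv is a hypothetical helper):

```python
import csv
import io
import json

# The two sample rows from the export above.
SAMPLE = '''field_key,value,confidence,page_number,bbox,source
grantor_name,Josiah A. Greaves,0.973,1,"{""x"":0.142,""y"":0.331,""w"":0.268,""h"":0.041}",accepted
parcel_acreage,12.5,0.61,2,"{""x"":0.557,""y"":0.214,""w"":0.083,""h"":0.029}",vision
'''


def read_export_csv(text):
    """Parse an exported CSV into typed rows, decoding the JSON bbox cell."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "field_key": row["field_key"],
            "value": row["value"],  # left as text; the export does not type values
            "confidence": float(row["confidence"]),
            "page_number": int(row["page_number"]),
            "bbox": json.loads(row["bbox"]),
            "source": row["source"],
        })
    return rows
```

Calling read_export_csv(SAMPLE) yields two rows whose bbox entries are ordinary dictionaries.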

Custom models

List custom models

GET /api/v1/models

List the fine-tuned models registered to your organisation. Each model has a stable id you can route extraction through (next section). Models you have not registered are never returned.

Request·bash
$ curl https://api.alma.intergentech.ai/api/v1/models \
    -H "Authorization: Bearer $ALMA_API_KEY"
200 OK·json
{
  "models": [
    {
      "id": "mdl_barbados_deeds_v3",
      "name": "Barbados deeds — handwriting v3",
      "baseModel": "alma-htr-base",
      "status": "ready",
      "trainedExamples": 4820,
      "createdAt": "2026-05-14T09:21:00Z"
    },
    {
      "id": "mdl_survey_plans_v1",
      "name": "Survey plans — tables v1",
      "baseModel": "alma-vlm-base",
      "status": "training",
      "trainedExamples": 1190,
      "createdAt": "2026-06-20T16:02:00Z"
    }
  ]
}
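
Since the listing can include models that are still training, a client may want to filter it before routing work to a model. A small sketch over the response shape shown above (the function name is hypothetical):

```python
def ready_model_ids(listing):
    """Return the ids of models whose status is "ready", in listing order."""
    return [m["id"] for m in listing["models"] if m["status"] == "ready"]
```

For the sample response this returns only mdl_barbados_deeds_v3, because mdl_survey_plans_v1 is still training.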

Custom models

Digitize with your model

POST /api/v1/models/{id}/digitize

Run the full pipeline on a document but route the LLM extraction step through one of your fine-tuned models. The response shape matches /api/v1/digitize — same fields, same evidence — extracted by a model tuned on your verified corrections. Only ready models can be invoked.

Request·bash
$ curl -X POST https://api.alma.intergentech.ai/api/v1/models/mdl_barbados_deeds_v3/digitize \
    -H "Authorization: Bearer $ALMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{ "documentId": 4012 }'
200 OK·json
{
  "documentId": 4012,
  "docType": "deed_of_conveyance",
  "model": "mdl_barbados_deeds_v3",
  "fields": [
    {
      "key": "grantor_name",
      "value": "Josiah A. Greaves",
      "confidence": 0.991,
      "status": "unreviewed",
      "evidence": { "page": 1, "bbox": { "x": 0.142, "y": 0.331, "w": 0.268, "h": 0.041 } }
    }
  ]
}
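
Because both endpoints return the same shape, one way to see what a custom model changes is to compare the two responses field by field. This is a sketch with a hypothetical helper name. It also assumes that confidences from the two runs are comparable, which this page does not state.

```python
def compare_runs(baseline, candidate):
    """Map each shared field key to (baseline_value, candidate_value, confidence_delta).

    baseline is a /api/v1/digitize response and candidate a
    /api/v1/models/{id}/digitize response for the same document.
    """
    before = {f["key"]: f for f in baseline["fields"]}
    report = {}
    for field in candidate["fields"]:
        old = before.get(field["key"])
        if old is None:
            continue  # key only present in the candidate run
        delta = round(field["confidence"] - old["confidence"], 3)
        report[field["key"]] = (old["value"], field["value"], delta)
    return report
```

A changed value or a large confidence shift on a field is a reasonable cue to send that field back to a reviewer.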

For agents

MCP tool surface

POST /api/mcp

alma speaks the Model Context Protocol over a JSON-RPC 2.0 HTTP endpoint (Streamable HTTP transport, POST only). Point an MCP-capable agent at it to give the model first-class tools for reading your archive. A session starts with initialize, discovers tools with tools/list, and invokes them with tools/call.

list_documents        Browse visible document instances
get_document_fields   Structured fields for one document
export_document       json · csv · xlsx (base64)

tools/call·bash
$ curl -X POST https://api.alma.intergentech.ai/api/mcp \
    -H "Authorization: Bearer $ALMA_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "jsonrpc": "2.0",
      "id": 1,
      "method": "tools/call",
      "params": {
        "name": "get_document_fields",
        "arguments": { "documentId": 4012 }
      }
    }'
200 OK·json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "{ \"documentId\": 4012, \"docType\": \"deed_of_conveyance\", \"fields\": [ ... ] }"
      }
    ]
  }
}

The MCP server reports protocol version 2025-06-18 and supports initialize, ping, tools/list, and tools/call.
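
If you are not using an MCP client library, the JSON-RPC messages can be assembled by hand. The sketch below builds the initialize and tools/call payloads and decodes a tool result. The initialize parameters follow the MCP specification. The helper names and client name are hypothetical, and it assumes the text content of a tool result is a JSON document, as the sample response suggests.

```python
import itertools
import json

PROTOCOL_VERSION = "2025-06-18"  # version the server reports, per the text above
_ids = itertools.count(1)        # JSON-RPC request ids: 1, 2, 3, ...


def rpc(method, params=None):
    """Build a JSON-RPC 2.0 request object with a fresh id."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def initialize(client_name="example-client", client_version="0.1.0"):
    """Build the initialize request (client name and version are placeholders)."""
    return rpc("initialize", {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        "clientInfo": {"name": client_name, "version": client_version},
    })


def call_tool(name, arguments):
    """Build a tools/call request for one of the tools listed above."""
    return rpc("tools/call", {"name": name, "arguments": arguments})


def tool_result_json(response):
    """Decode the first text content item of a tools/call response as JSON."""
    if "error" in response:
        raise RuntimeError(response["error"])
    texts = [c["text"] for c in response["result"]["content"] if c["type"] == "text"]
    return json.loads(texts[0])
```

Serialise each message with json.dumps and POST it to /api/mcp with the same Authorization header as the REST endpoints.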

The flywheel

Bring your own model

Every correction a reviewer makes is a labelled example you own. alma turns that stream of verified fields into a model tuned to your documents — so accuracy compounds with use.

  1. Verify

     Reviewers accept or edit fields in the workspace. Each verified field becomes a (page region → correct value) pair.

  2. Export the dataset

     Pull verified pairs with /api/v1/export/{documentId}?format=jsonl: image region, bbox, field key, doc type, and the human-verified text.

  3. Register a fine-tuned model

     Train on your dataset and register the result. It appears in /api/v1/models with a stable id and can be invoked once its status is ready.

  4. Call your model

     Route extraction through POST /api/v1/models/{id}/digitize. The output has the same evidence-linked shape, now read by a model that knows your archive.

Export verified training pairs·bash
$ curl -L "https://api.alma.intergentech.ai/api/v1/export/4012?format=jsonl" \
    -H "Authorization: Bearer $ALMA_API_KEY" \
    -o deeds.jsonl

# Each line is one verified example:
# {"text":"Josiah A. Greaves","field_key":"grantor_name","doc_type":"deed_of_conveyance","page_number":1,"bbox":{"x":0.142,"y":0.331,"w":0.268,"h":0.041},"image_key":"pages/4012/p1.png","status":"accepted"}
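
Before training, the exported lines need to be loaded into whatever structure your training code expects. A minimal sketch, assuming the keys shown in the sample line above (the TrainingPair class and load_pairs function are hypothetical names, not part of alma):

```python
import json
from dataclasses import dataclass


@dataclass
class TrainingPair:
    """One verified example: an image region and the text a reviewer confirmed."""
    image_key: str
    page_number: int
    bbox: dict
    field_key: str
    doc_type: str
    text: str


def load_pairs(lines):
    """Parse JSONL lines (for example an open file) into TrainingPair objects."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        pairs.append(TrainingPair(
            image_key=record["image_key"],
            page_number=record["page_number"],
            bbox=record["bbox"],
            field_key=record["field_key"],
            doc_type=record["doc_type"],
            text=record["text"],
        ))
    return pairs
```

With the file saved by the curl command above, open deeds.jsonl with encoding="utf-8" and pass the file object to load_pairs.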

The data, the corrections, and the resulting model are yours — self-hostable in your own environment. alma is the process that gets you there.