AI agents (Claude Code, custom tooling, etc.) can use kdx document commands to programmatically analyze, search, annotate, and extract structured data from complex documents. The CLI’s JSON output format and composable command design make it ideal for agent workflows.

The Agent Workflow

A typical agent workflow follows this pipeline:
info → stats → text → grep/locate → tag → data create → data set-attribute
Each step narrows focus from document-level understanding down to precise node-level annotation.
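One way an agent might drive this pipeline from Python is to pass each command as an argument list rather than a shell string, so that characters such as $ in a regex reach kdx unchanged. The sketch below assumes kdx is on the PATH and exits non-zero on failure; kdx_argv and run_kdx are hypothetical helper names, not part of the CLI.

```python
import subprocess


def kdx_argv(*args):
    """Build the argument list for `kdx document ...`; each element is passed verbatim."""
    return ["kdx", "document", *args]


def run_kdx(*args):
    """Run one command and return its stdout (assumption: a non-zero exit means failure)."""
    completed = subprocess.run(
        kdx_argv(*args), capture_output=True, text=True, check=True
    )
    return completed.stdout


# No shell is involved, so the "$" in the pattern needs no extra quoting.
locate_cmd = kdx_argv(
    "locate", "doc.kddb", "--pattern", r"\$[\d,]+\.\d{2}", "--type", "word", "--max", "10"
)
```

Calling run_kdx("info", "doc.kddb") would then return the raw output of the first pipeline step as a string.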

Step 1: Understand the Document

# Get document overview
kdx document info doc.kddb

# Get quantitative summary
kdx document stats doc.kddb
The agent learns how many pages, nodes, tags, and data objects exist before diving in.

Step 2: Read Content

# Read specific pages
kdx document text doc.kddb --pages 1:3

# Read a single page
kdx document page 5 doc.kddb
The agent reads page content to understand what the document contains and identify sections of interest.

Step 3: Search for Content

# Regex search across the document
kdx document grep "revenue|income" doc.kddb

# Multi-criteria search
kdx document find doc.kddb --contains "total" --type line --page 3
These commands return JSON with node IDs and match positions for further processing.

Step 4: Locate Nodes for Tagging

# Find nodes with match positions (single quotes keep the shell from rewriting \$ before kdx sees it)
kdx document locate doc.kddb --pattern '\$[\d,]+\.\d{2}' --type word --max 10
The locate command returns nodeId, matchStart, matchEnd, and matchText, which together give an agent what it needs for precise annotation.

Step 5: Tag Nodes

# Tag a node found by locate (single quotes stop the shell from expanding $1 inside the value)
kdx document tag doc.kddb --node-id 245 --name "invoice/amount" --value '$1,234.56'
The output includes a tagUuid that links the tag to the node for provenance tracking.

Step 6: Create Structured Data

# Create a data object
kdx document data create doc.kddb --path "INVOICE"

# Set attributes on the data object
kdx document data set-attribute doc.kddb \
  --object-id 1 --tag "total_amount" --value "1234.56" \
  --type CURRENCY --tag-uuid "a1b2c3d4-..."
The --tag-uuid flag links the attribute back to its source node in the document.

Example: Processing a Financial Document

This walkthrough shows how an agent would process a 50-page financial filing to extract key figures.

1. Assess the Document

$ kdx document info filing.kddb
{
  "uuid": "e25dab60-cbdf-499f-857e-ff9c82a19d87",
  "version": "6.0.0",
  "statistics": {
    "nodeCount": 12847,
    "pageCount": 50,
    "tagCount": 0
  }
}
The agent sees 50 pages and no existing tags, so this is a fresh document to process.
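As a rough illustration of how an agent might turn this output into a decision, the sketch below parses the info JSON shown above and derives a few planning figures. The function name summarize_info and the derived keys are hypothetical; only the field names come from the sample output.

```python
import json

# The info output from the walkthrough above.
INFO_OUTPUT = """{
  "uuid": "e25dab60-cbdf-499f-857e-ff9c82a19d87",
  "version": "6.0.0",
  "statistics": {"nodeCount": 12847, "pageCount": 50, "tagCount": 0}
}"""


def summarize_info(raw):
    """Extract planning figures from the `statistics` block of an info result."""
    stats = json.loads(raw)["statistics"]
    pages = stats["pageCount"]
    return {
        "pages": pages,
        "nodes": stats["nodeCount"],
        "untagged": stats["tagCount"] == 0,
        # Average density hints at how aggressively to limit later searches.
        "nodes_per_page": stats["nodeCount"] / max(pages, 1),
    }


summary = summarize_info(INFO_OUTPUT)
```

A high nodes_per_page value is a reasonable cue to search page by page rather than across the whole document.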

2. Find Key Sections

$ kdx document grep "Total Revenue" filing.kddb --max 5
{"nodeId":4521,"type":"line","content":"Total Revenue  $45,678,000","page":12,"matchStart":0,"matchEnd":13}
{"nodeId":8932,"type":"line","content":"Total Revenue for Fiscal Year","page":28,"matchStart":0,"matchEnd":13}
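The two hits above tell the agent which pages deserve a closer read. A minimal sketch of that step, assuming the JSONL shape shown in the grep output and the start:end form of --pages used elsewhere in this guide (the helper names are hypothetical):

```python
import json

# The grep output from the step above.
GREP_OUTPUT = """\
{"nodeId":4521,"type":"line","content":"Total Revenue  $45,678,000","page":12,"matchStart":0,"matchEnd":13}
{"nodeId":8932,"type":"line","content":"Total Revenue for Fiscal Year","page":28,"matchStart":0,"matchEnd":13}
"""


def pages_from_hits(jsonl_text):
    """Return the distinct page numbers mentioned in grep results, in ascending order."""
    hits = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return sorted({hit["page"] for hit in hits})


def page_ranges(pages):
    """Format each page as a single-page range for `text --pages`."""
    return [f"{page}:{page}" for page in pages]


ranges = page_ranges(pages_from_hits(GREP_OUTPUT))
```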

3. Read the Revenue Page

$ kdx document text filing.kddb --pages 12:12
--- Page 12 ---
CONSOLIDATED STATEMENTS OF INCOME
(In thousands)

Total Revenue  $45,678,000
Cost of Goods Sold  $28,456,000
Gross Profit  $17,222,000
...

4. Locate Specific Values

$ kdx document locate filing.kddb --pattern '\$[\d,]+' --type word --page 12
{"nodeId":4530,"type":"word","content":"$45,678,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$45,678,000"}
{"nodeId":4538,"type":"word","content":"$28,456,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$28,456,000"}
{"nodeId":4545,"type":"word","content":"$17,222,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$17,222,000"}
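Before tagging, an agent may want to confirm that the reported offsets really select the matched text. The sketch below does this for the records above, assuming matchEnd is exclusive (which is consistent with the sample: offsets 0 to 11 over an 11-character string). checked_matches is a hypothetical helper name.

```python
import json

# The locate output from the step above.
LOCATE_OUTPUT = """\
{"nodeId":4530,"type":"word","content":"$45,678,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$45,678,000"}
{"nodeId":4538,"type":"word","content":"$28,456,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$28,456,000"}
{"nodeId":4545,"type":"word","content":"$17,222,000","page":12,"matchStart":0,"matchEnd":11,"matchText":"$17,222,000"}
"""


def checked_matches(jsonl_text):
    """Return (nodeId, matchText) pairs whose offsets agree with matchText."""
    good = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: matchEnd is an exclusive offset into `content`.
        span = record["content"][record["matchStart"]:record["matchEnd"]]
        if span == record["matchText"]:
            good.append((record["nodeId"], record["matchText"]))
    return good


matches = checked_matches(LOCATE_OUTPUT)
```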

5. Tag and Create Data

# Tag the revenue node
$ kdx document tag filing.kddb --node-id 4530 --name "financials/total_revenue"
{"nodeId":4530,"tag":"financials/total_revenue","status":"tagged","tagId":1,"tagUuid":"f1a2b3c4-d5e6-7890-abcd-ef1234567890"}
# Create data object
$ kdx document data create filing.kddb --path "FINANCIALS/INCOME_STATEMENT"
{"id":1,"path":"FINANCIALS/INCOME_STATEMENT"}
# Set attribute linked to source
$ kdx document data set-attribute filing.kddb \
    --object-id 1 --tag "total_revenue" --value "45678000" \
    --type CURRENCY --tag-uuid "f1a2b3c4-d5e6-7890-abcd-ef1234567890"
{"id":1,"dataObjectId":1,"tag":"total_revenue","value":"45678000","type":"CURRENCY"}
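The walkthrough stores "45678000" although the tagged node reads "$45,678,000", so the agent has to normalize the text somewhere. One possible sketch follows; normalize_amount and set_attribute_args are hypothetical helpers, and like the walkthrough they do not apply the "(In thousands)" scale.

```python
def normalize_amount(text):
    """Strip the dollar sign and thousands separators from a plain amount."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    # Accept digits with at most one decimal point; reject anything else.
    if not cleaned.replace(".", "", 1).isdigit():
        raise ValueError(f"not a plain amount: {text!r}")
    return cleaned


def set_attribute_args(doc, object_id, tag, amount_text, tag_uuid):
    """Build the argument list for `data set-attribute` with a CURRENCY value."""
    return [
        "kdx", "document", "data", "set-attribute", doc,
        "--object-id", str(object_id),
        "--tag", tag,
        "--value", normalize_amount(amount_text),
        "--type", "CURRENCY",
        "--tag-uuid", tag_uuid,
    ]


args = set_attribute_args(
    "filing.kddb", 1, "total_revenue", "$45,678,000",
    "f1a2b3c4-d5e6-7890-abcd-ef1234567890",
)
```

Raising on unexpected input (negative amounts in parentheses, other currencies) is a deliberate choice here: a write command with a silently mangled value is harder to undo than a skipped one.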

Output Format

Commands that return records produce JSON Lines (JSONL) by default, one JSON object per line; text, as the walkthrough above shows, prints plain page text instead. JSONL streams well and is easy for agents to parse line by line:
{"nodeId":100,"type":"word","content":"Revenue","page":1}
{"nodeId":101,"type":"word","content":"$1,234","page":1}
Use --pretty for human-readable debugging:
{
  "nodeId": 100,
  "type": "word",
  "content": "Revenue",
  "page": 1
}
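An agent that may receive either form can parse both with one routine. The sketch below uses the standard library's JSONDecoder.raw_decode to read consecutive JSON objects regardless of line breaks; it assumes that --pretty output with several records prints the objects one after another, which this guide does not confirm. iter_json_objects is a hypothetical helper name.

```python
import json

COMPACT = (
    '{"nodeId":100,"type":"word","content":"Revenue","page":1}\n'
    '{"nodeId":101,"type":"word","content":"$1,234","page":1}\n'
)
PRETTY = '{\n  "nodeId": 100,\n  "type": "word",\n  "content": "Revenue",\n  "page": 1\n}\n'


def iter_json_objects(text):
    """Yield each JSON value found in text, whether compact or pretty-printed."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


compact_records = list(iter_json_objects(COMPACT))
pretty_records = list(iter_json_objects(PRETTY))
```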

Best Practices for Agent Developers

Limit Results

Always use --max to prevent overwhelming output on large documents:
kdx document locate doc.kddb --pattern ".*" --max 20

Focus by Page

Use --page to work on one page at a time instead of the entire document:
kdx document locate doc.kddb --pattern "amount" --page 5
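Both limits can be applied together in a loop over pages. This is a sketch only: it assumes --page and --max may be combined in one locate call, and it takes the command runner as a parameter (run, a hypothetical callable that accepts an argument list and returns stdout text) so the logic can be exercised without the CLI.

```python
import json


def locate_by_page(run, doc, pattern, pages, max_hits=20):
    """Call locate once per page, capping each call with --max; return parsed records."""
    records = []
    for page in pages:
        argv = [
            "kdx", "document", "locate", doc,
            "--pattern", pattern,
            "--page", str(page),
            "--max", str(max_hits),
        ]
        output = run(argv)
        records.extend(
            json.loads(line) for line in output.splitlines() if line.strip()
        )
    return records
```

Passing a small stub as run makes it easy to check which flags the agent would send before it touches a real document.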

Chain Commands

The intended workflow chains outputs from one command into the next:
  1. locate returns nodeId → use with tag --node-id
  2. tag returns tagUuid → use with data set-attribute --tag-uuid
  3. data create returns id → use with data set-attribute --object-id
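The three hand-offs listed above could be wired together roughly as follows. This is a sketch under assumptions: each write command prints a single JSON object shaped like the walkthrough outputs, and run is a hypothetical callable that takes an argument list and returns stdout text. tag_and_record is likewise an illustrative name.

```python
import json


def tag_and_record(run, doc, node_id, tag_name, data_path, attribute, value, value_type):
    """Tag a node, create a data object, then set an attribute linked to the tag."""
    # 1. tag returns tagUuid
    tag = json.loads(run([
        "kdx", "document", "tag", doc,
        "--node-id", str(node_id), "--name", tag_name,
    ]))
    # 2. data create returns id
    obj = json.loads(run([
        "kdx", "document", "data", "create", doc, "--path", data_path,
    ]))
    # 3. set-attribute consumes both values
    return json.loads(run([
        "kdx", "document", "data", "set-attribute", doc,
        "--object-id", str(obj["id"]),
        "--tag", attribute,
        "--value", value,
        "--type", value_type,
        "--tag-uuid", tag["tagUuid"],
    ]))
```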

Use Node Type Filters

Filter by node type to get the right granularity:
  • --type word for individual tokens (amounts, dates, names)
  • --type line for full lines of text
  • --type paragraph for paragraph-level content

Verify Before Writing

Use read-only commands (info, stats, text, grep, find, locate, node) to understand the document before running write commands (tag, data create, data set-attribute).
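An agent framework can enforce this separation with a small guard. The sketch below classifies a command as a write if it starts with one of the three write commands named in this guide and refuses to pass it on unless writes are explicitly allowed; is_write and guard are hypothetical names, and the argument list is the part that follows kdx document.

```python
# The write commands named in this guide; everything else is treated as read-only.
WRITE_COMMANDS = {("tag",), ("data", "create"), ("data", "set-attribute")}


def is_write(argv):
    """Return True if argv (the part after `kdx document`) starts with a write command."""
    return tuple(argv[:1]) in WRITE_COMMANDS or tuple(argv[:2]) in WRITE_COMMANDS


def guard(argv, allow_writes=False):
    """Return argv unchanged, or raise if it would modify the document without permission."""
    if is_write(argv) and not allow_writes:
        raise PermissionError(f"write command blocked: {' '.join(argv[:2])}")
    return argv
```

Matching on whole tokens matters here: the read-only tags command must not be mistaken for tag.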

Inspect Nodes Before Tagging

Use node to verify a node’s content and context before tagging:
kdx document node 245 doc.kddb --tags --children

Command Reference

| Command | Mode | Purpose |
| --- | --- | --- |
| info | Read | Document summary |
| stats | Read | Detailed statistics |
| text | Read | Page text extraction |
| grep | Read | Regex content search |
| find | Read | Multi-criteria search |
| locate | Read | Node discovery with match positions |
| node | Read | Single node inspection |
| tags | Read | List all tags |
| tag | Write | Annotate a node |
| data create | Write | Create data object |
| data set-attribute | Write | Set attribute on data object |
| audit | Read | Revision history |