The kdx document command provides a comprehensive set of tools for inspecting, searching, annotating, and extracting structured data from local Kodexa documents (KDDB files). These commands work offline without requiring a connection to the Kodexa platform.
What are KDDB Files?
KDDB (Kodexa Document Database) files are SQLite-based document containers that store:
- Document structure - Hierarchical content nodes with types and content
- Metadata - Document properties like UUID, version, and custom fields
- Tags - Annotations on content nodes for extraction workflows
- Data objects - Structured extracted data with attributes
- Native files - Embedded binary files (PDFs, images, etc.)
- Audit trail - Full revision history of data changes
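Because KDDB files are SQLite-based, a generic SQLite client can in principle open one for a quick look. The following is a minimal sketch using Python's standard sqlite3 module. It only lists whatever tables the file contains and assumes nothing about the KDDB schema; the helper name list_tables is hypothetical and is not part of kdx.

```python
import sqlite3
import sys
from pathlib import Path


def list_tables(path):
    """Return the names of all tables in the SQLite file at `path`, sorted."""
    # Open the file read-only through a URI so inspection cannot modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the path to a KDDB file as the first command-line argument.
    for table in list_tables(sys.argv[1]):
        print(table)
```

For everyday work the kdx document commands remain the intended interface; direct SQLite access is shown here only to illustrate the container format.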
Reading & Analysis
Commands for understanding document content without modifying anything.
Info
Document summary with metadata and statistics
Stats
Detailed statistics and node type breakdown
Text
Extract readable text with page markers
Grep & Lines
Search content with regex and retrieve lines
Find
Multi-criteria search (text, type, page, region)
Locate
Find nodes with match positions for tagging
Node
Inspect a single node by ID
Structure & Metadata
Commands for inspecting document structure, annotations, and spatial layout.
Print & Select
Tree view and selector queries
Tags
List all tags with counts
Audit
Revision history and change tracking
Spatial
Bounding box queries and region search
Native Files
List and extract embedded files
External Data
Manage external data key-value store
Metadata
View and modify document metadata
Data & Tagging
Commands that modify the document by adding tags, creating data objects, and setting attributes.
Agentic Workflows
Agentic CLI Use
How AI agents use kdx document commands to analyze and annotate documents programmatically. Includes end-to-end workflow examples and best practices.
Quick Start
Explore a Document
Search and Annotate
Scripting with JSON
Output Formats
All document commands support the global -o flag for selecting the output format.
The grep, find, and locate commands output JSON Lines (JSONL) by default for streaming. Use --pretty for readable output.
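JSONL output carries one JSON object per line, so a script can handle each result as it arrives instead of waiting for the whole stream. A minimal Python sketch of such a consumer follows. The helper name iter_jsonl and the field names in the sample are hypothetical; the actual fields emitted by kdx may differ.

```python
import json


def iter_jsonl(lines):
    """Yield one parsed object per non-blank line from an iterable of strings."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Hypothetical sample records standing in for real command output.
sample = [
    '{"id": 1, "text": "Invoice"}\n',
    '{"id": 2, "text": "Total"}\n',
]
for record in iter_jsonl(sample):
    print(record["id"], record["text"])
```

Passing sys.stdin instead of the sample list lets the same function process piped command output line by line.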