Early Access: GitOps sync functionality and the metadata structure may evolve. We recommend testing sync operations in non-production environments first.
Overview
The kdx sync command enables GitOps workflows for Kodexa platform metadata. Store your organization and project configurations in version control, review changes through pull requests, and promote configurations across environments with confidence.
Key Benefits
Version Control - Track all metadata changes in Git with full history and audit trails
Code Review - Review infrastructure changes through pull requests before deployment
Multi-Environment - Promote tested configurations from dev → staging → production
Offline Validation - Validate changes locally before pushing to remote environments
Repository Layout
Metadata is organized under a kodexa-resources/ root directory with a hierarchical structure:
project-root/
├── sync-config.yaml              # Configuration: environments, targets, branch mappings
├── manifests/
│   ├── shared.yaml               # Shared resources across divisions
│   ├── division-a.yaml           # Division-specific resources
│   └── division-b.yaml
└── kodexa-resources/
    ├── data-definitions/
    │   ├── invoice-taxonomy.yaml
    │   └── contract-taxonomy.yaml
    ├── data-stores/
    │   ├── document-store.yaml
    │   └── vector-store.yaml
    ├── knowledge-item-types/
    │   └── conditional-knowledge.yaml
    ├── knowledge-feature-types/
    │   └── text-extraction.yaml
    ├── knowledge-feature-instances/
    │   ├── ocr-extraction-prod.yaml
    │   └── fast-extraction-dev.yaml
    ├── modules/
    │   └── classifier-v3/
    │       └── module.yml
    ├── prompt-templates/
    │   └── invoice-analysis.yaml
    └── projects/
        └── finance-automation/
            ├── project.yaml
            ├── knowledge-items/
            │   └── tax-calculation.yaml
            └── assistants/
                └── invoice-assistant.yaml
Organization-Level Resources
These resources are shared across multiple projects:
Data Definitions - Classification hierarchies and metadata schemas (formerly taxonomies)
Data Stores - Document, vector, or graph storage configurations
Knowledge Item Types - Templates for rule definitions
Knowledge Feature Types - Generic capability definitions
Knowledge Feature Instances - Environment-specific deployments of features
Modules - ML models and custom modules for classification or extraction
Prompt Templates - Reusable prompts for AI assistants
Project-Level Resources
Project-specific configurations:
Project metadata - References to shared resources and environment mappings
Knowledge Items - Project-specific knowledge implementations
Assistants - AI assistant configurations
Configuration Architecture
The new sync system uses a three-tier architecture:
1. Environments (WHERE to deploy)
Environments define deployment target locations with authentication:
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY
  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
API keys are never stored in configuration files. They’re referenced via environment variables for security.
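As a minimal sketch of that indirection (not the CLI's actual implementation), the shell function below takes the variable name stored in api_key_env and checks that a non-empty variable of that name exists in the current shell. The function name require_api_key is hypothetical.

```shell
# Hypothetical helper: succeed only if the variable whose NAME is passed as $1
# (for example the value of api_key_env) is set and non-empty in this shell.
require_api_key() {
  local var_name key_value
  var_name="$1"
  # Accept only valid shell identifiers so the eval below is safe.
  case "$var_name" in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*)
      echo "Invalid variable name: ${var_name}" >&2
      return 2 ;;
  esac
  # Indirect lookup: read the value of the variable named by var_name.
  eval "key_value=\${${var_name}:-}"
  if [ -z "$key_value" ]; then
    echo "API key environment variable not set: ${var_name}" >&2
    return 1
  fi
}
```

A pre-flight check could then look like: require_api_key KODEXA_DEV_API_KEY && kdx sync deploy --dry-run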
2. Targets (WHAT to deploy)
Targets bundle an organization with its manifests:
targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml
  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml      # Can reuse shared resources
      - manifests/division-a.yaml  # Plus division-specific
3. Branch Mappings (WHEN to deploy)
Branch mappings automate deployment based on git branches:
branch_mappings:
  # Production: deploy all targets
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod
  # Staging: deploy for testing
  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging
  # Development: division-specific
  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev
Configuration File
sync-config.yaml
Complete configuration example combining all three tiers:
schema_version: "2.0"
# WHERE: Define deployment environments
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY
  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
# WHAT: Define deployment targets
targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml
  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml
      - manifests/division-a.yaml
# WHEN: Define branch-based automation
branch_mappings:
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod
  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging
  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev
Configuration options:
schema_version - Config format version (currently “2.0”)
environments - Map of environment name to URL and API key environment variable
targets - Map of target name to organization slug and manifest files
branch_mappings - List of git branch patterns with deployment rules
Manifests
Manifests define which resources to deploy as part of a target. They support resource type aliases for flexibility:
Example: shared.yaml
manifest_version: "1.0"
resources:
  # Use friendly aliases
  data-definitions:
    - invoice-taxonomy
    - contract-taxonomy
  # Or canonical names
  knowledgefeaturetype:
    - document-metadata
    - text-extraction
  # Stores for shared data
  stores:
    - document-store
    - vector-store
  modules:
    # Relative to sync-config.yaml directory
    - modules/invoice-classifier/module.yml
    - modules/contract-analyzer/module.yml
Example: division-a.yaml
manifest_version: "1.0"
resources:
  # Division-specific resources
  data-definitions:
    - division-a-taxonomy
  knowledge-feature-instances:
    - division-a-ocr-prod
    - division-a-extraction-dev
  projects:
    - finance-automation
    - invoice-processing
  modules:
    - modules/division-a-parser/module.yml
Supported resource types (with common aliases):
Type                        Aliases
datadefinition              data-definition, taxonomy
store                       datastore, data-store
knowledgetype               knowledge-type, knowledge-item-type
knowledgefeaturetype        knowledge-feature-type, feature-type
knowledgefeatureinstance    knowledge-feature-instance, feature-instance
module                      cloud-module, model
prompttemplate              prompt-template
assistant                   -
project                     -
workflow                    -
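To make the alias idea concrete, here is a small sketch that maps the aliases listed above to their canonical type names. The function canonical_type is hypothetical and only mirrors the list in this document. How the CLI resolves aliases internally (including the plural keys used in the manifest examples) is not specified here.

```shell
# Hypothetical helper: print the canonical resource type for a known alias.
# Unknown names are echoed unchanged and the function returns non-zero.
canonical_type() {
  case "$1" in
    datadefinition|data-definition|taxonomy) echo "datadefinition" ;;
    store|datastore|data-store) echo "store" ;;
    knowledgetype|knowledge-type|knowledge-item-type) echo "knowledgetype" ;;
    knowledgefeaturetype|knowledge-feature-type|feature-type) echo "knowledgefeaturetype" ;;
    knowledgefeatureinstance|knowledge-feature-instance|feature-instance) echo "knowledgefeatureinstance" ;;
    module|cloud-module|model) echo "module" ;;
    prompttemplate|prompt-template) echo "prompttemplate" ;;
    assistant|project|workflow) echo "$1" ;;   # no aliases listed
    *) echo "$1"; return 1 ;;
  esac
}
```

For example, canonical_type taxonomy prints datadefinition.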
Sync Commands
Deploy (GitOps Primary Command)
The deploy command automatically determines what to deploy based on your current git branch:
# Auto-deploy based on current git branch
kdx sync deploy
# Preview without making changes
kdx sync deploy --dry-run
# Ask for confirmation before deploying
kdx sync deploy --confirm-all
# Ask before each target deployment
kdx sync deploy --confirm-each
How it works:
Detects current git branch
Matches branch against branch_mappings patterns (first match wins)
Deploys all specified targets to their environments
Reports results for each target
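The matching step above can be sketched in shell as follows. This is an illustration of "first match wins" using ordinary shell glob matching, where * also matches /. The exact glob semantics used by kdx are an assumption, and match_branch with its "pattern=label" argument format is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper: print the label of the first mapping whose glob
# pattern matches the branch. Mappings are "pattern=label", in config order.
match_branch() {
  local branch entry pattern
  branch="$1"; shift
  for entry in "$@"; do
    pattern="${entry%%=*}"
    # An unquoted variable in a case pattern is matched as a shell glob.
    case "$branch" in
      $pattern) echo "${entry#*=}"; return 0 ;;
    esac
  done
  echo "No branch mapping found for ${branch}" >&2
  return 1
}
```

Because the first match wins, order matters: putting "feature/*" before "feature/div-a/*" means the more specific pattern is never reached.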
Manual deployment (override branch detection):
# Deploy specific target to specific environment
kdx sync deploy --target shared --env prod
# Deploy multiple targets
kdx sync deploy --target division-a --env dev
kdx sync deploy --target division-b --env dev
Pull
Download resources from an environment to local files:
# Pull resources defined in a target from an environment
kdx sync pull --target shared --env dev
# Pull from different environment
kdx sync pull --target division-a --env prod
What pull does:
Reads target’s manifests to determine what resources to fetch
Connects to source environment
Downloads each resource from the API
Writes YAML files to local filesystem (default: kodexa-resources/)
Push
Upload local resources to an environment:
# Push resources to environment
kdx sync push --target shared --env prod
# Dry run (validate without pushing)
kdx sync push --target division-a --env dev --dry-run
What push does:
Reads target’s manifests to determine what to upload
Validates all resources exist locally
Connects to destination environment
Uploads each resource (creates or updates)
Reports status for each resource
Validation
All commands perform offline validation before contacting the API:
Resource files exist and contain valid YAML
Required fields are present per resource type
Schema compliance with OpenAPI specifications
Module paths are valid
Slug consistency between manifests and files
Validation failures abort the command with actionable error messages. Fix the issues and re-run.
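As a rough illustration of the first check only (resource files exist), the sketch below lists slugs that have no matching YAML file in a directory. It is not the CLI's validator. The helper name missing_resources and the assumption that each slug maps to slug.yaml in a single directory are for illustration.

```shell
# Hypothetical helper: print each slug that has no <dir>/<slug>.yaml file.
# Returns non-zero if at least one file is missing.
missing_resources() {
  local dir slug missing
  dir="$1"; shift
  missing=0
  for slug in "$@"; do
    if [ ! -f "${dir}/${slug}.yaml" ]; then
      echo "$slug"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, missing_resources kodexa-resources/data-definitions invoice-taxonomy contract-taxonomy prints nothing and succeeds when both files are present.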
Common Workflows
Initial Setup
Set up GitOps for an existing environment:
# Initialize Git repository
mkdir kodexa-infra && cd kodexa-infra
git init
# Set environment variables for authentication
export KODEXA_DEV_API_KEY="your-dev-api-key"
export KODEXA_PROD_API_KEY="your-prod-api-key"
# Create sync configuration
cat > sync-config.yaml << 'EOF'
schema_version: "2.0"
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
targets:
  my-org-dev:
    organization: my-org
    manifests:
      - manifests/main.yaml
branch_mappings:
  - pattern: "main"
    target: my-org-dev
    environment: dev
EOF
# Create manifest
mkdir -p manifests
cat > manifests/main.yaml << 'EOF'
manifest_version: "1.0"
resources:
  data-definitions:
    - invoice
    - contract
  modules:
    - invoice-processor
EOF
# Pull existing resources from dev
kdx sync pull --target my-org-dev --env dev
# Commit to version control
git add .
git commit -m "Initial metadata snapshot from dev environment"
git remote add origin git@github.com:company/kodexa-infra.git
git push -u origin main
Development Workflow
Make and test changes locally:
# Create feature branch
git checkout -b feature/add-new-taxonomy
# Pull latest metadata from dev
kdx sync pull --target my-org-dev --env dev
# Make changes to resources
vim kodexa-resources/data-definitions/new-taxonomy.yaml
# Update manifest to include new resource
vim manifests/main.yaml # Add 'new-taxonomy' to data-definitions
# Validate changes locally
kdx sync push --target my-org-dev --env dev --dry-run
# Push to dev for testing
kdx sync push --target my-org-dev --env dev
# If validation passes, commit
git add .
git commit -m "Add new document classification taxonomy"
git push origin feature/add-new-taxonomy
Promote changes across environments using branch mappings:
# Configure multi-environment setup
cat > sync-config.yaml << 'EOF'
schema_version: "2.0"
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY
  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
targets:
  my-org:
    organization: my-org
    manifests:
      - manifests/main.yaml
branch_mappings:
  - pattern: "feature/*"
    target: my-org
    environment: dev
  - pattern: "release/*"
    target: my-org
    environment: staging
  - pattern: "main"
    target: my-org
    environment: prod
EOF
# Development on feature branch
git checkout -b feature/enhancements
# ... make changes ...
kdx sync deploy # Auto-deploys to dev
# Create release branch for staging
git checkout -b release/v1.0
kdx sync deploy # Auto-deploys to staging
# After validation, merge to main
git checkout main
git merge release/v1.0
kdx sync deploy # Auto-deploys to prod
Multi-Tenant Deployment
Deploy to multiple organizations from one repository:
# sync-config.yaml
schema_version: "2.0"
environments:
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml
  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml
      - manifests/division-a.yaml
  division-b:
    organization: acme-division-b
    manifests:
      - manifests/shared.yaml
      - manifests/division-b.yaml
branch_mappings:
  - pattern: "main"
    targets:
      - target: shared
      - target: division-a
      - target: division-b
    environment: prod
# Deploy all three organizations with one command
kdx sync deploy
# Or deploy specific division
kdx sync deploy --target division-a --env prod
Pull Request Review
# Reviewer checks out branch
git checkout feature/add-new-taxonomy
# Validate locally without deploying
kdx sync deploy --dry-run
# Review changes
git diff main...feature/add-new-taxonomy
# Approve and merge if valid
Rollback Changes
# Revert to previous version
git revert HEAD
# Or reset to specific commit
git reset --hard abc1234
# Push reverted state back to environment
kdx sync deploy --env prod --target my-org
Advanced Usage
Manifest Composition
Combine multiple manifests for flexible resource grouping:
targets:
  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml        # Core resources used by all
      - manifests/common-types.yaml  # Common to division A
      - manifests/division-a.yaml    # Division A specific
This allows:
Shared resources defined once, reused across divisions
Team-specific resources isolated
Easy addition/removal of resources
Environment-Specific Branch Mappings
Use branch patterns to control deployment destinations:
branch_mappings:
  # Main branch → production only
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod
  # Release branches → staging for all
  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging
  # Feature branches → dev, but only specific division
  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev
  - pattern: "feature/div-b/*"
    target: division-b
    environment: dev
CI/CD Integration
Automate sync using GitHub Actions:
name: Deploy to Kodexa
on:
  push:
    branches: [main, 'release/*', 'feature/*']
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install KDX CLI
        run: |
          # Install via homebrew or download binary
          brew install kodexa-ai/tap/kdx
      - name: Deploy
        env:
          KODEXA_DEV_API_KEY: ${{ secrets.KODEXA_DEV_API_KEY }}
          KODEXA_STAGING_API_KEY: ${{ secrets.KODEXA_STAGING_API_KEY }}
          KODEXA_PROD_API_KEY: ${{ secrets.KODEXA_PROD_API_KEY }}
        run: |
          kdx sync deploy --confirm-all
Benefits of this approach:
Single workflow for all environments
Branch-based automatic routing
No profile setup needed
Clean secrets management via environment variables
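When debugging which branch mapping a CI run will select, it can help to print the branch name the job sees. The sketch below assumes GitHub Actions, which sets GITHUB_REF_NAME to the short ref name on push events, and falls back to git elsewhere. How kdx itself detects the branch in CI is not specified in this document, and current_branch is a hypothetical helper.

```shell
# Hypothetical helper: print the branch name, preferring GITHUB_REF_NAME
# (set by GitHub Actions) and otherwise asking git for the current branch.
current_branch() {
  if [ -n "${GITHUB_REF_NAME:-}" ]; then
    echo "$GITHUB_REF_NAME"
  else
    git rev-parse --abbrev-ref HEAD
  fi
}
```

Adding a step such as echo "Deploying from $(current_branch)" before the deploy step makes the routing decision visible in the job log.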
Resource Path Overrides
Control where resource files are stored:
# manifest.yaml
manifest_version: "1.0"
resources:
  # Standard location: kodexa-resources/data-definitions/standard.yaml
  - type: data-definition
    slug: standard
  # Custom location
  - type: data-definition
    slug: special
    path: custom-dir/special-taxonomy.yaml
  # Absolute path
  - type: store
    slug: central-store
    path: /shared/stores/central.yaml
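The three cases in the manifest above can be summarized with a small path-resolution sketch. It assumes relative overrides are resolved against the directory that contains sync-config.yaml (as the document states for module paths). That assumption, the helper name resolve_resource_path, and passing the type directory explicitly are all illustrative rather than confirmed CLI behavior.

```shell
# Hypothetical helper.
# Usage: resolve_resource_path BASE_DIR TYPE_DIR SLUG [PATH_OVERRIDE]
resolve_resource_path() {
  local base_dir type_dir slug override
  base_dir="$1"; type_dir="$2"; slug="$3"; override="${4:-}"
  case "$override" in
    '') echo "${base_dir}/kodexa-resources/${type_dir}/${slug}.yaml" ;;  # standard location
    /*) echo "$override" ;;                                              # absolute path, used as-is
    *)  echo "${base_dir}/${override}" ;;                                # relative to base_dir (assumption)
  esac
}
```

For example, resolve_resource_path . data-definitions standard prints ./kodexa-resources/data-definitions/standard.yaml.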
Troubleshooting
“Environment not found”
Ensure the environment is defined in sync-config.yaml:
# Check your configuration
cat sync-config.yaml | grep -A 3 "environments:"
# Verify environment name matches exactly (case-sensitive)
Resolution:
environments:
  dev:  # Must match --env value exactly
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY
“API key environment variable not set”
The referenced environment variable must be exported:
# Check if variable is set
echo $KODEXA_DEV_API_KEY
# Set it in your shell
export KODEXA_DEV_API_KEY="your-api-key-here"
# Or add to your shell profile (~/.bashrc or ~/.zshrc)
echo 'export KODEXA_DEV_API_KEY="your-api-key"' >> ~/.bashrc
“Target not found”
Verify target is defined in sync-config.yaml:
# Check targets section
cat sync-config.yaml | grep -A 5 "targets:"
Resolution:
targets:
  my-org-dev:  # Must match --target value exactly
    organization: my-org
    manifests:
      - manifests/main.yaml
“No branch mapping found”
Either specify target/environment manually or add a branch mapping:
# Manual override (no branch mapping needed)
kdx sync deploy --target my-org-dev --env dev
# Or add branch mapping
Resolution:
branch_mappings:
  - pattern: "feature/test"  # Exact match
    target: my-org-dev
    environment: dev
  - pattern: "feature/*"     # Wildcard pattern
    target: my-org-dev
    environment: dev
API Authentication Failed
Check your API key is correct and has proper permissions:
# Verify environment variable is set
echo $KODEXA_DEV_API_KEY
# Test API access directly
curl -H "Authorization: Bearer $KODEXA_DEV_API_KEY" \
  https://dev.kodexa.ai/api/workspaces
# Update API key if needed
export KODEXA_DEV_API_KEY="new-api-key"
Changes Not Syncing
Verify that the resource files exist, that the manifest references the correct slugs, and that validation passes:
# Check resource files exist
ls -la kodexa-resources/data-definitions/
# Verify manifest references correct slugs
cat manifests/main.yaml
# Check for validation errors
kdx sync push --target my-org --env dev --dry-run
Merge Conflicts
Resolve conflicts manually:
# Pull latest from environment
kdx sync pull --target my-org --env prod
# Resolve conflicts in YAML files
vim kodexa-resources/data-definitions/conflicted-file.yaml
# Test resolution
kdx sync push --target my-org --env prod --dry-run
# Commit resolution
git add .
git commit -m "Resolve merge conflict in taxonomy"
Best Practices
1. Use Dry Run First
Always validate before pushing:
kdx sync deploy --dry-run
# Review output
kdx sync deploy --confirm-all
2. Commit After Each Pull
Track what changed in the remote environment:
kdx sync pull --target my-org --env prod
git add .
git commit -m "Sync from prod: $(date)"
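If the pull runs on a schedule, git commit fails when nothing changed. A minimal sketch of a guard, using only standard git commands, is shown below. The helper name commit_if_changed is hypothetical.

```shell
# Hypothetical helper: stage everything and commit only if something changed.
commit_if_changed() {
  local message
  message="$1"
  git add -A
  # "git diff --cached --quiet" exits 0 when the index has no staged changes.
  if git diff --cached --quiet; then
    echo "No changes to commit"
    return 0
  fi
  git commit -q -m "$message"
}
```

It could be combined with the pull step as: kdx sync pull --target my-org --env prod && commit_if_changed "Sync from prod: $(date)"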
3. Small, Focused Changes
Make incremental changes rather than large batches:
# Good: one taxonomy at a time
git commit -m "Add invoice classification taxonomy"
# Avoid: many unrelated changes
git commit -m "Update all taxonomies, stores, and projects"
4. Test in Lower Environments
Test changes in dev/staging before production:
# Test in dev
kdx sync push --target my-org --env dev
# ... test ...
# Promote to staging
kdx sync push --target my-org --env staging
# ... test ...
# Finally production
kdx sync deploy --target my-org --env prod
5. Document Dependencies
Add comments to YAML files:
# project.yaml
slug: invoice-automation
name: Invoice Automation
# Depends on:
#   - data-definition: invoice-fields
#   - store: invoice-documents
#   - module: invoice-classifier-v2
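If dependency comments follow the "# - type: slug" convention shown above, they can be listed with a one-line sed filter, for instance during review. This is only a convention for humans. Nothing here suggests kdx reads these comments, and list_dependencies is a hypothetical helper.

```shell
# Hypothetical helper: print "type slug" for every "# - type: slug" comment
# line in the given YAML file.
list_dependencies() {
  sed -n 's/^#[[:space:]]*-[[:space:]]*\([a-z-]*\):[[:space:]]*\(.*\)$/\1 \2/p' "$1"
}
```

Running list_dependencies on the project.yaml above would print one line per dependency, such as data-definition invoice-fields.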
6. Use Meaningful Commit Messages
# Good
git commit -m "Add OCR feature instance for production environment"
# Better
git commit -m "feat: add OCR feature instance for production
- Configured with high-accuracy model
- Enabled for invoice-automation project
- Memory limit set to 4GB"
Next Steps