Early Access: GitOps sync functionality and metadata structure may evolve. We recommend testing sync operations in non-production environments first.

Overview

The kdx sync command enables GitOps workflows for Kodexa platform metadata. Store your organization and project configurations in version control, review changes through pull requests, and promote configurations across environments with confidence.

Key Benefits

  • Version Control - Track all metadata changes in Git with full history and audit trails
  • Code Review - Review infrastructure changes through pull requests before deployment
  • Multi-Environment - Promote tested configurations from dev → staging → production
  • Offline Validation - Validate changes locally before pushing to remote environments

Repository Layout

Resource metadata lives in a hierarchical kodexa-resources/ directory, next to sync-config.yaml and manifests/ at the repository root:
project-root/
├── sync-config.yaml                     # Configuration: environments, targets, branch mappings
├── manifests/
│   ├── shared.yaml                     # Shared resources across divisions
│   ├── division-a.yaml                 # Division-specific resources
│   └── division-b.yaml
└── kodexa-resources/
    ├── data-definitions/
    │   ├── invoice-taxonomy.yaml
    │   └── contract-taxonomy.yaml
    ├── data-stores/
    │   ├── document-store.yaml
    │   └── vector-store.yaml
    ├── knowledge-item-types/
    │   └── conditional-knowledge.yaml
    ├── knowledge-feature-types/
    │   └── text-extraction.yaml
    ├── knowledge-feature-instances/
    │   ├── ocr-extraction-prod.yaml
    │   └── fast-extraction-dev.yaml
    ├── modules/
    │   └── classifier-v3/
    │       └── module.yml
    ├── prompt-templates/
    │   └── invoice-analysis.yaml
    └── projects/
        └── finance-automation/
            ├── project.yaml
            ├── knowledge-items/
            │   └── tax-calculation.yaml
            └── assistants/
                └── invoice-assistant.yaml

Organization-Level Resources

These resources are shared across multiple projects:
  • Data Definitions - Classification hierarchies and metadata schemas (formerly taxonomies)
  • Data Stores - Document, vector, or graph storage configurations
  • Knowledge Item Types - Templates for rule definitions
  • Knowledge Feature Types - Generic capability definitions
  • Knowledge Feature Instances - Environment-specific deployments of features
  • Modules - ML models and custom modules for classification or extraction
  • Prompt Templates - Reusable prompts for AI assistants

Project-Level Resources

Project-specific configurations:
  • Project metadata - References to shared resources and environment mappings
  • Knowledge Items - Project-specific knowledge implementations
  • Assistants - AI assistant configurations

Configuration Architecture

The sync system uses a three-tier architecture: environments, targets, and branch mappings.

1. Environments (WHERE to deploy)

Environments define deployment target locations with authentication:
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY

  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY

  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY
API keys are never stored in configuration files. They’re referenced via environment variables for security.
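
To illustrate the indirection, here is a minimal Python sketch of how a tool could resolve an API key from an api_key_env entry. It is not the kdx implementation: the function name resolve_api_key and the error wording are assumptions made for this example.

```python
import os

# Mirrors the `environments` section above as a plain dictionary.
ENVIRONMENTS = {
    "dev": {"url": "https://dev.kodexa.ai", "api_key_env": "KODEXA_DEV_API_KEY"},
    "prod": {"url": "https://kodexa.ai", "api_key_env": "KODEXA_PROD_API_KEY"},
}


def resolve_api_key(environments, name, environ=os.environ):
    """Look up the API key for an environment (hypothetical helper, not a kdx API)."""
    if name not in environments:
        raise KeyError(f"Environment not found: {name}")
    variable = environments[name]["api_key_env"]
    key = environ.get(variable)
    if not key:
        # The config only names the variable; the secret itself must be exported.
        raise RuntimeError(f"API key environment variable not set: {variable}")
    return key


# Passing a dictionary instead of os.environ keeps the example self-contained.
print(resolve_api_key(ENVIRONMENTS, "dev", {"KODEXA_DEV_API_KEY": "example-key"}))
```

Because the file only stores the variable name, the same sync-config.yaml can be committed and shared while each machine or CI runner supplies its own secret.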

2. Targets (WHAT to deploy)

Targets bundle an organization with its manifests:
targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml

  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml      # Can reuse shared resources
      - manifests/division-a.yaml  # Plus division-specific

3. Branch Mappings (WHEN to deploy)

Branch mappings automate deployment based on the current Git branch:
branch_mappings:
  # Production: deploy all targets
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod

  # Staging: deploy for testing
  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging

  # Development: division-specific
  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev

Configuration File

sync-config.yaml

Complete configuration example combining all three tiers:
schema_version: "2.0"

# WHERE: Define deployment environments
environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY

  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY

  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY

# WHAT: Define deployment targets
targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml

  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml
      - manifests/division-a.yaml

# WHEN: Define branch-based automation
branch_mappings:
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod

  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging

  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev
Configuration options:
  • schema_version - Config format version (currently “2.0”)
  • environments - Map of environment name to URL and API key environment variable
  • targets - Map of target name to organization slug and manifest files
  • branch_mappings - List of git branch patterns with deployment rules
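
The three sections reference each other by name, so a typo in a branch mapping surfaces as a "Target not found" or "Environment not found" error. The following Python sketch shows one way such a cross-reference check could work on an already-parsed configuration. The function config_errors and its message format are assumptions for illustration, not part of kdx.

```python
def config_errors(config):
    """Return problems found in a parsed sync-config mapping (hypothetical check)."""
    errors = []
    environments = config.get("environments", {})
    targets = config.get("targets", {})
    for mapping in config.get("branch_mappings", []):
        pattern = mapping.get("pattern", "<missing pattern>")
        default_env = mapping.get("environment")
        # Normalise the single-target shorthand into the list form.
        entries = mapping.get("targets") or [{"target": mapping.get("target")}]
        for entry in entries:
            target = entry.get("target")
            env = entry.get("environment", default_env)
            if target not in targets:
                errors.append(f"{pattern}: target not found: {target}")
            if env not in environments:
                errors.append(f"{pattern}: environment not found: {env}")
    return errors


# A deliberately broken config: "staging" is referenced but never defined.
CONFIG = {
    "schema_version": "2.0",
    "environments": {
        "dev": {"url": "https://dev.kodexa.ai", "api_key_env": "KODEXA_DEV_API_KEY"},
        "prod": {"url": "https://kodexa.ai", "api_key_env": "KODEXA_PROD_API_KEY"},
    },
    "targets": {
        "shared": {"organization": "acme-shared", "manifests": ["manifests/shared.yaml"]},
    },
    "branch_mappings": [
        {"pattern": "main", "targets": [{"target": "shared", "environment": "prod"}]},
        {"pattern": "release/*", "target": "shared", "environment": "staging"},
    ],
}

print(config_errors(CONFIG))
```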

Manifests

Manifests define which resources to deploy as part of a target. They support resource type aliases for flexibility:

Example: shared.yaml

manifest_version: "1.0"

resources:
  # Use friendly aliases
  data-definitions:
    - invoice-taxonomy
    - contract-taxonomy

  # Or canonical names
  knowledgefeaturetype:
    - document-metadata
    - text-extraction

  # Stores for shared data
  stores:
    - document-store
    - vector-store

modules:
  # Relative to sync-config.yaml directory
  - modules/invoice-classifier/module.yml
  - modules/contract-analyzer/module.yml

Example: division-a.yaml

manifest_version: "1.0"

resources:
  # Division-specific resources
  data-definitions:
    - division-a-taxonomy

  knowledge-feature-instances:
    - division-a-ocr-prod
    - division-a-extraction-dev

  projects:
    - finance-automation
    - invoice-processing

modules:
  - modules/division-a-parser/module.yml
Supported resource types (with common aliases):
Type                       Aliases
datadefinition             data-definition, taxonomy
store                      datastore, data-store
knowledgetype              knowledge-type, knowledge-item-type
knowledgefeaturetype       knowledge-feature-type, feature-type
knowledgefeatureinstance   knowledge-feature-instance, feature-instance
module                     cloud-module, model
prompttemplate             prompt-template
assistant                  -
project                    -
workflow                   -
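
As a rough illustration of alias handling, the Python sketch below maps each alias listed above to its canonical type. The helper canonical_type is hypothetical. Accepting the plural keys used in the example manifests (such as data-definitions or stores) by dropping a trailing "s" is an assumption of this sketch, not documented kdx behavior.

```python
# Canonical type -> aliases, copied from the list of supported resource types.
ALIASES = {
    "datadefinition": ["data-definition", "taxonomy"],
    "store": ["datastore", "data-store"],
    "knowledgetype": ["knowledge-type", "knowledge-item-type"],
    "knowledgefeaturetype": ["knowledge-feature-type", "feature-type"],
    "knowledgefeatureinstance": ["knowledge-feature-instance", "feature-instance"],
    "module": ["cloud-module", "model"],
    "prompttemplate": ["prompt-template"],
    "assistant": [],
    "project": [],
    "workflow": [],
}

# Flatten into a single lookup table that also contains the canonical names.
LOOKUP = {
    name: canonical
    for canonical, aliases in ALIASES.items()
    for name in [canonical, *aliases]
}


def canonical_type(name):
    """Resolve a manifest key to its canonical resource type (hypothetical helper)."""
    key = name.strip().lower()
    if key in LOOKUP:
        return LOOKUP[key]
    # Assumption: plural keys map to the singular alias.
    if key.endswith("s") and key[:-1] in LOOKUP:
        return LOOKUP[key[:-1]]
    raise ValueError(f"Unknown resource type: {name}")


for key in ["taxonomy", "data-definitions", "stores", "feature-instance"]:
    print(key, "->", canonical_type(key))
```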

Sync Commands

Deploy (GitOps Primary Command)

The deploy command automatically determines what to deploy based on your current git branch:
# Auto-deploy based on current git branch
kdx sync deploy

# Preview without making changes
kdx sync deploy --dry-run

# Ask for confirmation before deploying
kdx sync deploy --confirm-all

# Ask before each target deployment
kdx sync deploy --confirm-each
How it works:
  1. Detects current git branch
  2. Matches branch against branch_mappings patterns (first match wins)
  3. Deploys all specified targets to their environments
  4. Reports results for each target
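
The "first match wins" rule in step 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the kdx implementation: it uses shell-style globbing from the standard fnmatch module (where * also matches "/"), and it treats a mapping-level environment as the default for targets that do not set their own. The function name resolve_deployments is hypothetical.

```python
from fnmatch import fnmatchcase

# Mirrors the branch_mappings example from the configuration section.
BRANCH_MAPPINGS = [
    {"pattern": "main", "targets": [
        {"target": "shared", "environment": "prod"},
        {"target": "division-a", "environment": "prod"},
    ]},
    {"pattern": "release/*", "environment": "staging", "targets": [
        {"target": "shared"},
        {"target": "division-a"},
    ]},
    {"pattern": "feature/div-a/*", "target": "division-a", "environment": "dev"},
]


def resolve_deployments(branch, branch_mappings):
    """Return (target, environment) pairs from the first mapping that matches."""
    for mapping in branch_mappings:
        if not fnmatchcase(branch, mapping["pattern"]):
            continue
        default_env = mapping.get("environment")
        if "target" in mapping:
            # Single-target shorthand: `target:` plus `environment:`.
            return [(mapping["target"], default_env)]
        # List form: each entry may override the mapping-level environment.
        return [
            (entry["target"], entry.get("environment", default_env))
            for entry in mapping.get("targets", [])
        ]
    return []  # Corresponds to the "No branch mapping found" case.


for branch in ["main", "release/v1.0", "feature/div-a/ocr", "hotfix/123"]:
    print(branch, "->", resolve_deployments(branch, BRANCH_MAPPINGS))
```

Because evaluation stops at the first matching pattern, more specific patterns should be listed before broader ones such as feature/*.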
Manual deployment (override branch detection):
# Deploy specific target to specific environment
kdx sync deploy --target shared --env prod

# Deploy multiple targets (one invocation per target)
kdx sync deploy --target division-a --env dev
kdx sync deploy --target division-b --env dev

Pull Metadata

Download resources from an environment to local files:
# Pull resources defined in a target from an environment
kdx sync pull --target shared --env dev

# Pull from different environment
kdx sync pull --target division-a --env prod
What pull does:
  1. Reads target’s manifests to determine what resources to fetch
  2. Connects to source environment
  3. Downloads each resource from the API
  4. Writes YAML files to local filesystem (default: kodexa-resources/)

Push Metadata

Upload local resources to an environment:
# Push resources to environment
kdx sync push --target shared --env prod

# Dry run (validate without pushing)
kdx sync push --target division-a --env dev --dry-run
What push does:
  1. Reads target’s manifests to determine what to upload
  2. Validates all resources exist locally
  3. Connects to destination environment
  4. Uploads each resource (creates or updates)
  5. Reports status for each resource
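
Step 4 ("creates or updates") amounts to comparing local slugs with what already exists remotely. The Python sketch below shows that decision in isolation. How kdx actually detects existing resources is not described here, so plan_push and its inputs are purely illustrative.

```python
def plan_push(local_slugs, remote_slugs):
    """Decide per resource whether a push would create or update it (hypothetical)."""
    remote = set(remote_slugs)
    return {
        slug: ("update" if slug in remote else "create")
        for slug in local_slugs
    }


# Local manifest lists two stores; only one already exists in the environment.
plan = plan_push(["document-store", "vector-store"], ["document-store"])
for slug, action in plan.items():
    print(f"{action:>6}  {slug}")
```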

Validation

All commands perform offline validation before contacting the API:
  • Resource files exist and contain valid YAML
  • Required fields are present per resource type
  • Schema compliance with OpenAPI specifications
  • Module paths are valid
  • Slug consistency between manifests and files
Validation failures abort the command with actionable error messages. Fix the issues and re-run.
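
As an illustration of the first and last checks in the list, the Python sketch below reports manifest entries that have no matching file on disk. It assumes, for simplicity, that each manifest key is also the directory name under kodexa-resources/ and that files are named <slug>.yaml. The helper missing_resource_files is hypothetical and does not parse or schema-check the YAML.

```python
import tempfile
from pathlib import Path


def missing_resource_files(resources, root):
    """List manifest entries without a <root>/<directory>/<slug>.yaml file."""
    root = Path(root)
    missing = []
    for directory, slugs in resources.items():
        for slug in slugs:
            candidate = root / directory / f"{slug}.yaml"
            if not candidate.is_file():
                missing.append(candidate.relative_to(root).as_posix())
    return missing


# Demo against a throwaway directory that contains only invoice.yaml.
with tempfile.TemporaryDirectory() as tmp:
    definitions = Path(tmp) / "data-definitions"
    definitions.mkdir()
    (definitions / "invoice.yaml").write_text("slug: invoice\n")
    demo_missing = missing_resource_files(
        {"data-definitions": ["invoice", "contract"]}, tmp
    )

print(demo_missing)  # ['data-definitions/contract.yaml']
```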

Common Workflows

Initial Setup

Set up GitOps for an existing environment:
# Initialize Git repository
mkdir kodexa-infra && cd kodexa-infra
git init

# Set environment variables for authentication
export KODEXA_DEV_API_KEY="your-dev-api-key"
export KODEXA_PROD_API_KEY="your-prod-api-key"

# Create sync configuration
cat > sync-config.yaml << 'EOF'
schema_version: "2.0"

environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY

  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY

targets:
  my-org-dev:
    organization: my-org
    manifests:
      - manifests/main.yaml

branch_mappings:
  - pattern: "main"
    target: my-org-dev
    environment: dev
EOF

# Create manifest
mkdir -p manifests
cat > manifests/main.yaml << 'EOF'
manifest_version: "1.0"

resources:
  data-definitions:
    - invoice
    - contract

  modules:
    - invoice-processor
EOF

# Pull existing resources from dev
kdx sync pull --target my-org-dev --env dev

# Commit to version control
git add .
git commit -m "Initial metadata snapshot from dev environment"
git remote add origin git@github.com:company/kodexa-infra.git
git push -u origin main

Development Workflow

Make and test changes locally:
# Create feature branch
git checkout -b feature/add-new-taxonomy

# Pull latest metadata from dev
kdx sync pull --target my-org-dev --env dev

# Make changes to resources
vim kodexa-resources/data-definitions/new-taxonomy.yaml

# Update manifest to include new resource
vim manifests/main.yaml  # Add 'new-taxonomy' to data-definitions

# Validate changes locally
kdx sync push --target my-org-dev --env dev --dry-run

# Push to dev for testing
kdx sync push --target my-org-dev --env dev

# If testing in dev succeeds, commit
git add .
git commit -m "Add new document classification taxonomy"
git push origin feature/add-new-taxonomy

Promotion Workflow (Multi-Environment)

Promote changes across environments using branch mappings:
# Configure multi-environment setup
cat > sync-config.yaml << 'EOF'
schema_version: "2.0"

environments:
  dev:
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY

  staging:
    url: https://staging.kodexa.ai
    api_key_env: KODEXA_STAGING_API_KEY

  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY

targets:
  my-org:
    organization: my-org
    manifests:
      - manifests/main.yaml

branch_mappings:
  - pattern: "feature/*"
    target: my-org
    environment: dev

  - pattern: "release/*"
    target: my-org
    environment: staging

  - pattern: "main"
    target: my-org
    environment: prod
EOF

# Development on feature branch
git checkout -b feature/enhancements
# ... make changes ...
kdx sync deploy  # Auto-deploys to dev

# Create release branch for staging
git checkout -b release/v1.0
kdx sync deploy  # Auto-deploys to staging

# After validation, merge to main
git checkout main
git merge release/v1.0
kdx sync deploy  # Auto-deploys to prod

Multi-Tenant Deployment

Deploy to multiple organizations from one repository:
# sync-config.yaml
schema_version: "2.0"

environments:
  prod:
    url: https://kodexa.ai
    api_key_env: KODEXA_PROD_API_KEY

targets:
  shared:
    organization: acme-shared
    manifests:
      - manifests/shared.yaml

  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml
      - manifests/division-a.yaml

  division-b:
    organization: acme-division-b
    manifests:
      - manifests/shared.yaml
      - manifests/division-b.yaml

branch_mappings:
  - pattern: "main"
    targets:
      - target: shared
      - target: division-a
      - target: division-b
    environment: prod
# Deploy all three organizations with one command
kdx sync deploy

# Or deploy specific division
kdx sync deploy --target division-a --env prod

Pull Request Review

# Reviewer checks out branch
git checkout feature/add-new-taxonomy

# Validate locally without deploying
kdx sync deploy --dry-run

# Review changes
git diff main...feature/add-new-taxonomy

# Approve and merge if valid

Rollback Changes

# Revert to previous version
git revert HEAD

# Or reset to specific commit
git reset --hard abc1234

# Push reverted state back to environment
kdx sync deploy --env prod --target my-org

Advanced Usage

Manifest Composition

Combine multiple manifests for flexible resource grouping:
targets:
  division-a:
    organization: acme-division-a
    manifests:
      - manifests/shared.yaml       # Core resources used by all
      - manifests/common-types.yaml # Common to division A
      - manifests/division-a.yaml   # Division A specific
This allows:
  • Shared resources defined once, reused across divisions
  • Team-specific resources isolated
  • Easy addition/removal of resources
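
One way to picture composition is as an ordered merge of each manifest's resource lists. The Python sketch below assumes that later manifests add to earlier ones and that duplicate slugs are kept once. These merge semantics and the helper merge_manifests are assumptions for illustration, not documented kdx behavior.

```python
def merge_manifests(manifests):
    """Combine the `resources` sections of several manifests, preserving order."""
    merged = {}
    for manifest in manifests:
        for resource_type, slugs in manifest.get("resources", {}).items():
            bucket = merged.setdefault(resource_type, [])
            for slug in slugs:
                if slug not in bucket:  # keep each slug once
                    bucket.append(slug)
    return merged


SHARED = {"resources": {
    "data-definitions": ["invoice-taxonomy", "contract-taxonomy"],
    "stores": ["document-store", "vector-store"],
}}
DIVISION_A = {"resources": {
    "data-definitions": ["division-a-taxonomy", "invoice-taxonomy"],
    "projects": ["finance-automation"],
}}

for resource_type, slugs in merge_manifests([SHARED, DIVISION_A]).items():
    print(resource_type, slugs)
```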

Environment-Specific Branch Mappings

Use branch patterns to control deployment destinations:
branch_mappings:
  # Main branch → production only
  - pattern: "main"
    targets:
      - target: shared
        environment: prod
      - target: division-a
        environment: prod

  # Release branches → staging for all
  - pattern: "release/*"
    targets:
      - target: shared
      - target: division-a
    environment: staging

  # Feature branches → dev, but only specific division
  - pattern: "feature/div-a/*"
    target: division-a
    environment: dev

  - pattern: "feature/div-b/*"
    target: division-b
    environment: dev

CI/CD Integration

Automate sync using GitHub Actions:
name: Deploy to Kodexa

on:
  push:
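    # In GitHub Actions filters, '*' does not match '/', so use '**' to include
    # nested branch names such as feature/div-a/my-change
    branches: [main, 'release/**', 'feature/**']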

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install KDX CLI
        run: |
          # Install via homebrew or download binary
          brew install kodexa-ai/tap/kdx

      - name: Deploy
        env:
          KODEXA_DEV_API_KEY: ${{ secrets.KODEXA_DEV_API_KEY }}
          KODEXA_STAGING_API_KEY: ${{ secrets.KODEXA_STAGING_API_KEY }}
          KODEXA_PROD_API_KEY: ${{ secrets.KODEXA_PROD_API_KEY }}
        run: |
          kdx sync deploy --confirm-all
Benefits of this approach:
  • Single workflow for all environments
  • Branch-based automatic routing
  • No profile setup needed
  • Clean secrets management via environment variables

Resource Path Overrides

Control where resource files are stored:
# manifest.yaml
manifest_version: "1.0"

resources:
  # Standard location: kodexa-resources/data-definitions/standard.yaml
  - type: data-definition
    slug: standard

  # Custom location
  - type: data-definition
    slug: special
    path: custom-dir/special-taxonomy.yaml

  # Absolute path
  - type: store
    slug: central-store
    path: /shared/stores/central.yaml
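
The three cases above (default location, relative override, absolute override) can be sketched in Python as follows. The directory mapping DEFAULT_DIRS, the helper resolve_resource_path, and the choice to resolve relative paths against the directory containing sync-config.yaml are all assumptions made for this example.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from resource type to its default directory.
DEFAULT_DIRS = {"data-definition": "data-definitions", "store": "data-stores"}


def resolve_resource_path(entry, base="."):
    """Return the file path for a manifest entry, honouring an optional `path`."""
    override = entry.get("path")
    if override:
        relative = PurePosixPath(override)
    else:
        relative = (
            PurePosixPath("kodexa-resources")
            / DEFAULT_DIRS[entry["type"]]
            / f"{entry['slug']}.yaml"
        )
    # Joining with an absolute path discards the base, so absolute overrides win.
    return str(PurePosixPath(base) / relative)


ENTRIES = [
    {"type": "data-definition", "slug": "standard"},
    {"type": "data-definition", "slug": "special", "path": "custom-dir/special-taxonomy.yaml"},
    {"type": "store", "slug": "central-store", "path": "/shared/stores/central.yaml"},
]

for entry in ENTRIES:
    print(entry["slug"], "->", resolve_resource_path(entry))
```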

Troubleshooting

“Environment not found”

Ensure the environment is defined in sync-config.yaml:
# Check your configuration
grep -A 3 "environments:" sync-config.yaml

# Verify environment name matches exactly (case-sensitive)
Resolution:
environments:
  dev:  # Must match --env value exactly
    url: https://dev.kodexa.ai
    api_key_env: KODEXA_DEV_API_KEY

“API key environment variable not set”

The referenced environment variable must be exported:
# Check if variable is set
echo $KODEXA_DEV_API_KEY

# Set it in your shell
export KODEXA_DEV_API_KEY="your-api-key-here"

# Or add to your shell profile (~/.bashrc or ~/.zshrc)
echo 'export KODEXA_DEV_API_KEY="your-api-key"' >> ~/.bashrc

“Target not found”

Verify target is defined in sync-config.yaml:
# Check targets section
grep -A 5 "targets:" sync-config.yaml
Resolution:
targets:
  my-org-dev:  # Must match --target value exactly
    organization: my-org
    manifests:
      - manifests/main.yaml

“No branch mapping found”

Either specify target/environment manually or add a branch mapping:
# Manual override (no branch mapping needed)
kdx sync deploy --target my-org-dev --env dev

# Or add branch mapping
Resolution:
branch_mappings:
  - pattern: "feature/test"  # Exact match
    target: my-org-dev
    environment: dev

  - pattern: "feature/*"     # Wildcard pattern
    target: my-org-dev
    environment: dev

API Authentication Failed

Check your API key is correct and has proper permissions:
# Verify environment variable is set
echo $KODEXA_DEV_API_KEY

# Test API access directly
curl -H "Authorization: Bearer $KODEXA_DEV_API_KEY" \
  https://dev.kodexa.ai/api/workspaces

# Update API key if needed
export KODEXA_DEV_API_KEY="new-api-key"

Changes Not Syncing

Confirm that the resource files exist locally, that the manifest references the correct slugs, and that a dry run reports no validation errors:
# Check resource files exist
ls -la kodexa-resources/data-definitions/

# Verify manifest references correct slugs
cat manifests/main.yaml

# Check for validation errors
kdx sync push --target my-org --env dev --dry-run

Merge Conflicts

Resolve conflicts manually:
# Pull latest from environment
kdx sync pull --target my-org --env prod

# Resolve conflicts in YAML files
vim kodexa-resources/data-definitions/conflicted-file.yaml

# Test resolution
kdx sync push --target my-org --env prod --dry-run

# Commit resolution
git add .
git commit -m "Resolve merge conflict in taxonomy"

Best Practices

1. Use Dry Run First

Always validate before pushing:
kdx sync deploy --dry-run
# Review output
kdx sync deploy --confirm-all

2. Commit After Each Pull

Track what changed in the remote environment:
kdx sync pull --target my-org --env prod
git add .
git commit -m "Sync from prod: $(date)"

3. Small, Focused Changes

Make incremental changes rather than large batches:
# Good: one taxonomy at a time
git commit -m "Add invoice classification taxonomy"

# Avoid: many unrelated changes
git commit -m "Update all taxonomies, stores, and projects"

4. Test in Lower Environments

Test changes in dev/staging before production:
# Test in dev
kdx sync push --target my-org --env dev
# ... test ...

# Promote to staging
kdx sync push --target my-org --env staging
# ... test ...

# Finally production
kdx sync deploy --target my-org --env prod

5. Document Dependencies

Add comments to YAML files:
# project.yaml
slug: invoice-automation
name: Invoice Automation
# Depends on:
#   - data-definition: invoice-fields
#   - store: invoice-documents
#   - module: invoice-classifier-v2

6. Use Meaningful Commit Messages

# Good
git commit -m "Add OCR feature instance for production environment"

# Better
git commit -m "feat: add OCR feature instance for production

- Configured with high-accuracy model
- Enabled for invoice-automation project
- Memory limit set to 4GB"

Next Steps