Battle-Tested Terraform Techniques: Advanced Patterns for Scaling Cloud Infrastructure

    Dynamic Configuration Using YAML/JSON Files

    One of the most powerful yet underused capabilities in Terraform is driving dynamic resource creation from external YAML or JSON files. This technique, recommended by Google, offers a more flexible and maintainable approach to infrastructure configuration than hard-coding values in HCL.

    The Basic Pattern

    When managing multiple environments or components, you can organize configurations into separate YAML files:

    .
    └── terraform/
        ├── main.tf
        └── environments/
            ├── prod/
            │   ├── storage.yaml
            │   └── compute.yaml
            └── staging/
                ├── storage.yaml
                └── compute.yaml

    Here’s how the YAML files might look:

    # environments/prod/storage.yaml
    buckets:
      - name: production-logs
        location: US
        storage_class: STANDARD
        versioning: true
        lifecycle_rules:
          - action: Delete
            age: 90
      
      - name: backup-archive
        location: EU
        storage_class: COLDLINE
        versioning: false
        lifecycle_rules:
          - action: Delete
            age: 365

    And the Terraform configuration that consumes these YAML files:

    # main.tf
    locals {
      environments = toset(["prod", "staging"])

      # Decode the storage config for each environment
      storage_configs = {
        for env in local.environments : env => yamldecode(
          file("${path.module}/environments/${env}/storage.yaml")
        )
      }
    }

    resource "google_storage_bucket" "buckets" {
      # Flatten the per-environment configs into a single map so each
      # bucket gets a stable, unique for_each key: "<env>-<bucket name>"
      for_each = merge([
        for env, config in local.storage_configs : {
          for bucket in config.buckets : "${env}-${bucket.name}" => merge(bucket, {
            environment = env
          })
        }
      ]...)

      name          = each.value.name
      location      = each.value.location
      storage_class = each.value.storage_class

      versioning {
        enabled = each.value.versioning
      }

      dynamic "lifecycle_rule" {
        for_each = each.value.lifecycle_rules
        content {
          condition {
            age = lifecycle_rule.value.age
          }
          action {
            type = lifecycle_rule.value.action
          }
        }
      }
    }
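    The merge([for …]…) expression is the trickiest part of this pattern: it flattens the per-environment maps into a single map keyed by environment and bucket name, which is what for_each iterates over. A rough Python sketch of the same flattening (the data here is illustrative, and any extra key components work the same way):

```python
# Per-environment configs, shaped like what yamldecode would produce
# (bucket entries trimmed to just names for brevity).
configs = {
    "prod":    {"buckets": [{"name": "production-logs"}, {"name": "backup-archive"}]},
    "staging": {"buckets": [{"name": "app-logs"}]},
}

# Equivalent of merge([for env, config in ... : { for bucket in ... }]...):
# one flat map keyed "<env>-<bucket name>", each value tagged with its env.
flattened = {
    f"{env}-{bucket['name']}": {**bucket, "environment": env}
    for env, config in configs.items()
    for bucket in config["buckets"]
}

print(sorted(flattened))
```

    Note that duplicate keys would silently collapse in a Python dict comprehension, and Terraform's merge behaves the same way (later entries win), which is why the key must be unique per bucket.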

    Advantages of YAML Terraform Inputs

    1. Separation of Configuration and Logic: Configuration details are isolated from Terraform code, making both easier to maintain.
    2. Version Control Friendly: Changes to configurations are clearly visible in version control.
    3. Programmatic Updates: Easy to integrate with automation tools or CI/CD pipelines.
    4. Reduced Error Risk: Compared to maintaining large locals blocks, YAML provides better structure and validation.
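    To make the "Programmatic Updates" point concrete: because the inputs are plain data files, a pipeline step can edit them with ordinary tooling rather than patching HCL. A sketch using only Python's standard library, on a JSON flavor of the config (Terraform's jsondecode reads JSON the same way yamldecode reads YAML; the bucket values here are illustrative):

```python
import json

# The storage config as a plain data structure, e.g. loaded from a JSON file.
config = {
    "buckets": [
        {"name": "production-logs", "location": "US",
         "storage_class": "STANDARD", "versioning": True},
    ]
}

# A CI job can append a new bucket without touching any Terraform code.
config["buckets"].append({
    "name": "audit-trail", "location": "EU",
    "storage_class": "COLDLINE", "versioning": False,
})

print(json.dumps(config, indent=2))
```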

    Challenges and Solutions for YAML Terraform Inputs

    Input Validation

    You can implement validation in two ways:

    Using Preconditions:

    resource "google_storage_bucket" "buckets" {
      for_each = { for bucket in local.bucket_config.buckets : bucket.name => bucket }
    
      lifecycle {
        precondition {
          condition = contains(["US", "EU", "ASIA"], each.value.location)
          error_message = "Location must be one of: US, EU, ASIA"
        }
    
        precondition {
          condition = contains(["STANDARD", "NEARLINE", "COLDLINE"], each.value.storage_class)
          error_message = "Invalid storage class specified"
        }
      }
      # ... rest of resource configuration
    }

    Using Variable Validation:

    variable "bucket_config" {
      type = object({
        buckets = list(object({
          name          = string
          location      = string
          storage_class = string
          versioning    = bool
          lifecycle_rules = list(object({
            action = string
            age    = number
          }))
        }))
      })
    
      validation {
        condition = alltrue([
          for bucket in var.bucket_config.buckets :
          contains(["US", "EU", "ASIA"], bucket.location)
        ])
        error_message = "Invalid location specified in bucket configuration"
      }
    
      validation {
        condition = alltrue([
          for bucket in var.bucket_config.buckets :
          contains(["STANDARD", "NEARLINE", "COLDLINE"], bucket.storage_class)
        ])
        error_message = "Invalid storage class specified in bucket configuration"
      }
    }

    User Interface Considerations

    While YAML/JSON configuration offers flexibility, it can be challenging for users unfamiliar with these formats. Consider implementing:

    • A web form that generates valid YAML/JSON
    • CLI tools with interactive prompts
    • Schema documentation with examples
    • Pre-commit hooks for validation
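    The "CLI tools" idea can be as small as a helper that refuses to emit invalid values in the first place. A minimal sketch (the function name and defaults are hypothetical; the allowed values mirror the validations shown earlier):

```python
import json

ALLOWED_LOCATIONS = {"US", "EU", "ASIA"}
ALLOWED_CLASSES = {"STANDARD", "NEARLINE", "COLDLINE"}

def make_bucket(name, location, storage_class="STANDARD", versioning=False):
    """Build one bucket entry, rejecting values Terraform would also reject."""
    if location not in ALLOWED_LOCATIONS:
        raise ValueError(f"location must be one of {sorted(ALLOWED_LOCATIONS)}")
    if storage_class not in ALLOWED_CLASSES:
        raise ValueError(f"storage_class must be one of {sorted(ALLOWED_CLASSES)}")
    return {
        "name": name,
        "location": location,
        "storage_class": storage_class,
        "versioning": versioning,
    }

config = {"buckets": [make_bucket("production-logs", "US", versioning=True)]}
print(json.dumps(config, indent=2))
```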

    Best Practices

    1. Always include a schema file (e.g., JSON Schema) to document the expected structure
    2. Implement comprehensive validation at both the variable and resource level
    3. Provide example configurations for common use cases
    4. Consider building helper tools for configuration generation
    5. Use descriptive error messages in validations to guide users

    This pattern becomes particularly valuable in large-scale infrastructure where configuration changes are frequent and need to be managed systematically.

    Schema Validation for Configuration Files

    While YAML/JSON configurations offer flexibility, they also introduce potential errors. Implementing schema validation in your CI/CD pipeline can catch issues before they reach your Terraform workflow.

    Creating a JSON Schema

    First, define a schema that describes your expected configuration structure:

    {
      "$schema": "http://json-schema.org/draft-07/schema#",
      "type": "object",
      "required": ["buckets"],
      "properties": {
        "buckets": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["name", "location", "storage_class", "versioning"],
            "properties": {
              "name": {
                "type": "string",
                "pattern": "^[a-z0-9][-a-z0-9_.]*[a-z0-9]$",
                "maxLength": 63,
                "description": "Bucket name must be between 3-63 chars, start/end with number or letter"
              },
              "location": {
                "type": "string",
                "enum": ["US", "EU", "ASIA"],
                "description": "Geographic location of the bucket"
              },
              "storage_class": {
                "type": "string",
                "enum": ["STANDARD", "NEARLINE", "COLDLINE"],
                "description": "Storage class affecting availability and cost"
              },
              "versioning": {
                "type": "boolean",
                "description": "Whether object versioning is enabled"
              },
              "lifecycle_rules": {
                "type": "array",
                "items": {
                  "type": "object",
                  "required": ["action", "age"],
                  "properties": {
                    "action": {
                      "type": "string",
                      "enum": ["Delete", "SetStorageClass"],
                      "description": "Action to take on matching objects"
                    },
                    "age": {
                      "type": "integer",
                      "minimum": 1,
                      "description": "Age in days when the action should occur"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
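    A real JSON Schema validator (such as the ajv-cli used in the workflow below, or an IDE plugin) is the right tool for enforcing this schema. As a dependency-free illustration of what it checks, here is a minimal Python stand-in covering only the required-field and enum rules from the schema above:

```python
def check_bucket(bucket):
    """Minimal stand-in for JSON Schema validation of one bucket entry."""
    errors = []
    # required: ["name", "location", "storage_class", "versioning"]
    for field in ("name", "location", "storage_class", "versioning"):
        if field not in bucket:
            errors.append(f"missing required field: {field}")
    # enum constraints from the schema
    if bucket.get("location") not in ("US", "EU", "ASIA"):
        errors.append("location must be one of: US, EU, ASIA")
    if bucket.get("storage_class") not in ("STANDARD", "NEARLINE", "COLDLINE"):
        errors.append("storage_class must be STANDARD, NEARLINE, or COLDLINE")
    return errors

good = {"name": "production-logs", "location": "US",
        "storage_class": "STANDARD", "versioning": True}
print(check_bucket(good))  # []
print(check_bucket({"name": "x", "location": "MOON"}))
```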

    Implementing GitHub Workflow Validation

    Create a GitHub workflow to validate YAML files against your schema:

    name: Validate Terraform YAML Configs
    
    on:
      pull_request:
        paths:
          - 'terraform/environments/**/*.yaml'
    
    jobs:
      validate-yaml:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          
          - name: Install ajv-cli
            run: npm install -g ajv-cli
          
          - name: Validate YAML files
            run: |
              for file in $(find terraform/environments -name "*.yaml"); do
                echo "Validating $file..."
                # Convert YAML to JSON for validation
                yq eval -o=json "$file" > temp.json
                
                # Validate against schema
                if ! ajv validate -s schema.json -d temp.json; then
                  echo "❌ Validation failed for $file"
                  exit 1
                fi
                rm temp.json
              done
            
          - name: Comment on PR
            if: failure()
            uses: actions/github-script@v6
            with:
              script: |
                github.rest.issues.createComment({
                  issue_number: context.issue.number,
                  owner: context.repo.owner,
              repo: context.repo.repo,
                  body: '❌ YAML validation failed. Please check the workflow logs and ensure your configuration matches the schema.'
                })

    This workflow:

    1. Triggers on YAML file changes
    2. Converts YAML to JSON
    3. Validates against schema
    4. Fails PR if validation fails

    Benefits of Schema Validation

    1. Catch Errors Early: Validation happens before Terraform runs
    2. Self-Documenting: Schema serves as configuration documentation
    3. IDE Integration: Many editors can validate against JSON schema
    4. Standardization: Enforces consistent configuration patterns

    Unit Testing YAML Configurations with Git Hooks

    While schema validation in CI/CD provides a safety net, catching configuration errors earlier in the development cycle saves time and reduces failed pipelines. Git pre-commit hooks offer an elegant solution for “unit testing” YAML configurations before they even reach your repository.

    Implementing Pre-commit Validation

    Create a pre-commit hook to validate YAML files against your schema:

    #!/bin/bash
    # File: .git/hooks/pre-commit
    
    # Check if yaml files are being committed
    yaml_files=$(git diff --cached --name-only --diff-filter=ACMR | grep "^terraform/environments/.*\.yaml$")
    
    if [ -z "$yaml_files" ]; then
        exit 0
    fi
    
    # Ensure required tools are installed
    command -v yq >/dev/null 2>&1 || { echo "yq is required but not installed. Aborting." >&2; exit 1; }
    command -v ajv >/dev/null 2>&1 || { echo "ajv-cli is required but not installed. Aborting." >&2; exit 1; }
    
    echo "Validating YAML files against schema..."
    
    # Validate each yaml file
    for file in $yaml_files; do
        if [ -f "$file" ]; then
            echo "Checking $file..."
            # ajv-cli reads data via -d, so convert to a temp JSON file first
            tmp=$(mktemp)
            yq eval -o=json "$file" > "$tmp"
            if ! ajv validate -s schema.json -d "$tmp"; then
                rm -f "$tmp"
                echo "❌ $file failed schema validation"
                exit 1
            fi
            rm -f "$tmp"
        fi
    done

    Setting Up the Testing Environment

    1. Save the hook as .git/hooks/pre-commit
    2. Make it executable: chmod +x .git/hooks/pre-commit
    3. Install dependencies:
    npm install -g ajv-cli
    brew install yq  # macOS; on Linux, install the Go-based yq (which supports `yq eval`) from a release binary or your package manager

    Benefits of Pre-commit Testing

    1. Instant Feedback: Developers get immediate validation results
    2. Offline Validation: No need for CI pipeline to catch basic errors
    3. Reduced Pipeline Load: Catch errors before they trigger CI/CD runs
    4. Standardized Testing: Every developer uses the same validation rules

    Best Practices

    1. Include setup instructions in your repository’s README
    2. Version control your schema file
    3. Consider using pre-commit framework for more complex validations
    4. Keep schema and hooks in sync with Terraform validation rules
