Terraform Style and Safety for TFE-Backed Workspaces

Terraform coding conventions, safety rules, and best practices for Terraform Enterprise (TFE) backed workspaces in Optum environments.

Status: experimental
IDE: claude, codex, vscode
Version: 1.0.0
Owner: epic-platform-sre
Tags: terraform, tfe, style, safety, iac

Terraform Style and Safety Guide

Overview

This guide covers Terraform coding conventions and safety rules for Optum's TFE-backed infrastructure. All changes to production infrastructure MUST flow through Terraform Enterprise and CI/CD pipelines.

Critical Safety Rules

Apply Restrictions

NEVER run terraform apply locally:

# ❌ FORBIDDEN - Local applies bypass all governance
terraform apply

# ❌ FORBIDDEN - Even with auto-approve
# SECURITY: -auto-approve skips the interactive confirmation prompt, the
# last chance to catch destructive changes before they execute. NEVER
# script it in CI/CD or use it in any automated context, even for
# non-production environments.
terraform apply -auto-approve

# ✅ ALLOWED - Local planning for development
terraform plan

# ✅ ALLOWED - Local validation
terraform validate

# ✅ ALLOWED - Format checking
terraform fmt -check

All applies MUST go through:

  1. Pull request with plan output
  2. CI/CD pipeline validation
  3. Terraform Enterprise workspace run
  4. Required approvals for production
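
One way to back the flow above with configuration is to manage the workspace itself through the `tfe` provider and keep automatic applies switched off. The following is a minimal sketch, not the platform team's actual setup: the workspace name, the repository identifier, and the `vcs_oauth_token_id` variable are hypothetical placeholders.

```hcl
# Sketch only: workspace name, repository identifier, and the OAuth token
# variable are hypothetical placeholders.
variable "vcs_oauth_token_id" {
  description = "OAuth token ID of the VCS connection configured in TFE"
  type        = string
}

resource "tfe_workspace" "platform_prod" {
  name         = "platform-prod"
  organization = "ohemr-epic"

  # Runs pause after the plan until someone confirms the apply in TFE
  auto_apply = false

  vcs_repo {
    identifier     = "example-org/platform-infra" # hypothetical repository
    branch         = "main"
    oauth_token_id = var.vcs_oauth_token_id
  }
}
```

With `auto_apply = false`, a merged pull request triggers a plan, and the apply waits for explicit confirmation in the workspace.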

Destroy Restrictions

NEVER run terraform destroy locally against shared environments:

# ❌ FORBIDDEN - Local destroy
terraform destroy

# ✅ ALLOWED - Destroy through TFE with appropriate approvals
# Submit PR to remove resources, let TFE handle destruction

Project Structure

Standard Layout

MUST organize Terraform projects with this structure:

terraform/
├── modules/                    # Reusable modules
│   └── {module-name}/
│       ├── main.tf
│       ├── variables.tf
│       ├── outputs.tf
│       └── README.md
├── environments/               # Environment-specific configurations
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── qa/
│   │   └── ...
│   └── prod/
│       └── ...
├── .terraform-version          # Required Terraform version
├── .tflint.hcl                 # TFLint configuration
└── README.md
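
With this layout, an environment configuration typically calls a local module through a relative path. The sketch below assumes a hypothetical module named `network` under `modules/`, and its input names are illustrative only.

```hcl
# environments/dev/main.tf
module "network" {
  source = "../../modules/network" # hypothetical module under modules/

  # Hypothetical inputs declared in modules/network/variables.tf
  environment = "dev"
  location    = var.location
}
```

Local path sources do not take a `version` argument; they always use the module code checked out alongside the environment.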

File Organization

MUST organize resources by file:

File           Contents
main.tf        Primary resources and module calls
variables.tf   All variable declarations
outputs.tf     All output declarations
locals.tf      Local value definitions
providers.tf   Provider configurations
backend.tf     Backend configuration
data.tf        Data source lookups
versions.tf    Version constraints
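
As an illustration of how `versions.tf` and `providers.tf` divide responsibilities, here is a minimal sketch. The version numbers are assumptions, not the versions mandated for any workspace; align them with `.terraform-version` and the provider release your team has validated.

```hcl
# versions.tf
terraform {
  required_version = "~> 1.5" # assumed; match .terraform-version

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumed major version
    }
  }
}

# providers.tf
provider "azurerm" {
  # The azurerm provider requires a features block, even when empty
  features {}
}
```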

Naming Conventions

Resource Naming

MUST follow this naming pattern:

# Pattern: {org}-{env}-{region}-{type}-{purpose}
# Example: ohemr-prod-eastus2-rg-platform

resource "azurerm_resource_group" "platform" {
  name     = "${var.org}-${var.environment}-${var.location}-rg-platform"
  location = var.location
}

resource "azurerm_storage_account" "logs" {
  # Storage accounts: 3-24 chars, lowercase alphanumeric only
  name                = "${var.org}${var.environment}${var.location_short}salogs"
  resource_group_name = azurerm_resource_group.platform.name
  location            = azurerm_resource_group.platform.location
  # ...
}
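
To keep names consistent, the pattern can be assembled once in `locals.tf` and reused. This is a sketch under assumptions: the region short codes are hypothetical, and it derives the short code from `var.location` instead of taking a separate `location_short` variable.

```hcl
locals {
  # Hypothetical short codes; extend for the regions you deploy to
  location_short_codes = {
    eastus2   = "eus2"
    centralus = "cus"
  }

  # {org}-{env}-{region}, shared by all hyphenated resource names
  name_prefix = "${var.org}-${var.environment}-${var.location}"

  resource_group_name = "${local.name_prefix}-rg-platform"

  # Storage accounts allow no hyphens, so the parts are concatenated
  # and lowercased
  storage_account_name = lower(
    "${var.org}${var.environment}${local.location_short_codes[var.location]}salogs"
  )
}
```

Resources can then reference `local.resource_group_name` and `local.storage_account_name` instead of repeating the interpolation.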

Variable Naming

MUST use snake_case for variables:

# ✅ Good - snake_case with descriptive names
variable "resource_group_name" {
  description = "Name of the resource group"
  type        = string
}

variable "enable_diagnostic_logs" {
  description = "Enable diagnostic logging to Log Analytics"
  type        = bool
  default     = true
}

# ❌ Bad - camelCase or ambiguous
variable "resourceGroupName" { }
variable "enable_logs" { }  # Too vague

Module Usage

Registry Modules

PREFER modules from the ohemr-epic private registry:

# ✅ Preferred - Private registry module
module "aks_cluster" {
  source  = "app.terraform.io/ohemr-epic/aks/azurerm"
  version = "~> 2.0"

  cluster_name        = local.cluster_name
  resource_group_name = azurerm_resource_group.platform.name
  # ...
}

# ⚠️ Acceptable - Public registry with version pin
module "naming" {
  source  = "Azure/naming/azurerm"
  version = "0.4.0"  # MUST pin version
  # ...
}

# ❌ Avoid - Git source without a pinned ref
module "custom" {
  source = "git::https://github.com/example/module.git"
  # No ?ref=<tag> pin, so the module tracks the default branch
}
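
When a Git source cannot be avoided, it can still be pinned. The `version` argument only works for registry modules, so Git sources are pinned with the `?ref=` query parameter instead. In the sketch below, the tag `v1.2.0` is hypothetical.

```hcl
module "custom" {
  # v1.2.0 is a hypothetical tag; ?ref= also accepts a branch or commit SHA
  source = "git::https://github.com/example/module.git?ref=v1.2.0"
}
```

A tag or commit SHA gives a reproducible result; a branch name does not, because the branch can move.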

Module Versioning

MUST pin module versions:

# ✅ Good - Pessimistic constraint
module "network" {
  source  = "app.terraform.io/ohemr-epic/network/azurerm"
  version = "~> 3.0"  # Allows 3.x but not 4.0
}

# ✅ Good - Exact version for critical modules
module "database" {
  source  = "app.terraform.io/ohemr-epic/sql/azurerm"
  version = "= 2.1.5"  # Exact version
}

# ❌ Bad - No version constraint
module "storage" {
  source = "app.terraform.io/ohemr-epic/storage/azurerm"
  # Missing version = dangerous
}
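
The pessimistic operator behaves differently depending on how many version segments are given, which is easy to misread in review. The sketch below contrasts the common forms; the module labels and version numbers are illustrative.

```hcl
module "network_minor" {
  source  = "app.terraform.io/ohemr-epic/network/azurerm"
  version = "~> 3.1" # allows >= 3.1.0 and < 4.0.0
}

module "network_patch" {
  source  = "app.terraform.io/ohemr-epic/network/azurerm"
  version = "~> 3.1.0" # allows >= 3.1.0 and < 3.2.0 (patch releases only)
}

module "network_range" {
  source  = "app.terraform.io/ohemr-epic/network/azurerm"
  version = ">= 3.1.0, < 3.4.0" # explicit range, both bounds must hold
}
```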

Variable Documentation

Required Documentation

MUST document all variables:

variable "environment" {
  description = "Deployment environment (dev, qa, prod)"
  type        = string

  validation {
    condition     = contains(["dev", "qa", "prod"], var.environment)
    error_message = "Environment must be dev, qa, or prod."
  }
}

variable "vm_size" {
  description = <<-EOT
    Azure VM size for worker nodes.

    Recommended sizes by environment:
    - dev: Standard_D2s_v3 (2 vCPU, 8 GB)
    - qa: Standard_D4s_v3 (4 vCPU, 16 GB)
    - prod: Standard_D8s_v3 (8 vCPU, 32 GB)
  EOT
  type        = string
  default     = "Standard_D2s_v3"
}

variable "tags" {
  description = "Resource tags applied to all resources (must include 'environment')"
  type        = map(string)
  # No default: an empty map would fail the validation below

  validation {
    condition     = contains(keys(var.tags), "environment")
    error_message = "Tags must include 'environment' key."
  }
}
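
Validation blocks can also enforce format rules, such as the storage account naming limits mentioned under Resource Naming. The variable below is a hypothetical example of that technique, not a variable this guide requires.

```hcl
variable "storage_account_name" {
  description = "Storage account name (3-24 lowercase letters and digits)"
  type        = string

  validation {
    # regex() raises an error on no match; can() turns that into false
    condition     = can(regex("^[a-z0-9]{3,24}$", var.storage_account_name))
    error_message = "Storage account name must be 3-24 characters, lowercase letters and digits only."
  }
}
```

Failing at plan time with a clear message is cheaper than discovering the naming error from the Azure API during an apply.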

Output Documentation

MUST document all outputs:

output "resource_group_id" {
  description = "The ID of the created resource group"
  value       = azurerm_resource_group.main.id
}

output "storage_account_primary_connection_string" {
  description = "Primary connection string for the storage account"
  value       = azurerm_storage_account.main.primary_connection_string
  sensitive   = true  # MUST mark sensitive outputs
}

Security Best Practices

Secrets Management

NEVER hardcode secrets:

# ❌ FORBIDDEN - Hardcoded secrets
resource "azurerm_key_vault_secret" "api_key" {
  name         = "api-key"
  value        = "sk-1234567890abcdef"  # NEVER do this
  key_vault_id = azurerm_key_vault.main.id
}

# ✅ CORRECT - Reference from Key Vault
data "azurerm_key_vault_secret" "api_key" {
  name         = "api-key"
  key_vault_id = data.azurerm_key_vault.main.id
}

# ✅ CORRECT - Use sensitive variables (set via TFE)
variable "api_key" {
  description = "API key for external service"
  type        = string
  sensitive   = true
}
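
A sensitive variable set in the TFE workspace can then be passed to the resource that needs it. This sketch writes it into Key Vault; the secret name is hypothetical, and it assumes the `data.azurerm_key_vault.main` lookup referenced above exists.

```hcl
resource "azurerm_key_vault_secret" "external_api_key" {
  name         = "external-api-key" # hypothetical secret name
  value        = var.api_key        # supplied as a sensitive TFE variable
  key_vault_id = data.azurerm_key_vault.main.id
}
```

Marking a variable `sensitive` redacts it from plan and apply output, but the value is still written to state. Treat access to the workspace state as access to the secret.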

Identity Management

PREFER managed identities over service principals:

# ✅ Preferred - System-assigned managed identity
resource "azurerm_kubernetes_cluster" "main" {
  name                = local.cluster_name
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name

  identity {
    type = "SystemAssigned"
  }
  # ...
}

# ✅ Acceptable - User-assigned managed identity
resource "azurerm_user_assigned_identity" "aks" {
  name                = "${local.cluster_name}-identity"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
}

# ⚠️ Avoid - Service principal (legacy)
# Only use when managed identity is not supported

Network Security

MUST implement network security:

# ✅ Good - Private endpoints for PaaS services
resource "azurerm_private_endpoint" "storage" {
  name                = "${local.storage_name}-pe"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  subnet_id           = azurerm_subnet.private_endpoints.id

  private_service_connection {
    name                           = "${local.storage_name}-psc"
    private_connection_resource_id = azurerm_storage_account.main.id
    subresource_names              = ["blob"]
    is_manual_connection           = false
  }
}

# ✅ Good - Disable public access
resource "azurerm_storage_account" "main" {
  name                     = local.storage_name
  resource_group_name      = azurerm_resource_group.main.name
  location                 = azurerm_resource_group.main.location
  account_tier             = "Standard"
  account_replication_type = "GRS"

  public_network_access_enabled = false  # MUST disable for prod
  min_tls_version               = "TLS1_2"

  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]
  }
}
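
Restrictive inbound rules can be enforced in code as well as in review. The sketch below rejects `0.0.0.0/0` at plan time and uses the validated range in a security rule. The variable name, rule name, priority, and the `azurerm_network_security_group.main` resource are all assumptions.

```hcl
variable "allowed_source_cidr" {
  description = "CIDR range allowed to reach the HTTPS endpoint"
  type        = string

  validation {
    # cidrhost() fails on malformed input; can() converts that to false
    condition     = can(cidrhost(var.allowed_source_cidr, 0)) && var.allowed_source_cidr != "0.0.0.0/0"
    error_message = "Provide a valid CIDR range other than 0.0.0.0/0."
  }
}

resource "azurerm_network_security_rule" "allow_https_inbound" {
  name                        = "allow-https-inbound"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "443"
  source_address_prefix       = var.allowed_source_cidr
  destination_address_prefix  = "VirtualNetwork"
  resource_group_name         = azurerm_resource_group.main.name
  network_security_group_name = azurerm_network_security_group.main.name # assumed to exist
}
```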

Required Tags

MUST include required tags on all resources:

locals {
  required_tags = {
    environment  = var.environment
    owner        = var.owner
    cost_center  = var.cost_center
    application  = var.application_name
    created_by   = "terraform"
    repository   = var.repository_url
  }
}

resource "azurerm_resource_group" "main" {
  name     = local.resource_group_name
  location = var.location
  tags     = local.required_tags
}

# Apply to all resources
resource "azurerm_storage_account" "main" {
  # ...
  tags = merge(local.required_tags, {
    data_classification = "internal"
  })
}
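
When callers may supply their own tags, argument order in `merge()` decides who wins on a key collision: later arguments take precedence. The sketch below assumes a hypothetical `extra_tags` variable and places `local.required_tags` last so required keys cannot be overridden.

```hcl
variable "extra_tags" {
  description = "Optional additional tags supplied by the caller"
  type        = map(string)
  default     = {}
}

locals {
  # Later arguments win, so required_tags overrides any colliding key
  all_tags = merge(var.extra_tags, local.required_tags)
}
```

Reversing the order, as in `merge(local.required_tags, var.extra_tags)`, would let a caller replace `environment` or `owner`, which is usually not what a tagging policy intends.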

Monitoring and Logging

MUST enable monitoring:

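# ✅ Required - Diagnostic settings for all resources
# Storage log categories (StorageRead/Write/Delete) are exposed on the
# blob service sub-resource, not on the storage account itself.
resource "azurerm_monitor_diagnostic_setting" "storage" {
  name                       = "${local.storage_name}-diag"
  target_resource_id         = "${azurerm_storage_account.main.id}/blobServices/default"
  log_analytics_workspace_id = data.azurerm_log_analytics_workspace.main.id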

  enabled_log {
    category = "StorageRead"
  }

  enabled_log {
    category = "StorageWrite"
  }

  enabled_log {
    category = "StorageDelete"
  }

  metric {
    category = "Transaction"
    enabled  = true
  }
}

# ✅ Required - Activity log alerts for critical changes
resource "azurerm_monitor_activity_log_alert" "resource_deletion" {
  name                = "resource-deletion-alert"
  resource_group_name = azurerm_resource_group.main.name
  scopes              = [data.azurerm_subscription.current.id]

  criteria {
    operation_name = "Microsoft.Resources/subscriptions/resourceGroups/delete"
    category       = "Administrative"
  }

  action {
    action_group_id = azurerm_monitor_action_group.critical.id
  }
}
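
The monitoring examples above reference a Log Analytics workspace lookup and an action group without defining them. A minimal sketch of both follows; the workspace name, resource group name, receiver name, and email address are hypothetical.

```hcl
# Look up the shared workspace (names are hypothetical)
data "azurerm_log_analytics_workspace" "main" {
  name                = "ohemr-${var.environment}-law-shared"
  resource_group_name = "ohemr-${var.environment}-rg-shared"
}

resource "azurerm_monitor_action_group" "critical" {
  name                = "critical-alerts"
  resource_group_name = azurerm_resource_group.main.name
  short_name          = "critical" # Azure limits this to 12 characters

  email_receiver {
    name          = "platform-oncall"    # hypothetical receiver
    email_address = "oncall@example.com" # placeholder address
  }
}
```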

State Management

Backend Configuration

MUST use remote backend:

# backend.tf (one per environment directory)
terraform {
  backend "remote" {
    hostname     = "app.terraform.io"
    organization = "ohemr-epic"

    workspaces {
      # Backend blocks cannot reference variables; use a literal name
      name = "platform-dev"
    }
  }
}
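
On Terraform 1.1 and later, the `cloud` block is an alternative to `backend "remote"` for TFE-backed workspaces. Like backend blocks, it cannot reference variables, so the workspace name is a literal. This is a sketch only; the workspace name is an assumption, and whether the installed TFE release supports the `cloud` block should be confirmed before adopting it.

```hcl
terraform {
  cloud {
    hostname     = "app.terraform.io"
    organization = "ohemr-epic"

    workspaces {
      name = "platform-dev" # literal name, one per environment directory
    }
  }
}
```

A configuration can contain either a `cloud` block or a `backend` block, not both.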

State Locking

State locking is automatic with TFE. NEVER disable state locking:

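# ❌ FORBIDDEN - Disabling state locking
# Backends have no "lock" setting; locking is only bypassed with the
# -lock=false CLI flag.
terraform plan -lock=false   # NEVER do this
terraform apply -lock=false  # NEVER do this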

Code Review Checklist

When reviewing Terraform changes, verify:

Security

  • No hardcoded secrets or credentials
  • Using managed identities where possible
  • Network security rules are restrictive (no 0.0.0.0/0)
  • Private endpoints for PaaS services
  • TLS 1.2 minimum enforced
  • Encryption at rest enabled

Compliance

  • Required tags present on all resources
  • Diagnostic logging enabled
  • Resource naming follows conventions
  • Module versions pinned

Safety

  • No local applies or destroys
  • Blast radius is acceptable
  • Rollback plan documented for major changes
  • No breaking changes to existing resources

Quality

  • Variables documented with descriptions
  • Outputs documented
  • terraform fmt applied
  • terraform validate passes
  • TFLint passes with no errors

Common Patterns

Conditional Resource Creation

variable "enable_backup" {
  description = "Enable backup vault"
  type        = bool
  default     = true
}

resource "azurerm_recovery_services_vault" "main" {
  count               = var.enable_backup ? 1 : 0
  name                = "${local.prefix}-rsv"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  sku                 = "Standard"
}
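
A resource created with `count` becomes a list, so references to it must handle the case where it has zero elements. One common approach, sketched below, is the `one()` function with a splat expression; the output name is an assumption.

```hcl
output "recovery_vault_id" {
  description = "ID of the recovery vault, or null when backup is disabled"
  # The splat yields a list of zero or one IDs; one() returns the
  # single element, or null for an empty list
  value = one(azurerm_recovery_services_vault.main[*].id)
}
```

Indexing with `azurerm_recovery_services_vault.main[0].id` would instead fail whenever `enable_backup` is false.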

Dynamic Blocks

variable "storage_containers" {
  description = "List of storage containers to create"
  type = list(object({
    name        = string
    access_type = string
  }))
  default = []
}

resource "azurerm_storage_container" "containers" {
  for_each              = { for c in var.storage_containers : c.name => c }
  name                  = each.value.name
  storage_account_name  = azurerm_storage_account.main.name
  container_access_type = each.value.access_type
}
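
The example above iterates at the resource level with `for_each`. A `dynamic` block applies the same idea to nested blocks inside a single resource. The sketch below generates one `enabled_log` block per category; the variable and the resource label are hypothetical, and the target assumes logs are collected from the blob service sub-resource.

```hcl
variable "storage_log_categories" {
  description = "Diagnostic log categories to enable"
  type        = set(string)
  default     = ["StorageRead", "StorageWrite", "StorageDelete"]
}

resource "azurerm_monitor_diagnostic_setting" "storage_blob" {
  name                       = "${local.storage_name}-blob-diag"
  target_resource_id         = "${azurerm_storage_account.main.id}/blobServices/default"
  log_analytics_workspace_id = data.azurerm_log_analytics_workspace.main.id

  # One enabled_log block is generated per category in the set
  dynamic "enabled_log" {
    for_each = var.storage_log_categories

    content {
      category = enabled_log.value
    }
  }
}
```

Inside `content`, the iterator takes the name of the dynamic block, so the current element is `enabled_log.value`.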

Data Lookups

# Look up existing resources
data "azurerm_subscription" "current" {}

data "azurerm_client_config" "current" {}

data "azurerm_key_vault" "shared" {
  name                = "ohemr-${var.environment}-kv-shared"
  resource_group_name = "ohemr-${var.environment}-rg-shared"
}

# Use in resources
resource "azurerm_key_vault_access_policy" "app" {
  key_vault_id = data.azurerm_key_vault.shared.id
  tenant_id    = data.azurerm_client_config.current.tenant_id
  object_id    = azurerm_user_assigned_identity.app.principal_id

  secret_permissions = ["Get", "List"]
}
