Infrastructure as Code with Terraform: Best Practices

Terraform lets you define infrastructure as code — VPCs, databases, Kubernetes clusters, DNS records, all declared in .tf files, version-controlled, and reproducible. But Terraform at scale is a different beast than Terraform for a side project. State management, module design, environment separation, and team workflows all require careful thought.

Figure: Global cloud infrastructure. Infrastructure as Code turns infrastructure from snowflakes into cattle: reproducible, version-controlled, and automated.

State Management: Remote, Always

Terraform state files record the current state of your infrastructure, and they can include sensitive values such as database passwords and API keys in plain text. Never commit state to Git. Use a remote state backend instead: S3 with a DynamoDB table for locking (AWS), GCS (GCP), or Terraform Cloud. Enable encryption at rest for the state file.

backend.tf

terraform {
  backend "s3" {
    bucket         = "vaarak-terraform-state"
    key            = "production/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"  # Prevents concurrent modifications
  }
}
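
The backend above assumes that the bucket and the lock table already exist. As a minimal sketch, they could be created once from a separate bootstrap configuration like the one below. The file name and resource labels are illustrative, the bucket and table names are taken from backend.tf, and the resource types assume AWS provider v4 or later. The S3 backend expects the lock table's partition key to be a string attribute named LockID.

state-bootstrap.tf (hypothetical)

resource "aws_s3_bucket" "state" {
  bucket = "vaarak-terraform-state"
}

# Keep old state versions so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state objects at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# State can hold secrets, so block every form of public access.
resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Lock table referenced by dynamodb_table in backend.tf.
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}

Keeping this bootstrap configuration apart from the environments it serves avoids a chicken-and-egg problem, since a configuration cannot store its state in a bucket it has not created yet.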

Module Design Principles

One module per logical resource group: networking, database, compute, monitoring. Not one module per Terraform resource.
Modules should be opinionated: A 'database' module should create the RDS instance, security group, parameter group, and monitoring alarms — not just the instance.
Use variables for configuration, not for reimplementing AWS. Don't expose every RDS parameter as a variable — expose the decisions that differ between environments.
Pin module versions: Use exact version constraints (version = "2.3.1") in production, not ranges.
Document module interfaces: Every variable and output should have a description. Future-you will thank present-you.
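
The principles above can be sketched with a deliberately small database module. Everything here is illustrative rather than the actual module: the variable names, the registry address, and the instance settings are assumptions, and the module is abridged to the instance alone (a real one would also create the security group, parameter group, and alarms).

modules/database/main.tf (hypothetical, abridged to one file)

variable "name" {
  description = "Identifier for the RDS instance, unique per environment."
  type        = string
}

variable "instance_class" {
  description = "RDS instance class. Differs between environments."
  type        = string
}

variable "multi_az" {
  description = "Whether to run a standby in a second availability zone."
  type        = bool
  default     = false
}

resource "aws_db_instance" "this" {
  identifier        = var.name
  engine            = "postgres" # opinionated: fixed by the module, not a variable
  instance_class    = var.instance_class
  allocated_storage = 100
  multi_az          = var.multi_az

  username                    = "app_admin"
  manage_master_user_password = true # RDS stores the password in Secrets Manager

  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.name}-final"
}

output "endpoint" {
  description = "Connection endpoint of the database, in host:port form."
  value       = aws_db_instance.this.endpoint
}

environments/production/main.tf (hypothetical)

module "database" {
  # The version argument only applies to registry sources; this address is made up.
  source  = "app.terraform.io/example-org/database/aws"
  version = "2.3.1" # exact pin, not a range such as "~> 2.3"

  name           = "production-primary"
  instance_class = "db.r6g.large"
  multi_az       = true
}

Modules referenced by a local path such as ../../modules/database do not accept a version argument; for those, the Git commit of the repository is the pin.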

Environment Separation

We use separate directories per environment, not Terraform workspaces. Each environment (dev, staging, production) has its own state file and its own variable values, and can evolve independently. This prevents a terraform apply in dev from accidentally affecting production, which is a real risk with workspaces: every workspace shares one configuration directory and one backend, so the only thing separating environments is whichever workspace happens to be selected.

Directory structure

infrastructure/
├── modules/
│   ├── networking/
│   ├── database/
│   ├── compute/
│   └── monitoring/
├── environments/
│   ├── dev/
│   │   ├── main.tf        # Module instantiations
│   │   ├── variables.tf   # Environment-specific defaults
│   │   └── backend.tf     # Separate state file
│   ├── staging/
│   │   └── ...
│   └── production/
│       └── ...
└── global/                 # Shared resources (IAM, DNS)
    └── ...
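
As a rough sketch of what one environment directory might contain, the two files below give dev its own state key and its own inputs to a shared module. The key layout and the module inputs (environment, vpc_cidr) are assumptions made for illustration; the bucket, region, and lock table are reused from the backend example earlier in the article.

environments/dev/backend.tf (sketch)

terraform {
  backend "s3" {
    bucket         = "vaarak-terraform-state"
    key            = "dev/terraform.tfstate" # assumed layout: one key prefix per environment
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-lock"
  }
}

environments/dev/main.tf (sketch)

module "networking" {
  source = "../../modules/networking"

  # Hypothetical inputs; production would pass its own values here.
  environment = "dev"
  vpc_cidr    = "10.10.0.0/16"
}

Because the state key is fixed in each directory's backend.tf, running Terraform from environments/dev can only read and write the dev state.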

Always run terraform plan before terraform apply, and review the plan carefully. A plan that shows 'destroy and recreate' on your production database is not something you want to discover after applying.
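
Plan review can be backed by a guard in the configuration itself. The sketch below marks a database with prevent_destroy, which makes Terraform reject any plan that would destroy the resource, including a destroy-and-recreate replacement. The resource label and all argument values are hypothetical.

environments/production/database.tf (hypothetical)

resource "aws_db_instance" "primary" {
  identifier        = "production-primary"
  engine            = "postgres"
  instance_class    = "db.r6g.large"
  allocated_storage = 100

  username                    = "app_admin"
  manage_master_user_password = true # RDS stores the password in Secrets Manager

  # RDS-side guard: the API refuses to delete the instance while this is true.
  deletion_protection = true

  skip_final_snapshot       = false
  final_snapshot_identifier = "production-primary-final"

  lifecycle {
    # Terraform-side guard: planning fails if this resource would be destroyed.
    prevent_destroy = true
  }
}

The two guards are complementary: prevent_destroy stops the change at plan time, while deletion_protection still applies if someone deletes the instance outside Terraform.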

CI/CD for Terraform

Terraform changes should go through the same code review process as application code. Our workflow: open a PR with infrastructure changes, CI runs terraform plan and posts the plan as a PR comment, the team reviews the plan, and after approval, terraform apply runs automatically. This ensures every infrastructure change is reviewed, tested, and auditable.

Terraform is a powerful tool that requires discipline. Invest in module design, state management, and CI/CD early: the complexity of managing infrastructure grows quickly with scale, and retrofitting good practices onto a messy Terraform codebase is painful.