
AWS CDK vs. Terraform - An Honest Take from Real-World Experience
Nicolai Lang
AWS Serverless expert. Advises and supports teams in building scalable cloud architectures.
If you're doing Infrastructure as Code on AWS, you'll eventually face this question: AWS CDK or Terraform? It comes up in forums, at conferences, in every other architecture review - the debate never gets old. And the answer usually sounds the same: "It depends."
That's true, of course. But it doesn't help when you actually need to make a decision.
We've used both. We started with the Serverless Framework, then evaluated and adopted Terraform, and eventually landed on AWS CDK. Terraform is a great tool - we enjoyed working with it and it did its job well. But CDK turned out to be the better fit for what we build. This article isn't a neutral feature comparison. It's an honest assessment from practice - with a clear perspective and respect for the strengths of both tools.
Two Different Mindsets
The difference between CDK and Terraform isn't "Feature A vs. Feature B." They represent two fundamentally different approaches to thinking about Infrastructure as Code.
Terraform is declarative. You use HCL - HashiCorp's own configuration language - to describe the desired state of your infrastructure. Terraform compares that desired state with what actually exists and figures out what needs to change. It calls the AWS APIs directly - no CloudFormation, no intermediary. And it works across clouds: AWS, Azure, GCP, even GitHub or Datadog can be managed through providers.
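To make the desired-state idea tangible, here is a deliberately simplified TypeScript model of the comparison step. It is a sketch, not Terraform's actual algorithm: the `ResourceMap` type and the `plan` function are names invented for this illustration, and real providers handle far more cases (replacement, dependencies, computed values).

```typescript
// Toy model of a declarative "plan" step. Not Terraform's real implementation.
type Attributes = { [key: string]: string };
type ResourceMap = { [address: string]: Attributes };

type Action =
  | { kind: "create"; address: string }
  | { kind: "update"; address: string; changed: string[] }
  | { kind: "delete"; address: string };

// Compare what the configuration asks for with what currently exists.
function plan(desired: ResourceMap, actual: ResourceMap): Action[] {
  const actions: Action[] = [];

  for (const address of Object.keys(desired)) {
    if (!(address in actual)) {
      actions.push({ kind: "create", address });
      continue;
    }
    const want = desired[address];
    const have = actual[address];
    // Collect attributes that differ or exist on one side only.
    const changed: string[] = [];
    for (const key of Object.keys(want)) {
      if (want[key] !== have[key]) changed.push(key);
    }
    for (const key of Object.keys(have)) {
      if (!(key in want)) changed.push(key);
    }
    if (changed.length > 0) actions.push({ kind: "update", address, changed });
  }

  // Anything that exists but is no longer declared gets removed.
  for (const address of Object.keys(actual)) {
    if (!(address in desired)) actions.push({ kind: "delete", address });
  }
  return actions;
}

const desired: ResourceMap = {
  "aws_sqs_queue.event_queue": { name: "EventQueue" },
  "aws_dynamodb_table.results": { name: "Results", billing_mode: "PAY_PER_REQUEST" },
};
const actual: ResourceMap = {
  "aws_dynamodb_table.results": { name: "Results", billing_mode: "PROVISIONED" },
  "aws_sqs_queue.old_queue": { name: "OldQueue" },
};

// One create, one update (billing_mode), one delete.
const actions = plan(desired, actual);
```

The point of the model: you never write "create this, then change that". You only describe the end state, and the tool derives the steps.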
CDK is something different: Infrastructure as real code. No YAML, no JSON, no HCL - just a general-purpose language such as TypeScript, Python, Java, C#, or Go. A real programming language with variables, functions, classes, types, and an entire ecosystem of tooling. CDK generates CloudFormation templates from your code and deploys them through the AWS CloudFormation service.
With Terraform, your infrastructure code is configuration. With CDK, it's software. We covered what that means in practice in our article "Infrastructure as Code with AWS CDK".
Code and Abstraction
Let's look at the same serverless setup in both tools: a Lambda function that reacts to events from an SQS queue and writes results to DynamoDB - once in Terraform, once in CDK:
# Terraform (HCL)

# SQS Queue
resource "aws_sqs_queue" "event_queue" {
  name = "EventQueue"
}

# DynamoDB Table
resource "aws_dynamodb_table" "results" {
  name         = "Results"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }
}

# IAM Role for Lambda
resource "aws_iam_role" "lambda_role" {
  name = "processor-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# CloudWatch Logs
resource "aws_iam_role_policy_attachment" "logs" {
  role       = aws_iam_role.lambda_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

# SQS Permissions
resource "aws_iam_role_policy" "sqs_policy" {
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:ChangeMessageVisibility",
        "sqs:GetQueueAttributes"
      ]
      Resource = aws_sqs_queue.event_queue.arn
    }]
  })
}

# DynamoDB Permissions
resource "aws_iam_role_policy" "dynamodb_policy" {
  role = aws_iam_role.lambda_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "dynamodb:PutItem",
        "dynamodb:UpdateItem"
      ]
      Resource = aws_dynamodb_table.results.arn
    }]
  })
}

# Lambda Function
resource "aws_lambda_function" "processor" {
  function_name = "Processor"
  runtime       = "nodejs20.x"
  handler       = "index.handler"
  filename      = "lambda.zip"
  role          = aws_iam_role.lambda_role.arn
}

# SQS -> Lambda Trigger
resource "aws_lambda_event_source_mapping" "sqs_trigger" {
  event_source_arn        = aws_sqs_queue.event_queue.arn
  function_name           = aws_lambda_function.processor.arn
  function_response_types = ["ReportBatchItemFailures"]
}

// AWS CDK (TypeScript)
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { SqsEventSource } from "aws-cdk-lib/aws-lambda-event-sources";
import * as sqs from "aws-cdk-lib/aws-sqs";

// The following lives inside the constructor of a Stack (hence `this`).

// SQS Queue
const queue = new sqs.Queue(this, "EventQueue");

// DynamoDB Table
const table = new dynamodb.Table(this, "Results", {
  partitionKey: { name: "id", type: dynamodb.AttributeType.STRING },
});

// Lambda Function
const fn = new lambda.Function(this, "Processor", {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: "index.handler",
  code: lambda.Code.fromAsset("lambda"),
});

// SQS -> Lambda Trigger + IAM + DynamoDB Permissions
fn.addEventSource(
  new SqsEventSource(queue, {
    reportBatchItemFailures: true,
  }),
);
table.grantWriteData(fn);

Which one appeals to you more? In Terraform, you declare every resource, every permission, and every
connection individually. You need to know exactly which IAM actions an SQS event source mapping
requires and what the policy looks like. In CDK, addEventSource() tells the framework what you
want, and CDK handles the rest: role, policies, trigger - all with the right permissions.
And CDK goes further: with NodejsFunction, you can bake the build step right into your
infrastructure definition. Compiling TypeScript, bundling dependencies, creating the artifact - it
all happens automatically during cdk deploy. No separate build step, no manually maintained ZIP
file. That simplifies getting started and adds up quickly as your infrastructure grows.
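As a minimal sketch of what that looks like, the stack below defines a Lambda function directly from a TypeScript source file. The stack name `ProcessorStack` and the entry path `src/processor.ts` are assumptions made for this example. Bundling runs with esbuild, which has to be installed locally; otherwise CDK falls back to a Docker-based build.

```typescript
import { Stack, StackProps } from "aws-cdk-lib";
import { Runtime } from "aws-cdk-lib/aws-lambda";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";
import { Construct } from "constructs";

// Hypothetical stack: the function is built from source at synth time.
export class ProcessorStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    new NodejsFunction(this, "Processor", {
      entry: "src/processor.ts", // assumed path to the handler source
      handler: "handler",        // name of the exported function in that file
      runtime: Runtime.NODEJS_20_X,
      bundling: {
        minify: true,    // smaller artifact
        sourceMap: true, // readable stack traces
      },
    });
  }
}
```

Compared to the Terraform version above, there is no `lambda.zip` to produce and keep in sync: the artifact is derived from the source file every time the app is synthesized.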
But the real difference runs deeper. CDK uses a real programming language, and with that you get everything that makes software projects tick: abstraction through classes and functions. Extracting shared patterns into reusable constructs. Building libraries your team can share. Writing unit tests that verify your infrastructure is correct before you deploy. And through Custom Resources, you can integrate things that CloudFormation doesn't cover natively. That works for individual external resources and edge cases, but it's not a bridge to an entire non-AWS stack.
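Here is a small sketch of both ideas together: a reusable construct and a test that checks the generated template without deploying anything. `QueueProcessor` is a hypothetical construct name, and the inline handler is only a placeholder so the example synthesizes without an asset folder. In a real project the assertions would typically live in a Jest test file.

```typescript
import { App, Stack } from "aws-cdk-lib";
import { Template } from "aws-cdk-lib/assertions";
import * as dynamodb from "aws-cdk-lib/aws-dynamodb";
import * as lambda from "aws-cdk-lib/aws-lambda";
import { SqsEventSource } from "aws-cdk-lib/aws-lambda-event-sources";
import * as sqs from "aws-cdk-lib/aws-sqs";
import { Construct } from "constructs";

// Hypothetical reusable construct: queue -> Lambda -> table, wired once.
export class QueueProcessor extends Construct {
  public readonly queue: sqs.Queue;
  public readonly table: dynamodb.Table;

  constructor(scope: Construct, id: string) {
    super(scope, id);

    this.queue = new sqs.Queue(this, "Queue");
    this.table = new dynamodb.Table(this, "Table", {
      partitionKey: { name: "id", type: dynamodb.AttributeType.STRING },
    });

    const fn = new lambda.Function(this, "Handler", {
      runtime: lambda.Runtime.NODEJS_20_X,
      handler: "index.handler",
      // Placeholder code so the sketch needs no files on disk.
      code: lambda.Code.fromInline("exports.handler = async () => {};"),
    });

    fn.addEventSource(
      new SqsEventSource(this.queue, { reportBatchItemFailures: true }),
    );
    this.table.grantWriteData(fn);
  }
}

// Synthesize in memory and assert on the resulting CloudFormation template.
const app = new App();
const stack = new Stack(app, "TestStack");
new QueueProcessor(stack, "Orders");
new QueueProcessor(stack, "Invoices");

const template = Template.fromStack(stack);
template.resourceCountIs("AWS::SQS::Queue", 2);
template.resourceCountIs("AWS::Lambda::Function", 2);
template.hasResourceProperties("AWS::Lambda::EventSourceMapping", {
  FunctionResponseTypes: ["ReportBatchItemFailures"],
});
```

Each assertion throws if the template does not match, so a broken wiring shows up in the test run rather than in a failed deployment.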
The flip side: under the hood, CDK generates CloudFormation templates, so you're always working with two layers. That has consequences for deployment speed, state management, and refactoring - which we'll get into next.
Deployment and Error Handling
Terraform deploys faster because it calls the AWS APIs directly, with no intermediary CloudFormation step. CloudFormation is slower, even though it creates independent resources in parallel. In return, CloudFormation gives you automatic rollbacks: if a deployment fails, the stack reverts to the last stable state. The rollback is essentially another deployment on top, so when something goes wrong, it costs you roughly double the time. And in rare cases, CloudFormation gets stuck in the dreaded UPDATE_ROLLBACK_FAILED state, which can be painful to recover from.
Terraform has no automatic rollback. If an apply gets interrupted mid-way - say, because the CI/CD runner dies - you're left with half-built infrastructure. The state file may not reflect reality, and you have to clean things up yourself.
Two different risk profiles: CloudFormation takes care of things automatically but can be slow or get stuck. Terraform gives you control, but nothing happens without you actively stepping in.
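The two risk profiles can be illustrated with a toy simulation. This is a simplified model under our own assumptions, not how either tool is implemented; `Step`, `deployWithRollback`, and `applyWithoutRollback` are hypothetical names.

```typescript
// Toy model of two failure behaviours. Neither tool works exactly like this.
type Step = { name: string; fails?: boolean };

// "CloudFormation-like": on failure, undo everything applied in this run.
function deployWithRollback(existing: string[], steps: Step[]): string[] {
  const applied: string[] = [];
  for (const step of steps) {
    if (step.fails) {
      // Undoing walks through the applied steps a second time,
      // which is why a failed deployment costs roughly double.
      while (applied.length > 0) applied.pop();
      return existing.slice();
    }
    applied.push(step.name);
  }
  return existing.concat(applied);
}

// "Terraform-like": stop at the failure and keep whatever was created.
function applyWithoutRollback(existing: string[], steps: Step[]): string[] {
  const result = existing.slice();
  for (const step of steps) {
    if (step.fails) break;
    result.push(step.name);
  }
  return result;
}

const steps: Step[] = [
  { name: "queue" },
  { name: "table" },
  { name: "function", fails: true },
  { name: "trigger" },
];

const afterRollback = deployWithRollback(["bucket"], steps);  // back to the previous state
const afterPartial = applyWithoutRollback(["bucket"], steps); // queue and table remain
```

In the second case the queue and the table exist, the function and the trigger do not, and it is up to you to fix the cause and apply again or clean up.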
One thing that improved a lot in 2025: CloudFormation now validates templates much better before deployment and catches errors earlier. Annoying rollback causes like invalid properties and constraint violations are now caught before any resources are touched. The old cycle of deploy, wait, facepalm, rollback, start over has become much less common.
Before deploying, you obviously want to see what's going to change. That's especially helpful during
development. terraform plan shows this very clearly and in detail - including resources that will
be destroyed and recreated. On the CDK side, cdk diff provides similar information. Sometimes you
need the Change Sets in the CloudFormation console for the full picture. In my view, the tools are
roughly on par here.
State, Drift, and Refactoring
Both tools manage state - but differently, and that shapes the entire experience.
Terraform manages its own state in a JSON file. On AWS, that file typically lives in S3 so the whole team can access it. To prevent conflicts, you add locking, traditionally via a DynamoDB table; newer Terraform versions can also lock natively in S3. That means before you can even get started, you need dedicated infrastructure just for state management. And a corrupted state file or forgotten locking can block deployments or require manual recovery. Bucket versioning is a must.
CDK also has an S3 bucket for artifacts, but it's created automatically through bootstrapping. AWS manages the CloudFormation state itself - corruption isn't a concern.
Drift - when someone manually changes something in the console - is detected automatically by
Terraform on the next plan. With CloudFormation, you have to actively trigger drift detection, and
it doesn't automatically affect your CDK deployment. AWS has added drift-aware Change Sets, but
Terraform's automatic detection remains the cleaner approach.
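Why the two approaches behave differently can be shown with a small model. It is a simplification under our own assumptions: the "refresh" variant compares the declared configuration against the live resource, while the "no refresh" variant compares it only against what the tool last deployed, which is roughly what a classic change set does. `changedKeys` is a hypothetical helper.

```typescript
type Attributes = { [key: string]: string };

// Names of attributes whose values differ between two snapshots.
function changedKeys(want: Attributes, have: Attributes): string[] {
  const keys = Object.keys(want).concat(Object.keys(have));
  const changed: string[] = [];
  for (const key of keys) {
    if (want[key] !== have[key] && changed.indexOf(key) === -1) {
      changed.push(key);
    }
  }
  return changed;
}

// What the code declares.
const declared: Attributes = { visibility_timeout_seconds: "30" };
// What the tool recorded at the last deployment.
const lastDeployed: Attributes = { visibility_timeout_seconds: "30" };
// What actually exists after someone edited the queue in the console.
const live: Attributes = { visibility_timeout_seconds: "300" };

// Refresh-then-diff: the manual change shows up as a pending update.
const seenWithRefresh = changedKeys(declared, live);

// Diff against the last deployment only: nothing appears to have changed.
const seenWithoutRefresh = changedKeys(declared, lastDeployed);
```

The first comparison reports `visibility_timeout_seconds`, the second reports nothing. That is the gap explicit drift detection has to close on the CloudFormation side.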
Terraform also has a slight edge when it comes to refactoring. The moved block makes renames and
moves reviewable in PRs and version-controlled in Git. In CDK, restructuring your code can cause
CloudFormation to want to replace resources instead of updating them. That can mean a lot of manual
work, especially for stateful resources like databases that can't simply be recreated. AWS made a
big step forward in 2025 with cdk refactor, but the feature is still new - and the classic
approach of retain, remove, and import is still needed for some scenarios.
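The underlying problem is the same in both tools: resources are tracked by an identifier derived from the code (a resource address in Terraform, a logical ID in CloudFormation). Rename it, and the tool sees one resource disappearing and a new one appearing. The following toy model, with hypothetical names `applyMoves` and `planAddresses`, sketches how a recorded move turns a destroy-and-create into a no-op. It only illustrates the idea and is not the real implementation of either tool.

```typescript
type State = { [address: string]: string }; // address -> physical resource ID
type Moved = { from: string; to: string };

// Re-key state entries according to the recorded moves before planning.
function applyMoves(state: State, moves: Moved[]): State {
  const next: State = {};
  for (const address of Object.keys(state)) next[address] = state[address];
  for (const move of moves) {
    if (move.from in next) {
      next[move.to] = next[move.from];
      delete next[move.from];
    }
  }
  return next;
}

// Addresses that would be created or destroyed for the declared set.
function planAddresses(
  declared: string[],
  state: State,
): { create: string[]; destroy: string[] } {
  const create = declared.filter((a) => !(a in state));
  const destroy = Object.keys(state).filter((a) => declared.indexOf(a) === -1);
  return { create, destroy };
}

const state: State = { "aws_dynamodb_table.results": "Results" };
const declared = ["aws_dynamodb_table.orders"]; // same table, renamed in code

// Without a recorded move: the table would be destroyed and recreated.
const naive = planAddresses(declared, state);

// With a recorded move: the existing table is simply tracked under its new name.
const withMove = planAddresses(
  declared,
  applyMoves(state, [
    { from: "aws_dynamodb_table.results", to: "aws_dynamodb_table.orders" },
  ]),
);
```

For a stateful resource like a table, the difference between `naive` and `withMove` is the difference between losing data and changing nothing.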
Multi-Cloud
This is probably the most common argument for Terraform. And at the same time, maybe the most overrated one.
If your organization operates across multiple clouds or supplements its cloud infrastructure with various external services, Terraform offers a real advantage: one tool, one language, one workflow for everything. You can manage and update resources across cloud boundaries together. On top of that, there's a massive provider ecosystem covering services like GitHub, PagerDuty, and Datadog. CDK can't match that breadth. Custom Resources can help, but they have their quirks and aren't suited for managing larger external infrastructure.
How relevant this is depends entirely on your setup. For SMBs that typically build on a single cloud, multi-cloud is rarely a real concern. The more common scenario there is a mix of cloud and on-premise, and Terraform usually doesn't help with that either. But if multi-cloud is genuinely relevant for you, the decision is straightforward - Terraform is an excellent choice.
When to Use What
There's no universal answer, but there are clear indicators.
Terraform makes sense when you're managing multiple cloud providers and external services and want
unified tooling for all of them. When you need to bring large amounts of existing infrastructure
under IaC - Terraform has a clear advantage here with the import block and tools like Terraformer.
Or when transparent state management and mature refactoring tooling are particularly important to
you.
CDK makes sense when you're building on AWS and want to take advantage of deep AWS integration. When your team writes TypeScript or Python and infrastructure shouldn't feel like a foreign body in the development process. When you want to keep infrastructure and application in the same project - with shared interfaces, unified deployments, and real tests. And when you prefer the abstraction and reusability that a real programming language offers.
We ended up being more productive with CDK than with Terraform. Not because Terraform is worse - in some dimensions it's clearly better. But because CDK is closer to how we work anyway: TypeScript, serverless, AWS-native. The infrastructure doesn't feel like a separate process - it feels like part of the product. Terraform's strengths carry less weight for us because we use very little outside of AWS and CDK integrates seamlessly into our development workflow. For us, that's the deciding factor.
If you're just getting started with Infrastructure as Code, our article "Infrastructure as Code with AWS CDK" is a good starting point. And if you're facing this tooling decision, feel free to get in touch - we're happy to share our experience with you.

