Why We Stopped Writing Infrastructure Code and Started Describing What We Wanted

We’ve written a lot of infrastructure as code. Hundreds of stacks, thousands of lines describing VPCs, subnets, security groups, IAM roles—CDK, Terraform, CloudFormation, the full ceremony. And it works. It’s repeatable, version-controlled, and battle-tested in production.

But last month we deployed an AI agent to EC2 using a skill we built for OpenClaw, and the results of this new approach have been better than expected.

How We Got Here

It started simply enough. We needed to deploy OpenClaw agents to EC2 across different configurations—different messaging integrations, different instance sizes, different regions. The kind of thing CDK was made for.

So we wrote the CDK. Then we wrote more CDK. A Telegram integration needed different security group rules than an MS Teams setup. Each new configuration added conditionals. Each conditional added edge cases. The stack that was supposed to save us time was becoming its own maintenance burden—and it still couldn’t handle the one thing we actually cared about: knowing the deployment worked.

That’s when we took a different approach. Instead of encoding every possible configuration into infrastructure code, we wrote a deployment skill—a structured set of instructions that an AI assistant could follow. Not a script. Not a template. A skill.

What makes this different from just prompting an AI? A skill is a package. It contains the instructions, but it also contains the files that need to be exact every time—IAM policies, compliance rules, setup scripts. The things that should never be left to non-deterministic generation are saved as artifacts in the skill itself. Security groups have a baseline configuration with guidance for how to adapt it to the use case, with strict rules to prevent vulnerabilities. The AI follows the guidance and applies the files. Consistency where it matters, flexibility everywhere else.
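To make the "strict rules" idea concrete, here is a minimal sketch in Python of a guardrail check that could run against an adapted security group before it is applied. Everything in it is an illustrative assumption rather than the actual contents of the skill: the IngressRule type, the list of forbidden ports, and the violations helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One proposed inbound rule (hypothetical shape, for illustration)."""
    port: int
    cidr: str
    description: str = ""


# Assumed strict rule: admin ports must never be reachable from anywhere.
FORBIDDEN_PUBLIC_PORTS = {22, 3389}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def violations(rules):
    """Return a human-readable problem for each rule that breaks the strict rule."""
    problems = []
    for rule in rules:
        if rule.cidr in OPEN_CIDRS and rule.port in FORBIDDEN_PUBLIC_PORTS:
            problems.append(f"port {rule.port} is open to {rule.cidr}")
    return problems


proposed = [
    IngressRule(443, "0.0.0.0/0", "inbound webhook over HTTPS"),
    IngressRule(22, "0.0.0.0/0", "SSH from anywhere"),
]
print(violations(proposed))
```

The baseline can flex (the HTTPS rule passes), while the part that must never vary is checked the same way every time.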

The Problem with Traditional IaC

Here’s an IaC deployment in practice. You need an EC2 instance running an OpenClaw agent with a Telegram bot. Before you write a single line of infrastructure code, you need to:

Read the CDK or Terraform docs for EC2 constructs
Read the OpenClaw docs for agent configuration
Read the Telegram docs for bot token setup
Figure out which VPC to use (or create one)
Configure security groups, IAM roles, key pairs
Write the user data script
Write the stack
Deploy, wait, connect, check if it actually works
It doesn’t. Debug. Redeploy. Repeat.

That’s not infrastructure-as-code. That’s infrastructure-as-homework.

CDK and Terraform are powerful tools, but they demand that you know every answer before you ask the question. The configuration is rigid. A YAML file or a TypeScript class doesn’t adapt when your VPC has an unexpected CIDR conflict or your AMI doesn’t include the right Node.js version.
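As one small illustration, a CIDR conflict is the kind of thing that can be checked at deploy time instead of being assumed away in a template. The sketch below uses Python's standard ipaddress module; the conflicting_cidrs helper and the example ranges are assumptions made up for this example.

```python
import ipaddress


def conflicting_cidrs(proposed, existing):
    """Return the existing CIDR blocks that overlap the proposed one."""
    new_net = ipaddress.ip_network(proposed)
    return [
        cidr for cidr in existing
        if new_net.overlaps(ipaddress.ip_network(cidr))
    ]


# Example ranges only; a real check would read them from the AWS account.
existing = ["10.0.0.0/16", "172.31.0.0/16"]
print(conflicting_cidrs("10.0.1.0/24", existing))
print(conflicting_cidrs("10.1.0.0/24", existing))
```

The first call reports a conflict with 10.0.0.0/16; the second finds none.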

What a Skill Does Differently

A deployment skill is a set of instructions that an AI tool follows—not a script it executes blindly. The difference matters. And because it’s just structured guidance, it works with whatever AI tool you prefer: Cursor, Claude Code, OpenClaw, or any agentic tool capable of using agent skills.

When you hand an AI assistant our deploy skill and tell it to set up an agent, here’s what actually happens:

It asks you what it needs to know. Don’t have a Telegram bot token yet? The skill walks you through creating one with BotFather, right there in the conversation. Not sure which VPC to use? It lists your VPCs and helps you pick one. No documentation rabbit holes. No context switching.

It makes suggestions. The skill knows what a reasonable security group looks like for this use case. It knows which instance types make sense. It offers sensible defaults and explains when you might want something different.

It verifies the deployment actually works. This is the part that changed our thinking. After provisioning the instance, the skill connects via SSM Session Manager and confirms the agent is running, responding, and connected. Not “the CloudFormation stack completed successfully”—the actual service is up and healthy.

It fixes what’s broken. If the verification step reveals a misconfiguration—wrong permissions, missing dependency, failed service start—the skill doesn’t just report the error. It diagnoses the issue and fixes it. Iteratively, until the deployment is confirmed working.
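The verification step can be sketched in a few lines of Python. This is a simplified stand-in, not the skill itself: it uses SSM Run Command (send_command with the AWS-RunShellScript document) rather than an interactive Session Manager session, it assumes the agent runs as a systemd unit, and the unit name "openclaw" in the usage comment is hypothetical. The ssm argument is expected to behave like boto3.client("ssm").

```python
import time


def service_is_active(ssm, instance_id, unit, attempts=10, delay=2.0,
                      sleep=time.sleep):
    """Run `systemctl is-active <unit>` on the instance via SSM Run Command.

    Returns True only if the command succeeded and printed "active".
    """
    sent = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [f"systemctl is-active {unit}"]},
    )
    command_id = sent["Command"]["CommandId"]

    for _ in range(attempts):
        # Wait before polling: the invocation may not be queryable right away.
        # A fuller version would also catch the InvocationDoesNotExist error.
        sleep(delay)
        result = ssm.get_command_invocation(
            CommandId=command_id, InstanceId=instance_id
        )
        if result["Status"] in ("Pending", "InProgress", "Delayed"):
            continue
        output = result["StandardOutputContent"].strip()
        return result["Status"] == "Success" and output == "active"
    return False  # gave up while the command was still running


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   ok = service_is_active(boto3.client("ssm"), "i-0123456789abcdef0", "openclaw")
```

A check like this answers a different question than a completed stack does: it asks the instance whether the service is running.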

No CDK stack does that. No Terraform plan does that. They tell you what they intend to create. A skill confirms what actually works.
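The verify-then-fix cycle described above has a simple shape, sketched here in Python. The converge function and its two callbacks are hypothetical names chosen for illustration; in practice the AI assistant does the diagnosing and fixing, so this only shows the control flow, with a cap on the number of attempts as an assumed safeguard.

```python
def converge(verify, fix, max_rounds=3):
    """Alternate verify() and fix() until verify() reports no problem.

    verify() returns None when healthy, otherwise a problem description.
    fix(problem) attempts a repair. Returns the list of problems fixed.
    """
    fixed = []
    for _ in range(max_rounds):
        problem = verify()
        if problem is None:
            return fixed
        fix(problem)
        fixed.append(problem)

    # One final check after the last fix attempt.
    problem = verify()
    if problem is None:
        return fixed
    raise RuntimeError(
        f"still unhealthy after {max_rounds} fix attempts: {problem}"
    )


# Toy stand-ins for a real deployment: two problems, each fixable.
pending = ["missing IAM permission", "service not started"]


def verify():
    return pending[0] if pending else None


def fix(problem):
    pending.remove(problem)


print(converge(verify, fix))
```

The loop ends on a passing check rather than on a completed command, which is the difference between intending a deployment and confirming one.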

Just-in-Time Guidance vs. Just-in-Case Documentation

Traditional IaC assumes you’ve already done the research. You’ve read the docs, understood the constraints, and encoded everything into configuration files. The knowledge lives in your head before it lives in the code.

Skills flip this. The knowledge lives in the skill, and it’s delivered just in time—when you need it, in the context where you need it. You don’t read a 40-page setup guide. You have a conversation, and the AI surfaces exactly the information relevant to your situation.

This isn’t just a convenience improvement. It’s a fundamentally different model for how infrastructure knowledge gets applied. Instead of requiring expertise upfront, skills make expertise available on demand.

Where This Excels (and Where It Doesn’t)

Let’s be honest about the trade-offs.

Skills are excellent for:

Deployments that have to adapt to what they find: different AWS accounts, different VPC setups, different integration requirements
AI agents, rapidly evolving applications, and custom integrations
Cases where “the stack completed” is not enough and you need confirmation that the service is actually running
Teams that need deployment expertise on demand rather than upfront

Traditional IaC still wins for:

Enterprise environments with strict compliance requirements
Identical deployments across hundreds of environments
Infrastructure that rarely changes and needs to be auditable line-by-line
Organizations with dedicated platform engineering teams

The sweet spot for skills is where rigidity becomes a cost. When your deployment needs to adapt to what it finds—different AWS accounts, different VPC setups, different integration requirements—a skill handles that naturally. An IaC stack handles it with conditional logic that gets uglier with every edge case.

What We Built, and Why It Matters

We built our OpenClaw deploy skill because we got tired of maintaining infrastructure code that grew more complex every time we added a configuration option. The skill contains all the same knowledge—networking, security, instance setup, integration wiring—but the AI applies only the parts that matter for what you’re actually deploying. No dead branches. No conditional spaghetti.

The skill handles EC2 provisioning with support for Telegram, Microsoft Teams, or GitLab as the communication channel—or any combination of the three. We’re using it for our own deployments, and it’s handling the edge cases and environment differences that used to mean another round of CDK debugging.

We’re not arguing that AI-guided deployment replaces all infrastructure-as-code. We’re saying that for a growing category of software (AI agents, rapidly evolving applications, custom integrations), the traditional approach creates more friction than it removes.

The infrastructure-as-code movement gave us repeatability and version control. Skills give us adaptability and verification. The best deployments will use both—rigid IaC for the foundation, adaptive skills for the parts that need to flex.

If you’re deploying AI agents or building software that moves faster than your Terraform modules can keep up with, take a look at OpenClaw.

And if you want to see the skill-based approach in action, we’d love to show you.

Ben Tripp
