Today, a variety of code generation tools are available, including low-code and no-code platforms, code completion, code refactoring utilities, and automatically generated APIs. While these tools employ different techniques and algorithms for code generation, their main goal is to speed up the development process while ensuring the code remains usable, maintainable and compatible.
In this post we try some of the most popular ‘free to use’, publicly accessible tools to see how well they generate Terraform code and transform existing infrastructure into Terraform. We look at their cost first, then compare them.
The tools we have chosen for the comparison are:
- ChatGPT (GPT-4)
- GPT Marketplace - Terraform Expert
- Claude 3.5
- Gemini
- Perplexity
- Stakpak
- Overmind
- Amazon Q
- Amazon Q Developer
- GPT-Script
There are many more AI tools out there, such as GitHub Copilot, but at $10 per month it did not meet the ‘free to use’ criterion. If you would like to see any others tested, please drop a message in our Discord.
The table below lists the tools tested and their costs at the time of writing (Thursday 17th October).
Comparison Test #1: Generating Terraform Code for EKS Clusters with Audit Logging
For our first test we will be asking our tools to create an Amazon Elastic Kubernetes Service (EKS) cluster using Terraform with audit logging enabled.
Objective: Evaluate whether our AI tools can effectively generate Terraform configurations that:
- Enable audit logging for EKS clusters.
- Create a CloudWatch log group for logging management.
Evaluation Criteria:
- Correctness: Ensures audit logging is enabled and a CloudWatch log group is created.
- Completeness: A comprehensive setup of necessary resources and configurations.
- Usability: Code structure should be readable, reusable, and maintainable.
We ask each tool a variation of the following question:
Can you write terraform code to create a EKS cluster with audit enabled?
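For reference, here is a minimal sketch of the two things the correctness criterion looks for: audit logging enabled on the cluster, and a CloudWatch log group managed in Terraform. It is not taken from any tool's output. The variable names, the default cluster name and the 30-day retention are our own assumptions, and the IAM role and subnets are assumed to exist already.

```hcl
# Hypothetical inputs: an existing IAM role and subnets are assumed.
variable "cluster_name" {
  type    = string
  default = "example-eks"
}

variable "cluster_role_arn" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

# EKS writes control plane logs to /aws/eks/<cluster-name>/cluster.
# Declaring the group here lets Terraform own its retention setting.
resource "aws_cloudwatch_log_group" "eks" {
  name              = "/aws/eks/${var.cluster_name}/cluster"
  retention_in_days = 30 # assumed value, adjust to your policy
}

resource "aws_eks_cluster" "this" {
  name     = var.cluster_name
  role_arn = var.cluster_role_arn

  # Audit logging is switched on here.
  enabled_cluster_log_types = ["audit"]

  vpc_config {
    subnet_ids = var.subnet_ids
  }

  # Create the log group first so EKS does not create an unmanaged one.
  depends_on = [aws_cloudwatch_log_group.eks]
}
```

A tool that sets enabled_cluster_log_types but omits the log group still produces a working cluster, because EKS creates the group itself. The group then sits outside Terraform with no retention configured, which is why we count the omission as a miss.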
Results
For each tool we have provided summarised pros, cons and a conclusion. If you would like to view the full code snippets, they are in a public GitHub repo linked under each tool.
ChatGPT (GPT-4)
- Pros: Provides detailed setups beneficial for learning.
- Cons: Lacks modularity and missed creating a CloudWatch log group.
- Conclusion: Falls short by missing critical elements like the log group.
View code
GPT Marketplace - Terraform Expert
Essentially a system prompt on top of GPT-4.
- Pros: Utilises Terraform AWS modules for VPC and EKS, which are well-tested and maintained by the community.
- Cons: Slightly more complex than necessary.
- Conclusion: An improvement over using GPT-4 alone.
View code
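To illustrate the module-based approach, the sketch below shows roughly what using the community EKS module looks like. It is our own minimal example, not the tool's output. The version pin, cluster name, Kubernetes version and variable names are assumptions, and the input names shown apply to the 20.x releases of the module as far as we are aware (later major versions may rename them).

```hcl
# Hypothetical inputs: an existing VPC and subnets are assumed.
variable "vpc_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0" # assumed pin; input names differ between major versions

  cluster_name    = "example-eks" # illustrative name
  cluster_version = "1.30"        # illustrative Kubernetes version

  vpc_id     = var.vpc_id
  subnet_ids = var.subnet_ids

  # Restrict control plane logging to the audit log type.
  cluster_enabled_log_types = ["audit"]
}
```

The trade-off is the one noted in the cons: the module hides a lot of resources behind a handful of inputs, which is convenient but harder to reason about than a few hand-written resources.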
Claude 3.5
- Pros: Accurately created the CloudWatch log group and enabled audit logging.
- Cons: Limited in advanced features compared to more comprehensive solutions.
- Conclusion: Suitable for quick setups with best-practice naming conventions.
View code
Gemini
- Pros: Covers the key components including audit logging and IAM roles.
- Cons: The IAM policy uses a wildcard "Resource": "*", which can pose significant security risks. It's crucial to restrict permissions to only what is necessary.
- Conclusion: Gemini offers a solid foundation for setting up an EKS cluster, but it needs improvements in security and modularity.
View code
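As a sketch of what restricting a wildcard can look like, the example below scopes a policy statement to a single log group instead of "*". The actions, names and the log_group_arn variable are hypothetical and chosen for illustration only. They are not the permissions from Gemini's output.

```hcl
# Hypothetical input: ARN of the log group the policy should apply to.
variable "log_group_arn" {
  type = string
}

data "aws_iam_policy_document" "write_logs" {
  statement {
    effect = "Allow"

    # Illustrative actions only.
    actions = [
      "logs:CreateLogStream",
      "logs:PutLogEvents",
    ]

    # Scoped to the streams of one log group rather than "*".
    resources = ["${var.log_group_arn}:*"]
  }
}

resource "aws_iam_policy" "write_logs" {
  name   = "example-write-logs" # illustrative name
  policy = data.aws_iam_policy_document.write_logs.json
}
```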
Perplexity
A ‘free-to-use’ AI-powered search engine.
- Pros: Correctly activated audit logging.
- Cons: Omitted setting up the vital CloudWatch log group.
- Conclusion: Lacks key components making it a less reliable choice.
View code
Stakpak
Stakpak is an AI-powered DevOps IDE that helps you build, maintain and self-serve software infrastructure.
- Pros: Most comprehensive with parameterised variables and proper dependency management.
- Cons: Overlooked creating a role and VPC configuration, requiring manual setup. Node scaling is also fixed: the node group scaling settings pin the size to node_count, which doesn't allow for autoscaling.
- Conclusion: Best choice for complex setups with flexible, reusable modules.
View code
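To show what the fixed-scaling point means in practice, here is a minimal sketch of a node group whose minimum, maximum and desired sizes are separate values. All variable names and defaults are our own assumptions. Setting a range does not scale anything by itself; it only leaves room for an autoscaler (for example Cluster Autoscaler) to adjust the size.

```hcl
# Hypothetical inputs: an existing cluster, node IAM role and subnets are assumed.
variable "cluster_name" {
  type = string
}

variable "node_role_arn" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "node_min" {
  type    = number
  default = 1
}

variable "node_max" {
  type    = number
  default = 5
}

variable "node_desired" {
  type    = number
  default = 2
}

resource "aws_eks_node_group" "default" {
  cluster_name    = var.cluster_name
  node_group_name = "default"
  node_role_arn   = var.node_role_arn
  subnet_ids      = var.subnet_ids

  # A range rather than one fixed node_count for all three values.
  scaling_config {
    min_size     = var.node_min
    max_size     = var.node_max
    desired_size = var.node_desired
  }

  # Stop Terraform from resetting the size an autoscaler has chosen.
  lifecycle {
    ignore_changes = [scaling_config[0].desired_size]
  }
}
```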
Overmind
Overmind Assistant is an interactive, LLM-powered chat tool that can help you troubleshoot incidents, explore applications, write documentation and generate Terraform code.
- Pros: Offers a straightforward setup that allows for customisation due to its more detailed resource configuration.
- Cons: Manual configuration doesn't benefit from the ongoing improvements and testing found in community-maintained modules.
- Conclusion: Overmind offers customisable resource configurations ideal for tailored setups but requires more maintenance due to its lack of modular abstraction.
View code
Amazon Q
Amazon Q is AWS's generative AI-powered assistant that can answer questions, provide summaries, and generate content.
- Pros: N/A
- Cons: Failed to write any code.
- Conclusion: At the time of testing, it was not able to write any Terraform code.
Amazon Q Developer
Amazon Q Developer provides real-time code suggestions, from snippets to full functions, based on your comments and existing code.
- Pros: It delivers a complete EKS setup with VPC, subnets, IAM roles, and CloudWatch logging, while employing modular design and dynamic availability zones for scalability.
- Cons: The config lacks internet gateways (IGWs) and NAT gateways, uses hardcoded values, and has limited tags and minimal IAM policy attachments.
- Conclusion: Strong in setup and logging, but the configuration needs better networking, flexible parameters, expanded IAM policies, and detailed tagging to be more adaptable.
View code
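For context on the missing networking pieces, the sketch below shows one common way to add an internet gateway and a NAT gateway to an existing VPC. It is our own illustration, not a fix applied to the tool's output. The variable names are hypothetical, and the domain argument on aws_eip assumes version 5 or later of the AWS provider.

```hcl
# Hypothetical inputs: the VPC, a public subnet and a private route table are assumed to exist.
variable "vpc_id" {
  type = string
}

variable "public_subnet_id" {
  type = string
}

variable "private_route_table_id" {
  type = string
}

# Gives public subnets a path to the internet.
resource "aws_internet_gateway" "this" {
  vpc_id = var.vpc_id
}

# Elastic IP for the NAT gateway (argument name assumes AWS provider v5+).
resource "aws_eip" "nat" {
  domain = "vpc"
}

# Lets nodes in private subnets reach the internet for image pulls and updates.
resource "aws_nat_gateway" "this" {
  allocation_id = aws_eip.nat.id
  subnet_id     = var.public_subnet_id

  depends_on = [aws_internet_gateway.this]
}

# Default route for private subnets via the NAT gateway.
resource "aws_route" "private_default" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.this.id
}
```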
GPT-Script
GPTScript is a framework enabling Large Language Models to interact with local or remote systems, including executables, applications with OpenAPI schemas, SDKs, and RAG-based solutions, with minimal prompt requirements.
- Pros: Offers a streamlined EKS setup with necessary logging, IAM roles, an extra policy for broader operations, and flexible subnet calculations using cidrsubnet.
- Cons: Lacks detailed subnet tagging, modular VPC setups, and parameterisation, which can limit flexibility across environments.
- Conclusion: GPTScript provides an efficient EKS setup for quick deployments but may need customisation for complex environments and network adaptability. Its output differs from plain GPT-4, which suggests a system prompt is in place to aid results.
View code
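For readers unfamiliar with cidrsubnet, here is a small standalone sketch of how it derives subnet ranges from a VPC range. The variable name, the CIDR block and the choice of three subnets are our own assumptions, not values from the generated code.

```hcl
# Hypothetical VPC range.
variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16"
}

locals {
  # Add 8 bits to the /16 prefix to get /24 subnets, numbered 0, 1 and 2.
  subnet_cidrs = [for i in range(3) : cidrsubnet(var.vpc_cidr, 8, i)]
}

output "subnet_cidrs" {
  # Evaluates to ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  value = local.subnet_cidrs
}
```

Because the subnets are computed from a single input, changing the VPC range updates every subnet without editing each block by hand.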
Comparison Test #2: Transform unmanaged AWS config into Terraform code
In this test we compare our tools' generation capabilities by transforming existing configuration into Terraform code. The reason for this test is that you may not always be looking to provision new infrastructure; you may instead want to take what you already have and bring it under Terraform management. Tools like firefly.ai offer a fully-fledged platform for exactly this and would suit much larger transformational projects. The aim of this test is to look at tools that are freely available, or available at a low cost.
There are also free ‘non-AI’ tools out there, such as Terraformer by Google, which can help you, but for the scope of this blog we are only focusing on AI ones.
Objective: To evaluate how effectively AI tools can assist in generating Terraform code from unmanaged AWS infrastructure, ensuring all resources are identified and correctly configured.
Evaluation Criteria:
- Discovery Completeness: Ability to identify all existing AWS resources within an account.
- Accuracy: Correctly translating configurations, including inter-resource dependencies.
- Efficiency: Providing neatly organised and human-readable output.
- Customisability: Facilitating future adaptations through modular and reusable code.
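Whichever tool writes the code, bringing an existing resource under Terraform management also means binding the generated block to the real resource. A minimal sketch of that step, using an import block (available in Terraform 1.5 and later), is shown here. The bucket name and resource address are hypothetical.

```hcl
# Bind an existing, unmanaged bucket to a resource address.
# "example-unmanaged-bucket" is a made-up name for illustration.
import {
  to = aws_s3_bucket.legacy
  id = "example-unmanaged-bucket"
}

# The generated (or hand-written) configuration for that bucket.
resource "aws_s3_bucket" "legacy" {
  bucket = "example-unmanaged-bucket"
}
```

Running terraform plan with this in place should report an import rather than a create. If the resource block is left out, terraform plan -generate-config-out=generated.tf can draft it for you, which is a useful baseline to compare AI-generated code against.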
Testing Method
For this test, we prompted each tool with a similar set of AWS configurations, asking for a complete Terraform translation. Each tool was evaluated on its output quality and resource management efficiency.
For this test we had to rule out the following tools as they do not offer the ability to discover AWS resources and therefore will only be able to give general advice rather than specific results:
- ChatGPT (GPT-4)
- GPT Marketplace - Terraform Expert
- Claude 3.5
- Gemini
- Perplexity
- Stakpak
- Amazon Q Developer
- GitHub Copilot
- GPT-Script
This left the following tools to test:
- Overmind
- Amazon Q
Amazon Q
- Pros: N/A
- Cons: Failed to write any code.
- Conclusion: Amazon Q can access your AWS infrastructure, and therefore in theory has access to the data required to transform it into Terraform. However, it tells us that it can't, even though it definitely can. We suspect the model is misunderstanding the question, or its safety features are being overzealous.
Overmind
- Pros: Can discover and identify all existing AWS resources within an account, mapping these out in a graph. This means that it does a great job of including dependencies that exist in your AWS, even if not prompted.
- Cons: Currently only available in their in-browser app, so the output has to be copied into your IDE.
- Conclusion: Being the only tool out of the list that can take any existing AWS infrastructure and convert that to Terraform (in this case using GPT-4o) means that it comes out on top for this test.
View code
We hope you found this helpful in deciding which tool to use. We aim to maintain this comparison and add new tools as the space continues to evolve. If you have other tools you're curious about or tips you'd like to share, please reach out on our Discord or drop us a PR on the GitHub repo:
https://github.com/jameslaneovermind/ai-tools-for-devops-comparisons