How to Deploy a Generative AI Application with Terraform: A Quick Guide

Muhammad Usama Khan
4 min read · Aug 30, 2024


Deploying Generative AI Applications with Terraform

Introduction

In today’s tech landscape, Artificial Intelligence (AI) and Machine Learning (ML) have become increasingly relevant, especially with the rise of Generative AI applications. These apps use neural networks to generate new content, such as images, music, or text, based on existing data. As a result, businesses are eager to integrate these capabilities into their systems.

However, deploying Generative AI applications can be complex and time-consuming, requiring a solid understanding of cloud infrastructure, DevOps practices, and AI frameworks. That’s where Terraform comes in — an Infrastructure as Code (IaC) tool that simplifies the process of provisioning and managing cloud resources.

In this article, we’ll explore the process of deploying a Generative AI application using Terraform. We’ll cover everything from setting up your environment to configuring your AI model and deploying it to production.

Prerequisites

Before we begin, ensure you have the following tools installed:

  • Terraform (version 1.2 or higher)
  • AWS CLI (version 2 or higher)
  • An AWS account with the necessary permissions
  • A code editor or IDE of your choice

Verify that Terraform is installed correctly by running the following command:

$ terraform --version

Configuring Your AI Model

Next, let's configure our model using a popular framework such as TensorFlow or PyTorch. For this example, we'll use TensorFlow and define a small Keras network as a stand-in for a full generative model; the deployment steps are the same either way.

First, install TensorFlow using pip:

$ pip3 install tensorflow

Now, create a new Python script that defines your AI model:

import tensorflow as tf

# Define the model architecture
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10)  # Raw logits; the loss applies softmax via from_logits=True
])

# Compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

# Save the model so it can be uploaded to S3 as model.h5
model.save('model.h5')
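As a quick aside, the loss above is sparse categorical cross-entropy computed over class probabilities. A minimal pure-Python sketch of what softmax followed by this loss computes for a single example (illustrative only, not TensorFlow's actual implementation):

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_cross_entropy(logits, true_class):
    # Negative log-probability assigned to the true class
    probs = softmax(logits)
    return -math.log(probs[true_class])

# A 3-class example: the model is most confident in class 2
logits = [0.5, 1.0, 3.0]
loss = sparse_categorical_cross_entropy(logits, true_class=2)
```

Passing raw logits to the loss (via from_logits=True) is numerically more stable than applying softmax in the final layer and computing the log separately.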

Deploying Your AI Model with Terraform

Now that we have our AI model configured, let’s deploy it using Terraform. First, create a new directory for your project and navigate to it in your terminal:

$ mkdir generative-ai-deployment
$ cd generative-ai-deployment

Next, create a new file called main.tf that defines the infrastructure resources needed for deployment:

# main.tf

# Define the AWS provider
provider "aws" {
  region = "us-west-2"
}

# Create an S3 bucket to store our model files
resource "aws_s3_bucket" "model_bucket" {
  bucket = "generative-ai-models"
}

# Bucket policy allowing public read access to the model objects.
# In AWS provider v4+, the policy is attached as a separate resource
# rather than an inline argument, which also avoids a self-reference
# inside the bucket block. Public access is for demo purposes only;
# account-level Block Public Access settings may prevent it from applying.
resource "aws_s3_bucket_policy" "model_bucket_policy" {
  bucket = aws_s3_bucket.model_bucket.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "PublicReadGetObject"
        Effect    = "Allow"
        Principal = "*"
        Action    = "s3:GetObject"
        Resource  = "${aws_s3_bucket.model_bucket.arn}/*"
      }
    ]
  })
}

# Security group to allow SSH access
resource "aws_security_group" "instance_sg" {
  name        = "allow_ssh"
  description = "Allow SSH inbound traffic"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Create an EC2 instance to run our model
resource "aws_instance" "model_instance" {
  ami           = "ami-abc123" # Replace with a valid AMI ID for your region
  instance_type = "t2.micro"

  # Specify the SSH key for connecting to the instance
  key_name = "my-key-pair" # Replace with your key pair name

  # Security group to allow SSH access
  vpc_security_group_ids = [aws_security_group.instance_sg.id]

  # User data to install necessary dependencies
  user_data = <<-EOF
    #!/bin/bash
    sudo yum update -y
    sudo yum install -y python3-pip
  EOF

  # Copy the model file from S3 to the instance. A provisioner must live
  # inside a resource block, and the instance needs AWS credentials (for
  # example, an IAM instance profile) for the "aws s3 cp" call to succeed.
  provisioner "remote-exec" {
    connection {
      type        = "ssh"
      host        = self.public_ip
      user        = "ec2-user"
      private_key = file("~/.ssh/my-private-key") # Replace with your private key path
    }

    inline = [
      "mkdir -p /home/ec2-user/model_files",
      "aws s3 cp s3://${aws_s3_bucket.model_bucket.bucket}/model.h5 /home/ec2-user/model_files/"
    ]
  }
}
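For reference, the jsonencode call in the bucket policy renders a standard IAM policy document. Here is the same document built in Python with the standard library, with the bucket ARN written out explicitly as an illustration:

```python
import json

# The IAM policy document that jsonencode produces in main.tf,
# with the bucket ARN hard-coded for illustration
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::generative-ai-models/*",
        }
    ],
}

print(json.dumps(policy, indent=2))
```

Inspecting the rendered JSON like this is a handy sanity check before letting Terraform submit it to AWS.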

Initialize and Apply Terraform Configuration

Now that you’ve defined your infrastructure, you need to initialize and apply the Terraform configuration.

Initialize Terraform:

Run terraform init to initialize the project. This will download the necessary provider plugins.

$ terraform init

Apply the Terraform Configuration:

Run terraform apply to create the resources defined in main.tf. Terraform will show you a plan of the changes it will make. Review the plan and type yes to confirm.

$ terraform apply

This command will create the S3 bucket, EC2 instance, and other necessary resources. It will also execute the remote-exec provisioner to copy your AI model file from the S3 bucket to the EC2 instance.
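Note that main.tf as written doesn't declare any outputs, so if you want Terraform to print the instance's public IP after apply (used for SSH in the next step), you can add an output block like this sketch:

output "instance_public_ip" {
  description = "Public IP of the model EC2 instance"
  value       = aws_instance.model_instance.public_ip
}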

Verify the Deployment

Once Terraform has finished applying the configuration, you can verify the deployment:

Check S3 Bucket: Ensure your model file has been uploaded to the S3 bucket (for example, with aws s3 cp model.h5 s3://generative-ai-models/).

SSH into the EC2 Instance:

You can SSH into the EC2 instance using the public IP address that Terraform outputs after the apply command.

$ ssh -i ~/.ssh/my-private-key ec2-user@<EC2_PUBLIC_IP>

Check the Model File:

Once logged in, navigate to the /home/ec2-user/model_files/ directory and ensure that the model.h5 file has been copied successfully.

Clean Up

When you’re done with the deployment and testing, you can destroy the infrastructure to avoid unnecessary charges:

$ terraform destroy

This command will remove all the resources that were created by Terraform.

Conclusion

In this article, we’ve covered the process of deploying a Generative AI application using Terraform. We’ve set up our environment, configured our AI model, and deployed it using Terraform.

By following these steps, you can deploy your own Generative AI applications with ease. Remember to always follow best practices for security and scalability when deploying AI models in production.

Note: The code provided in this guide is intended for reference purposes only and is not production-ready. Depending on your specific use case, additional configurations, security measures, and optimizations may be required. Make sure to review and adjust the code as necessary to fit your deployment needs before using it in a production environment.

