Back to Blog
Kubernetes8 min read

Demo Series: EKS TF GHA GitOps — Phase 1: Network Setup

Phase 1 of our EKS demo series covering the foundational network setup including VPC, subnets, security groups, and networking prerequisites for a production-grade Kubernetes cluster.

Blog image
Blog image

Table of Contents

Overview

In this blog post, will be implementing the terraform workflow with GitHub Actions to provision the infrastructure resources in AWS Cloud Platform

Getting Started

To get started , there are some pre-requisites to deal with such as

**AWS User Account**

For terraform to provision / read any data / resources on AWS. It needs to have access which is done via IAM User and Role.

Connect to AWS account via management,- Create an IAM user : ([https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html)), enabled programmatic access

  • Attach an IAM Role : ([https://docs.aws.amazon.com/directoryservice/latest/admin-guide/assign_role.html](https://docs.aws.amazon.com/directoryservice/latest/admin-guide/assign_role.html)) that has access to create, modify and delete AWS Resources
  • Create an access id and secret key : ([https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html))
  • **Cloudflare API Token**

    Optional : Domain can be purchased in AWS itself. As I had my domain in Cloudflare, I am using the Cloudflare terarform provider to associate the AWS Hosted Zone NS records with Cloudflare recordsets. To generate an API token on Cloudflare

  • Login to Cloudflare :  [https://dash.cloudflare.com/](https://dash.cloudflare.com/)
  • Click on the domain (in my case jumisa.io) from home page, navigate to Overview page of the domain
  • Click on the link Get your API Token placed below Account ID
  • Blog image
    Blog image
  • Then API Tokens page will be displayed, click on Create Token under User API Tokens
  • Blog image
    Blog image
  • Then there will be list of templates displayed, choose the Edit zone DNS and click on Use Template
  • Blog image
    Blog image
  • Once the template is opened check the permission to be set as Edit for the Zone DNS and Specific zone as jumisa.io (your domain in this case) to be chosen under Zone Resource
  • Leave the other as default, click on Continue to summary and in the proceeding page click on Create Token
  • Blog image
    Blog image
  • As the result, token is generated and displayed like the below. It is one-time copy, please make sure copy the token and store it in your vault. Also use the below curl command to test the token generated
  • Blog image
    Blog image
  • Finally, when the terraform apply is executed from GitHub Actions , the nameservers for the subdomain created in AWS Route53 will be mapped in Cloudflare domain DNS mapping like below
  • Blog image
    Blog image

    Setup GitHub Secrets

    Terraform needs

  • AWS Credentials for working on provisioning resources , reading metadata of any AWS services.
  • Cloudflare API Token to create/map AWS Route53 subdomain nameserver to Cloudlfare
  • So , in the previous steps we have got 3 secrets ie, AWS Access Key ID, Secret Key and Cloudflare Token. These secrets to be added to GitHub repository which will be used in our workflows to get terraform authenticated to AWS and Cloudflare.

    GitHub Secrets are managed in 3 levels such as Environment, Repository and Organisation. In our demo, we are going with Repository Secrets.

  • Click on the Settings tab from GitHub
  • On the left panel under Secret and Variables, click on Actions and then click on New repository secret
  • Blog image
    Blog image
  • Then add the secrets as shown below. Once the secrets are added , one cannot view it only able to update it from the screen
  • Blog image
    Blog image
  • Make sure to create all the 3 SecretsAWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • CLOUDFLARE_API_TOKEN
  • Branching Strategy

    At this stage, important thing to work on how we are going to structure our branches for the GitHub Actions workflows. Following a simple branching strategy ie

  • One branch by environment
  • One branch by feature
  • Branch by Environment :

    Create a branch for infrastructure environment , ie dev, qa, staging, production, etc. These environments are isolated in AWS. In this project, environment is chosen as demo.

    So, branch named demo is created , whenever there is a PullRequest to this branch, infrastructure changes (create,update,destroy) that will be carried out AWS.

    Branch by Feature:

    Create a branch by feature, ie for any specific need/requirement create branch prefixed with feature/. Whenever there is a Push on these feature branches then code scans and infrastructure plan will be generated. For example, in this phase we are building the AWS network resources hence naming the feature branch as feature/aws-network.

    Code on Terraform

    Define Providers & Terraform Block Terraform block to define the providers , State backend.

    Configuration of the provider: - Providers used in this project, which are installed during terraform when they are declared in terraform block under required_providers

  • AWS Provider : hashicorp/aws provider renders all the blocks, methods that are necessary to construct / read resources in AWS Cloud Provider
  • Cloudflare Provider : cloudflare/cloudflare provider used to add the AWS Hosted Zone's NameServer to Cloudflare Domain record set
  • Configuration of the Backend:- s3 is configured as backend inorder to store the terraform state file in a bucket

  • dynamodb_table to hold the lock for the terraform that controls the collision of simultanous execution of terraform command by multiple source (users/pipelines)
  • Setup Remote State

    Create S3 bucket with versioning enabled

    aws s3api create-bucket --bucket > --region us-east-1

    aws s3api put-bucket-versioning --bucket > --versioning-configuration Status=Enabled

    Create DynamoDB table

    aws dynamodb create-table

    --table-name >

    --attribute-definitions AttributeName=LockID,AttributeType=S

    --key-schema AttributeName=LockID,KeyType=HASH

    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

    Define Providers and Remote State

    terraform

    cloudflare =

    required_version = ">=1.2.0"

    backend "s3"

    provider "aws"

    provider "cloudflare"

    Terraform Folder Structure

    ├── README.md└── infrastructure    ├── .auto.tfvars    ├── acm.tf    ├── outputs.tf    ├── route53.tf    ├── terraform.tf    ├── variables.tf    └── vpc.tfCreate .auto.tfvars file - terraform plan, apply and destroy uses this file .auto.tfvars by default to execute if there is no values for the variables are defined under variables.tf

  • As there is only one vpc and shared resources are in this project , workspace concept is not implemented
  • region = "us-east-1"

    r53record =

    name = "eks-demo"

    vpc =

    tags =

    cloudflare_api_token = "" Value declared in github secrets

    Build AWS Network Resources

    In this section we define the AWS Network resources that are to be provisioned in AWS account. Including SSL Ceritficate, Route53 Records sets and mapping nameservers in Cloudflare.

    This block provisions AWS VPC, Subnets, Route Table, Routes, NAT Gateway, Internet Gateway, Default Security Group, etc.

    Also SSL Certificate and Route53 Public Hosted Zone for *.environment.basedomain. Maps the nameservers of the *.environment.basedomain to Cloudflare basedomain

    AWS Network Resources

  • Network resources are defined under infrastructure/vpc.tf file.
  • Variable values are declared under vpc variable in  infrastructure/.auto.tfvars file.
  • Using the source module from terraform registry terraform-aws-modules/vpc/aws.
  • Used 2 Availability Zones in N. Virginia region
  • Enabling public, private and database subnets for the network. (database subnets will be used later in the project when real world 3-tier application is deployed).
  • Enabled only one NAT Gateway by disabling NAT Gateway by Availability Zone.
  • For High Availability NAT Gateway remove one_nat_gateway_per_az from .auto.tfvars and vpc.tf files
  • Create VPC Network for EKS Infrastructure

    This module provisions VPC network that includes

    VPC , Subnets, Routes, Route Tables

    Internet Gateway, NAT Gateway

    Subnets - Public, Private and Database

    module "vpc"

    Route53 & Cloudflare Mapping

  • Route53 Public Hosted Zone and Cloudflare mapping are defined in infrstructure/route53.tf file
  • Variable values are declared under r53record variable in  infrastructure/.auto.tfvars file.
  • Using the source module from terraform registry terraform-aws-modules/route53/aws//modules/zones.
  • Using module defaults as all the default values are going to provision Public Hosted Zone. In case of Private Hosted Zone and VPC Mapping , have to mention it explicitly. Will be covering it later in the project once database module is provisioned.
  • As the Hosted Zone is going to generate 4 nameserver (NS) records, declared the count as 4 in terraform resource cloudflare_record.
  • This resource is going to read the NS records from the output list of module Hosted Zone and create the mapping records respectively on Cloudflare portal
  • Hosted Zone

    Public Hosted zone for the

    sub domain $ and

    base domain $

    Allows the inbound from VPC private subnets

    locals .$"

    module "zones" " = (demo)"

    tags = var.tags

    tags = var.tags

    adding NS to cloudflare

    Base Domain is on CloudFlare

    This resource is going to attach the AWS NS

    entried of Hosted Zone to CloudFlare

    data "cloudflare_zone" "main"

    resource "cloudflare_record" "main"

    SSL Certificate

  • SSL Certificate creation using AWS Certificate Manager is defined in infrstructure/acm.tf file
  • Variable values are declared under r53record variable in  infrastructure/.auto.tfvars file.
  • Using the source module from terraform registry terraform-aws-modules/acm/aws.
  • Certificate will be created for environment.basedomain and alternate domain *.environment.basedomain here environment is subdomain
  • Certification validation method is DNS ie there will be validate record created under the Route53 Hosted Zone.
  • Note: DNS Validation will fail if the Cloudflare NS mapping is either not completed or not success
  • SSL Cert for domain

    Signed by AWS

    Validation method: Route53 DNS

    DomainName: "$.$"

    SAN: "*.$.$"

    module "acm" "

    ]

    wait_for_validation = true

    tags = merge(" },var.tags)

    Create GitHub Workflows

    GitHub Actions : Workflow Structure

    To enable the GitHub Action on any repository, Yaml file must be created under .github/workflows.

    In our project we have multiple yaml files which does code scans and terraform operations.

    .github

    └── workflows

    ├── checkov-check.yml

    ├── plan-deploy-infra.yml

    ├── destroy-infra.yml

    ├── terraform-apply.yml

    ├── terraform-destroy.yml

    ├── terraform-plan.yml

    ├── test-infracode.yml

    ├── tflint-check.yml

    └── tfsec-check.yml

    We have used the branching strategy (mentioned above in this article) with git operations such as Push and Pull Request to define our workflows.

  • Push Request - Whenever there is any push operation to any branch in the repository other than feature/ the workflow defined under test-infracode.yml will be triggered.Push to feature/ branch will trigger the workflow defined in plan-deploy-infra.yml . Runs all the jobs except deploy-infra
  • Pull Request - PR to demo branch from any feature/ branch will trigger all the jobs mentioned under plan-deploy-infra.yml**
  • Code Scans

    Tools such as TFLint, TFSec and Checkov are used to scan Terraform code in this project.

    TFLint

  • Linter tool used to check the deprecated syntax, unused declarations, proper naming conventions, best practises and possible provider specific errors such as wrong instance types, resource classes, etc.
  • Definition for the linter are under .github/workflows/tflint-check.yml file
  • Linter will pass to next steps in the workflow if there is only warning. Because we had set the condition of failure of the step only if it finds any error in the terraform code --minimum-failure-severity=error
  • name: TflintCheckTerraformCode

    on: [workflow_call]

    jobs:

    tflint-checks:

    name: Tflint Checks on Terraform Code

    defaults:

    run:

    shell: bash

    runs-on: ubuntu-latest

    steps:

    Checkout Repository

  • name : Check out Git Repository
  • Print TFLint version

  • name: Show version
  • run: tflint --version

    Install plugins

  • name: Init TFLint
  • run: tflint --init

    Run tflint command in each directory recursively

  • name: Run TFLint
  • run: tflint -f compact --recursive --minimum-failure-severity=error

    TFSec

  • Static Analysis tool used to detect any security issues, misconfigurations based on its built-in rules.
  • For example, it detects configurations such as any Database resources open to public, Security Groups of instance open to public
  • Definition for the linter are under .github/workflows/tfsec-check.yml file
  • TFSec Severity level are CRITICAL, HIGH, MEDIUM, LOW
  • This scan reports skip/ignores all other than CRITICAL severity level ie, steps fails when there is any CRITICAL findings
  • In our case we are skipping 2 rules --exclude aws-ec2-no-excessive-port-access,aws-ec2-no-public-ingress-acl because as we are using terraform registry module for VPC, which can template for public security group and all port open in network ACL
  • name: TfsecCheckTerraformCode

    on: [workflow_call]

    jobs:

    tfsec-checks:

    name: Tfsec Checks on Terraform Code

    defaults:

    run:

    shell: bash

    runs-on: ubuntu-latest

    steps:

    Checkout Repository

  • name : Check out Git Repository
  • uses: actions/checkout}"

    with:

    approvers: prabhu-jumisa

    minimum-approvals: 1

    issue-title: "Deploying Terraform to Jumisa account"

    issue-body: "Please approve or deny this deployment "

    exclude-workflow-initiator-as-approver: false

    additional-approved-words: ''

    additional-denied-words: ''

    Terraform Destroy : Manual Workflow

  • Manual GitHub Workflow
  • Requires a branch and initiator name a the input
  • Workflow defined under terraform-destroy.yml file
  • Run the terraform plan -destroy
  • Creates an issue for manual approval (same as terraform apply)
  • Approver gets an email, when the approver approves the issue by commenting on it, the terraform destroy will be executed.
  • Execute GitHub Workflows

    Push Request

  • When there is Push to feature branch
  • Blog image
    Blog image
  • Workflow Triggered, executed scans and terraform plan
  • Blog image
    Blog image
  • Terraform Plan on Job Summary
  • Blog image
    Blog image

    Pull Request

  • When a PR from feature branch to demo branch
  • Blog image
    Blog image
  • Workflow triggered
  • Blog image
    Blog image
  • As it is PR , sends an email to Approver, Creates an Issue and comments the terraform plan to PR
  • Blog image
    Blog image
  • Terraform plan as Comment on PR
  • Blog image
    Blog image
  • GitHub Issue Created for review and approval
  • Blog image
    Blog image
  •  Manual approval by adding comment as approve
  • Blog image
    Blog image
  • Workflow resumes to execute
  • Blog image
    Blog image
  • Once workflow completed
  • Merge the Pull Request to demo branch
  • Verify AWS Resources

    Connect to AWS Account and cross verify the resources creation

    Blog image
    Blog image

    CleanUp

    Terraform Destroy

  • Connect to GitHub repository and navigate to Actions
  • Choose the Destroy Infrastructure workflow
  • Click on Run Workflow drop-down on the right
  • Choose the branch and mention the deployer name
  • Click on Run workflow
  • Blog image
    Blog image
  • This workflow follow the similar process as Pull Request workflow
  • An email sent to Approver, An issue will be creates
  • Once the approver approves the issue by commenting approve on the issue
  • This workflow will run terraform destroy and cleans up the infrastructure resources created and verified above
  • Next Steps

    Thats All for the Phase 1. Hope this article explains the clarity on base setup.

    Source code on [GitHub Repo](https://github.com/jumisa/terraform-eks-gitops)

    Thanks for your attention and patience to read this article. Stay tuned for the Phases implementation on the upcoming posts.

    Happing reading !!! Let us learn together !!!