RakeCloudspin

This library of Rake tasks is a prototype for an infrastructure project build framework. It is intended as a basis for exploring project structures, conventions, and functionality, but is not currently in a stable state. Feel free to copy and use it, but be prepared to extend and modify it in order to make it usable, and be aware that there isn't likely to be a clean path to upgrade your projects as this thing evolves.

What's the point of this?

Currently, most people and teams managing infrastructure with tools such as Terraform, CloudFormation, etc. define their own project structures, and write their own wrapper scripts to run that tool and associated tasks. Essentially, each project is a unique snowflake.

The goal for cloudspin is to evolve a common structure and build tooling for infrastructure projects, focused on the lifecycle of "stacks" - infrastructure elements provisioned on dynamic infrastructure such as IaaS clouds.

Our hypothesis is that, with a common project structure and tooling:

Teams will spend less time building and maintaining snowflake build systems,
New team members can more quickly get up to speed when joining an infrastructure project,
People can create and share tools and scripts that work with the common structure, creating an ecosystem,
People can create and share infrastructure code for running various software and services, creating a community library.

Philosophy

Convention over configuration
  - The tool should discover elements of the project based on folder structure
  - A given configuration value should be set in a single place
  - Implies a highly "opinionated" approach
Encourage good agile engineering practices for the infrastructure code
  - Writing and running tests should be a natural thing
  - Building and using infrastructure pipelines should be a natural thing
Support evolutionary architecture
  - Loose coupling of infrastructure elements
Empower developers / users of infrastructure

Structure of a project

Cloudspin is used to manage Terraform projects for AWS infrastructure, and is built on Ruby's Rake. There are some example projects; simple-stack is the simplest of them.

Each cloudspin project represents a Component. A Component is a collection of stacks (as defined above) that together provide a useful service of some sort. Each instance of a service provisioned in the cloud is a Deployment. You may have Deployments for environments, e.g. a QA deployment, Staging deployment, Production deployment, etc. You might also have multiple production deployments; for example, you might provision a deployment for each of your customers.

Project structure

Your component project should have the following basic structure:

COMPONENT-ROOT
  |-- deployment/
  |-- delivery/
  |-- component.yaml
  |-- component-local.yaml
  |-- Rakefile
  └-- go*

Deployment stacks

The COMPONENT-ROOT/deployment/ folder has a subfolder for each stack that is provisioned for a deployment of the component.

COMPONENT-ROOT
  └-- deployment/
      |-- networking/
      |-- cluster/
      └-- database/

In this example, we have one stack for networking (VPC, subnets, etc.), one for a cluster (ECS cluster), and a third for a database (RDS instance).
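
As a rough illustration of the "discover elements from folder structure" idea, here is a minimal Ruby sketch of how stacks could be found under a component root. It is not the actual rake_cloudspin implementation; the discover_stacks helper is hypothetical, and it assumes a folder counts as a stack only when it contains a stack.yaml file (see the stack configuration section).

```ruby
require 'pathname'

# Hypothetical helper, not part of rake_cloudspin.
# Returns a hash such as { 'deployment' => ['cluster', 'networking'], 'delivery' => [] },
# listing only the subfolders that contain a stack.yaml file.
def discover_stacks(component_root)
  root = Pathname.new(component_root)
  %w[deployment delivery].map do |stack_type|
    type_dir = root.join(stack_type)
    names =
      if type_dir.directory?
        type_dir.children
                .select { |dir| dir.directory? && dir.join('stack.yaml').file? }
                .map { |dir| dir.basename.to_s }
                .sort
      else
        [] # a missing deployment/ or delivery/ folder simply yields no stacks
      end
    [stack_type, names]
  end.to_h
end
```

Calling discover_stacks('.') from the component root would list the stack names per stack type; folders without a stack.yaml are skipped.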

Stack folders

Each stack has the following structure:

deployment/
└── networking/
    ├── stack.yaml
    ├── infra/
    │   ├── backend.tf
    │   ├── bastion.tf
    │   ├── dns.tf
    │   ├── outputs.tf
    │   ├── subnets.tf
    │   ├── variables.tf
    │   └── vpc.tf
    └── tests/
        └── inspec/
            ├── controls/
            │   ├── bastion.rb
            │   ├── subnets.rb
            │   └── vpc.rb
            └── inspec.yml

See below for details on the stack.yaml file.

Delivery stacks

The COMPONENT-ROOT/delivery/ folder can have a number of subfolders, each representing a stack that provisions things needed for delivery. Each of these is typically provisioned only once per component. Examples include pipeline definitions, and artefact repository configurations.

delivery/
└── aws-pipeline
    ├── infra
    │   ├── artefact_bucket.tf
    │   ├── backend.tf
    │   ├── outputs.tf
    │   ├── packaging_codebuild_stage.tf
    │   ├── pipeline.tf
    │   ├── prodapply_codebuild_stage.tf
    │   ├── testapply_codebuild_stage.tf
    │   └── variables.tf
    └── stack.yaml

Setting up a cloudspin project

These are the steps to set up a new cloudspin infrastructure project:

Import the rake_cloudspin gem
Create the component configuration
Create one or more deployment and delivery stacks

Adding the rake_cloudspin gem to your project

Install the gem

Add this line to your application's Gemfile:

gem 'rake_cloudspin', :git => 'https://github.com/cloudspinners/rake_cloudspin.git'

(TODO: Publish releases of this gem properly)

And then execute:

$ bundle install

Or install it yourself as:

$ gem install rake_cloudspin

Import the library into your Rakefile

Here is an example Rakefile:

require 'rake/clean'
require 'rake_cloudspin'

CLEAN.include('build')
CLEAN.include('work')
CLEAN.include('dist')
CLOBBER.include('vendor')

task :default => [ :plan ]

RakeCloudspin.define_tasks

Many of our example projects use a go script as a wrapper to run rake. This makes sure prerequisites are installed, including the gems. See the example go script in the spin-simple-stack project.

Component configuration (component.yaml)

There are two files used to configure your Component, both of which live at the root of the project, alongside the Rakefile.

component.yaml has the default configuration options, and is intended to be checked into source control with the rest of your project.
component-local.yaml allows you to override configuration options when you run cloudspin locally. This is intended to be excluded from source control, so each person who works on the project can have their own custom options.
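
The relationship between the two files can be pictured with a small Ruby sketch. This is an illustration only: the load_component_config and read_yaml helpers are hypothetical, and the sketch assumes a simple shallow merge in which values from component-local.yaml win. The real cloudspin lookup goes through hiera and may behave differently.

```ruby
require 'yaml'

# Hypothetical helper, not part of rake_cloudspin.
# Loads component.yaml, then overlays component-local.yaml if it exists.
def load_component_config(component_root)
  defaults = read_yaml(File.join(component_root, 'component.yaml'))
  local = read_yaml(File.join(component_root, 'component-local.yaml'))
  defaults.merge(local) # assumption: local values override defaults, key by key
end

# Returns an empty hash for a missing or empty file.
def read_yaml(path)
  return {} unless File.exist?(path)

  YAML.safe_load(File.read(path)) || {}
end
```

With this approach, a developer who puts a different region in component-local.yaml gets that region locally, while everyone else keeps the checked-in default.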

Here is an example of a component.yaml file, from the simpleweb project.

---
estate: cloudspin
component: simple
region: eu-west-1

Some of these configuration variables are used for naming things, others are for configuring infrastructure.

estate is an identifier that runs across all components, all deployments. It may be the name of the organisation, division, etc.
component is the name of this component.
region is the default region for deploying stacks.

Other variables are used to configure infrastructure, generally passed to Terraform code. The specific variables that are available in your component configuration will depend on your own project code. They will tend to be driven by the stack.yaml files for the deployment and delivery stacks in your project.

Stack configuration (stack.yaml)

Each stack in deployment/* and delivery/* must have a stack.yaml file in its root. Otherwise, the cloudspin build won't recognize the stack.

Here's another example from simpleweb.

---
vars:
  region: "%{hiera('region')}"
  component: "%{hiera('component')}"
  deployment_identifier: "%{hiera('deployment_identifier')}"
  estate: "%{hiera('estate')}"
  service: "%{hiera('service')}"
  base_dns_domain: "%{hiera('domain_name')}"

  webserver_ssh_public_key_path: "../ssh_keys/webserver_ssh_key.pub"
  bastion_ssh_public_key_path: "../ssh_keys/bastion_ssh_key.pub"
  allowed_cidr: "%{hiera('my_ip')}/32"

ssh_keys:
  - webserver_ssh_key
  - bastion_ssh_key

state:
  type: s3
  scope: deployment

Terraform variables

The vars: section of the stack.yaml file defines variables that are passed to Terraform. See the Terraform configuration documentation for how these are used. Cloudspin passes the variables defined in the stack.yaml file to the terraform command on the command line.
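
To make "passed on the command line" concrete, here is a minimal Ruby sketch that turns a hash of resolved vars into an argument list using Terraform's -var option. The terraform_argv helper is hypothetical; how rake_cloudspin and rake_terraform actually assemble the command is not shown in this document and may differ.

```ruby
# Hypothetical helper, not part of rake_cloudspin.
# Builds an argument vector such as:
#   ['terraform', 'plan', '-var', 'region=eu-west-1', ...]
# An array (rather than one string) can be handed to system(*argv),
# which avoids shell quoting problems in the values.
def terraform_argv(action, vars)
  var_args = vars.flat_map { |name, value| ['-var', "#{name}=#{value}"] }
  ['terraform', action, *var_args]
end
```

Running system(*terraform_argv('plan', vars)) would then invoke Terraform with one -var pair per entry in the vars: section.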

The values in the configuration file can include values from component variables or other variables set by cloudspin. Cloudspin uses hiera to do this, so the syntax is:

"%{hiera('VARIABLE_NAME')}"

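The effect of this syntax can be sketched in a few lines of Ruby. This is only an approximation for illustration: it does not use the hiera library, the interpolate and resolve_vars helpers are hypothetical, and it assumes lookups are resolved from a flat hash of configuration values.

```ruby
# Matches lookups of the form %{hiera('NAME')} and captures NAME.
HIERA_LOOKUP = /%\{hiera\('([^']+)'\)\}/.freeze

# Hypothetical helper: replaces each lookup in one string with its configured value.
def interpolate(template, config)
  template.gsub(HIERA_LOOKUP) do
    key = Regexp.last_match(1)
    config.fetch(key) { raise KeyError, "no value configured for '#{key}'" }.to_s
  end
end

# Hypothetical helper: resolves every string value in a vars hash.
# Values without a lookup (such as file paths) pass through unchanged.
def resolve_vars(vars, config)
  vars.transform_values do |value|
    value.is_a?(String) ? interpolate(value, config) : value
  end
end
```

A lookup can sit inside a longer string, which is how the allowed_cidr example above appends /32 to the looked-up my_ip value.
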
SSH keys

Some infrastructure needs ssh keys, for example keypairs used by EC2 instances. Cloudspin can manage these for you if your stack.yaml file has an ssh_keys section as below:

ssh_keys:
  - webserver_ssh_key
  - bastion_ssh_key

Each keyname listed in here represents an ssh public/private key pair required by the stack. When run the first time, cloudspin will generate an ssh key pair, and upload both keys to the AWS SSM Parameter Store as values encrypted with KMS. On later runs, Cloudspin will retrieve the existing keys and use those as appropriate.

A separate keypair is used for each deployment of the given stack. So keys are not shared between components, stacks, or environments. They don't need to be checked into version control. Ephemeral test instances of the stack will have keys automatically generated, and these will be destroyed afterwards along with the environment.
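
The "generate on first run, reuse afterwards" behaviour can be sketched as follows. Everything here is an assumption made for illustration: the in-memory store stands in for the AWS SSM Parameter Store, the parameter naming scheme is invented, and for brevity only a private key is stored (the public key can be derived from it). It is not the rake_cloudspin implementation.

```ruby
require 'openssl'

# Stand-in for the SSM Parameter Store, so the sketch runs without AWS access.
class InMemoryParameterStore
  def initialize
    @params = {}
  end

  def get(name)
    @params[name]
  end

  def put(name, value)
    @params[name] = value
  end
end

# Hypothetical naming scheme: one parameter per deployment of a stack, per key.
def key_parameter_name(estate:, component:, stack:, deployment_identifier:, key_name:)
  "/#{estate}/#{component}/#{stack}/#{deployment_identifier}/ssh_keys/#{key_name}"
end

# Returns the stored key if there is one, otherwise generates and stores a new one.
def ensure_private_key(store, parameter_name)
  existing = store.get(parameter_name)
  return existing if existing

  generated = OpenSSL::PKey::RSA.generate(2048).to_pem
  store.put(parameter_name, generated)
  generated
end
```

Because the deployment identifier is part of the parameter name in this sketch, two deployments of the same stack never see each other's keys.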

The keys are written or downloaded to the local filesystem, so they can be passed to Terraform. In the simpleweb example, two keypairs are generated, and the locations of their public keys are passed as vars:

vars:
  ...
  webserver_ssh_public_key_path: "../ssh_keys/webserver_ssh_key.pub"
  bastion_ssh_public_key_path: "../ssh_keys/bastion_ssh_key.pub"

TODO: The location of the keyfiles should be set in variables by cloudspin, so you don't need to know the location.

If you don't want cloudspin to generate ssh keys for you, don't list the keys under the ssh_keys section of the stack.yaml file, and simply give the path to the keyfile you want to use in the vars section.

Running cloudspin tasks

You run cloudspin either by running rake, or using a wrapper like the go script. Our examples assume the go script is used.

${deployment_identifier}

You must set a unique deployment_identifier value for each unique instance of your component. You can set a default in your component.yaml file, although this is dangerous - it will be easy for someone to forget to set the value in some other way, and accidentally make changes to that instance. So if you do this, make sure the named environment is one you don't care about accidentally breaking.

Your production environment should of course NEVER be the default deployment_identifier.

It's common for each person to set their own deployment_identifier in component-local.yaml, so they can run cloudspin locally to create a personal "sandbox" instance to work on. It's useful to have a naming convention for this, so it's easy to manage instances, e.g. to destroy unneeded developer instances.

Most non-sandbox instances of the component will be provisioned and managed by the pipeline. In these cases, the pipeline configuration will set the deployment_identifier value.

The most common way to set the deployment_identifier is with an environment variable:

DEPLOYMENT_IDENTIFIER=mytest ./go provision
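
With several places where the value can come from, it helps to picture an order of precedence. The Ruby sketch below assumes the environment variable wins, followed by component-local.yaml, then the component.yaml default. That order is an assumption for illustration, not documented cloudspin behaviour, and the resolve_deployment_identifier helper is hypothetical.

```ruby
# Hypothetical helper, not part of rake_cloudspin.
# env is any hash-like object (ENV works); the configs are parsed YAML hashes.
def resolve_deployment_identifier(env:, local_config:, component_config:)
  candidates = [
    env['DEPLOYMENT_IDENTIFIER'],              # assumed highest priority
    local_config['deployment_identifier'],     # personal sandbox setting
    component_config['deployment_identifier']  # risky shared default
  ]
  identifier = candidates.find { |value| value && !value.to_s.strip.empty? }
  raise ArgumentError, 'deployment_identifier is not set' if identifier.nil?

  identifier.to_s
end
```

Failing loudly when nothing is set, rather than falling back to a silent default, matches the warning above about accidentally changing the wrong instance.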

Cloudspin tasks

You can see the tasks by running rake -T or ./go -T.

The main lifecycle tasks are:

plan: Show what Terraform will do to the existing component instance
provision: Create or update all deployment stacks in the instance
test: Run all component tests against the instance
destroy: Completely destroy all deployment stacks in the instance
vars: Show the Terraform variables that will be set by cloudspin

Each of these commands can be run to affect all deployment stacks in the instance. They will not affect the delivery stacks.

It's also possible to run these commands for a specific deployment stack (or delivery stack):

rake deployment:simpleweb:plan         # Plan deployment-simpleweb using terraform
rake deployment:simpleweb:provision    # Provision deployment-simpleweb using terraform
rake deployment:simpleweb:test         # Run inspec tests
rake deployment:simpleweb:destroy      # Destroy deployment-simpleweb using terraform
rake deployment:simpleweb:vars         # Show terraform variables for stack 'simpleweb'

Replace simpleweb with the name of a different stack as appropriate. For delivery stacks, the syntax is rake delivery:STACKNAME:TASK.
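
The task naming scheme maps naturally onto Rake namespaces. The sketch below shows one way such tasks could be defined with the standard Rake DSL (namespace, task, desc). The StackTaskSketch module and its methods are hypothetical and the task bodies are placeholders; RakeCloudspin.define_tasks does the real work in an actual project.

```ruby
require 'rake'

# Hypothetical module, not part of rake_cloudspin.
module StackTaskSketch
  extend Rake::DSL

  LIFECYCLE_TASKS = %w[plan provision test destroy vars].freeze

  # Defines STACK_TYPE:STACK_NAME:plan, :provision, and so on for one stack.
  # The given block stands in for the real work and receives the three names.
  def self.define_stack_tasks(stack_type, stack_name, &action)
    namespace stack_type do
      namespace stack_name do
        LIFECYCLE_TASKS.each do |task_name|
          desc "Run #{task_name} for #{stack_type}-#{stack_name}"
          task task_name do
            action.call(stack_type, stack_name, task_name)
          end
        end
      end
    end
  end

  # Defines top-level tasks that fan out to every deployment stack.
  # Delivery stacks are deliberately left out, as described above.
  def self.define_component_tasks(deployment_stack_names)
    LIFECYCLE_TASKS.each do |task_name|
      desc "Run #{task_name} for all deployment stacks"
      task task_name => deployment_stack_names.map { |name| "deployment:#{name}:#{task_name}" }
    end
  end
end
```

In this sketch the top-level plan task has no body of its own; it only depends on each deployment stack's plan task, so Rake runs them in the order listed.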

Runtime details

What's the work folder about?

General info

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/rake_cloudspin.

Components

This is largely based on code from Infrablocks, and uses some components, including rake_terraform.

rake_cloudspin

Development

Runtime