Run Chaos Toolkit as a Task

AWS offers a range of services suitable for running Chaos Toolkit experiments. Here we will explore AWS ECS Tasks to achieve this.

Info

We will be using copilot to generate all the necessary AWS resources. Please ensure you have installed the tool.

Create the Task Definition

The Task definition describes all the information required to run the process contained in the container image associated with the task. To create the task and all its resources, run the following command:

copilot init \
    --app chaostoolkit \
    --name my-chaos \
    --image "chaostoolkit/chaostoolkit:basic" \
    --retries 0 \
    --schedule "@hourly" \
    --env "team-1" \
    --type "Scheduled Job" \
    --deploy

--app: Name of the stack. Make it relatable but unique to your organization and team.
--name: The name of the task.
--image: Container image to use.
--retries: Don't retry the experiment when it fails.
--schedule: Schedule the task repeatedly.
--env: Create an environment specific to this task.

Warning

copilot does not currently support one-off jobs, so you need to set a schedule when you create the stack.

However, once the stack is created, you can edit its manifest and disable the schedule:

./copilot/my-chaos/manifest.yaml

on:
  schedule: none

Tip

Every change made to the manifest requires the stack to be redeployed with:

copilot deploy

Configure the Experiment Location

The task created previously does not specify the experiment to be executed.

You can make it available to the task in several ways:

Create a container image that contains the experiment file. The drawback is that you have to rebuild the image for every change to the experiment and store as many images as you have experiments to run.
Store the experiment somewhere it can be served over HTTP, since the Chaos Toolkit can read experiments directly over HTTP. In this case, make the following change to the task definition file:
./copilot/my-chaos/manifest.yaml
```
command: ["run", "https://..."]
```
Mount a volume into the task so that the experiment file is dynamically made available to the Chaos Toolkit. You can use EFS to achieve this. In this case, make the following change to the task definition file:
./copilot/my-chaos/manifest.yaml
```
command: ["run", "/home/svc/experiments/experiment.json"]
storage:
  volumes:
    myManagedEFSVolume:
      efs: true
      path: /home/svc/experiments
      read_only: true
```
A final approach is to change the entry point of the base image so that it fetches the experiment before handing it to the chaos command. For instance, you could have a script that fetches the experiment from an S3 bucket and stores it at /home/svc/experiment.json.
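
The entry point approach could be sketched as the shell script below. This is a minimal illustration, not the Chaos Toolkit's own tooling: it assumes the AWS CLI has been added to a custom image built on top of the base image, that the task role is allowed to read the bucket, and that the experiment location is passed through an environment variable. The names EXPERIMENT_S3_URI and EXPERIMENT_PATH are hypothetical.

```shell
#!/bin/sh
# Hypothetical entry point: download the experiment, then run it.
# Assumes the `aws` CLI is installed in the image (not confirmed for the base image).
set -eu

# Where the experiment is stored locally (path taken from the text above).
EXPERIMENT_PATH="${EXPERIMENT_PATH:-/home/svc/experiment.json}"

fetch_experiment() {
    # EXPERIMENT_S3_URI is a hypothetical variable, e.g. s3://<bucket>/<key>
    aws s3 cp "${EXPERIMENT_S3_URI}" "${EXPERIMENT_PATH}"
}

run_experiment() {
    fetch_experiment
    chaos run "${EXPERIMENT_PATH}"
}

# Only fetch and run when a source location has been configured.
if [ -n "${EXPERIMENT_S3_URI:-}" ]; then
    run_experiment
else
    echo "EXPERIMENT_S3_URI is not set; nothing to run" >&2
fi
```

With a script like this as the image entry point, changing the experiment only requires uploading a new file to the bucket or pointing the variable at a different object, rather than rebuilding the image.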

Run the Task Definition

Run the task definition as follows:

copilot job run

Schedule the Task Definition

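Schedule the task definition by setting the schedule property, found under the on key of your task definition file, for instance to "@hourly" as used when the stack was created. Deploy the stack again for the change to take effect.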

View the Task Run Logs

View the most recent logs:

copilot job logs

Delete the Task Definition

When the task and its resources are no longer needed, you can remove them with:

copilot app delete --name chaostoolkit