Extension chaosreliably

Version 0.1.1
Repository https://github.com/chaostoolkit-incubator/chaostoolkit-reliably

Reliably support for the Chaos Toolkit.

Install

To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.

$ pip install chaostoolkit-reliably

Authentication

To use this package, you must have registered with Reliably services through their CLI.

You can pass the credentials in one of two ways.

The first is to specify the path to Reliably’s configuration file, which defaults to $HOME/.config/reliably/config.yaml. The simplest way to create that file is to run $ reliably login, which generates it for you.

{
    "configuration": {
        "reliably_config_path": "~/.config/reliably/config.yaml"
    }
}

Because the example above uses the default path, you may omit this configuration entry altogether unless you need a different path.

The second is to set environment variables and expose them as secrets. This is for specific use cases and is usually not required.

  • RELIABLY_TOKEN: the token to authenticate against Reliably’s API
  • RELIABLY_HOST: the hostname to connect to, defaults to reliably.com

{
    "secrets": {
        "reliably": {
            "token": {
                "type": "env",
                "key": "RELIABLY_TOKEN"
            },
            "host": {
                "type": "env",
                "key": "RELIABLY_HOST",
                "default": "reliably.com"
            }
        }
    }
}
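
To make the effect of the "env" secret type and its "default" key concrete, here is a minimal Python sketch of how such a declaration could be turned into plain values. It is an illustration only and not Chaos Toolkit's own implementation: the function name resolve_env_secrets and the fake environment are assumptions made for this example.

```python
from typing import Any, Dict


def resolve_env_secrets(declared: Dict[str, Dict[str, Any]],
                        environ: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: map {"type": "env", ...} entries to their values."""
    resolved = {}
    for name, spec in declared.items():
        if spec.get("type") != "env":
            raise ValueError(f"unsupported secret type for {name!r}")
        # Use the environment variable if set, else the declared default.
        value = environ.get(spec["key"], spec.get("default"))
        if value is None:
            raise KeyError(f"environment variable {spec['key']} is not set")
        resolved[name] = value
    return resolved


# Same declaration as the "reliably" secrets entry shown above.
reliably_secrets = {
    "token": {"type": "env", "key": "RELIABLY_TOKEN"},
    "host": {"type": "env", "key": "RELIABLY_HOST", "default": "reliably.com"},
}

# A fake environment stands in for os.environ so the example is repeatable.
fake_environ = {"RELIABLY_TOKEN": "not-a-real-token"}
print(resolve_env_secrets(reliably_secrets, fake_environ))
```

With only RELIABLY_TOKEN set, the host falls back to reliably.com. To try it against your real environment, pass dict(os.environ) instead of the fake mapping.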

Usage

As Steady State Hypothesis

You can use Reliably’s SLOs to determine whether your system has deviated during a Chaos Toolkit experiment. Here is a simple example:

"steady-state-hypothesis": {
    "title": "We do not consume all of our error budgets during the experiment",
    "probes": [
        {
            "name": "last-3-slos-must-be-ok",
            "type": "probe",
            "provider": {
                "type": "python",
                "module": "chaosreliably.slo.probes",
                "func": "get_last_N_slos",
                "arguments": {
                    "quantity": 3
                }
            },
            "tolerance": {
                "type": "probe",
                "name": "validate-last-3-slo-statuses",
                "provider": {
                    "type": "python",
                    "module": "chaosreliably.slo.tolerances",
                    "func": "last_N_slo_were_met_for_all_services",
                    "arguments": {}
                }
            }
        }
    ]
}

This looks at the last three SLO reports for all your services and ensures each of them is within its allowed error budget.

As Safeguards

Safeguards, provided by the Chaos Toolkit addons extension, give you a way to interrupt an experiment as soon as error budgets have been consumed. This is orthogonal to the steady-state hypothesis: it is a mechanism to protect your system from being harmed too severely by an experiment.

"controls": [
    {
        "name": "safeguard",
        "provider": {
            "type": "python",
            "module": "chaosaddons.controls.safeguards",
            "arguments": {
                "probes": [
                    {
                        "name": "we-do-not-have-enough-error-budget-left-to-carry-on",
                        "type": "probe",
                        "frequency": 5,
                        "provider": {
                            "type": "python",
                            "module": "chaosreliably.slo.probes",
                            "func": "get_last_N_slos",
                            "arguments": {
                                "quantity": 3
                            }
                        },
                        "tolerance": {
                            "type": "probe",
                            "name": "validate-last-3-slo-statuses",
                            "provider": {
                                "type": "python",
                                "module": "chaosreliably.slo.tolerances",
                                "func": "last_N_slo_were_met_for_all_services",
                                "arguments": {}
                            }
                        }
                    }
                ]
            }
        }
    }
]

As you can see, this is the same construct as for the steady-state hypothesis; it is merely used for a different purpose. Here the probe is executed every 5 seconds during the experiment (this is for demonstration purposes; you would usually run it only once a minute or less often).
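
The control above hard-codes a 5-second frequency. If you generate experiments from code, a small helper can make the interval explicit. The sketch below is only an illustration: the function name build_slo_safeguard and the 60-second default are assumptions, while the module and function names inside the dictionary are the ones used in the example above.

```python
import json
from typing import Any, Dict


def build_slo_safeguard(frequency: int = 60,
                        quantity: int = 3) -> Dict[str, Any]:
    """Hypothetical helper: build the safeguard control as a dictionary."""
    probe = {
        "name": "we-do-not-have-enough-error-budget-left-to-carry-on",
        "type": "probe",
        # Seconds between two executions of the probe.
        "frequency": frequency,
        "provider": {
            "type": "python",
            "module": "chaosreliably.slo.probes",
            "func": "get_last_N_slos",
            "arguments": {"quantity": quantity},
        },
        "tolerance": {
            "type": "probe",
            "name": f"validate-last-{quantity}-slo-statuses",
            "provider": {
                "type": "python",
                "module": "chaosreliably.slo.tolerances",
                "func": "last_N_slo_were_met_for_all_services",
                "arguments": {},
            },
        },
    }
    return {
        "name": "safeguard",
        "provider": {
            "type": "python",
            "module": "chaosaddons.controls.safeguards",
            "arguments": {"probes": [probe]},
        },
    }


# Print a "controls" block that checks the SLOs once a minute.
print(json.dumps({"controls": [build_slo_safeguard(frequency=60)]}, indent=4))
```

The printed JSON can be pasted into an experiment in place of the hand-written "controls" block.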

Contribute

If you wish to contribute more functions to this package, you are more than welcome to do so. Please fork this project, make your changes following the usual PEP 8 code style, add tests, and submit a PR for review.

Develop

If you wish to develop on this project, first create a virtual environment, then install the development dependencies into it.

$ pip install -r requirements-dev.txt -r requirements.txt

Then, point your environment to this directory:

$ python setup.py develop

Now you can edit the files and your changes will automatically be seen by your environment, even when running the chaos command locally.

Test

To run the tests for the project, execute the following:

$ pytest

Exported Activities

slo


get_last_N_slos

Type probe
Module chaosreliably.slo.probes
Name get_last_N_slos
Return mapping

Fetch the last N SLO reports in a structure that makes it easy to navigate them, for instance from a tolerance.

Signature:

def get_last_N_slos(
        quantity: int = 5,
        configuration: Dict[str, Dict[str, str]] = None,
        secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, List[Dict]]:
    pass

Arguments:

Name      Type     Default  Required
quantity  integer  5        No

Usage:

{
  "name": "get-last-N-slos",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosreliably.slo.probes",
    "func": "get_last_N_slos"
  }
}

name: get-last-N-slos
provider:
  func: get_last_N_slos
  module: chaosreliably.slo.probes
  type: python
type: probe

get_slo_history

Type probe
Module chaosreliably.slo.probes
Name get_slo_history
Return chaosreliably.slo.types.Reports

Fetch the history of SLO reports as provided by Reliably.

Signature:

def get_slo_history(
        limit: int = 25,
        configuration: Dict[str, Dict[str, str]] = None,
        secrets: Dict[str, Dict[str, str]] = None
) -> chaosreliably.slo.types.Reports:
    pass

Arguments:

Name   Type     Default  Required
limit  integer  25       No

Usage:

{
  "name": "get-slo-history",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosreliably.slo.probes",
    "func": "get_slo_history"
  }
}

name: get-slo-history
provider:
  func: get_slo_history
  module: chaosreliably.slo.probes
  type: python
type: probe

last_N_slo_were_met_for_all_services

Type tolerance
Module chaosreliably.slo.tolerances
Name last_N_slo_were_met_for_all_services
Return boolean

Goes through all the SLOs returned by get_last_N_slos and ensures that every one of their recent reports shows the SLO was met.

Fails as soon as one was not met.

Signature:

def last_N_slo_were_met_for_all_services(
        value: Dict[str, Dict] = None) -> bool:
    pass

Arguments:

Name   Type     Default  Required
value  mapping  null     No

Tolerances declare the value argument which is automatically injected by Chaos Toolkit as the output of the probe they are evaluating.
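
To illustrate how a tolerance consumes the injected value argument, here is a minimal sketch of a custom tolerance function. It is not the implementation of last_N_slo_were_met_for_all_services. The function name all_slos_met is hypothetical, and the shape of value (service name, then SLO name, then a list of booleans saying whether each recent report met the SLO) is an assumption made for this example; the real probe output may be structured differently.

```python
from typing import Dict, List, Optional


def all_slos_met(
        value: Optional[Dict[str, Dict[str, List[bool]]]] = None) -> bool:
    """Hypothetical tolerance: True only if every recent report met its SLO."""
    # No probe output means nothing could be verified, so do not tolerate it.
    if not value:
        return False
    for service, slos in value.items():
        for slo_name, recent_results in slos.items():
            # Fail as soon as one report shows the SLO was not met.
            if not all(recent_results):
                return False
    return True


# Assumed probe output: two services, each with one SLO and three reports.
healthy = {
    "frontend": {"availability": [True, True, True]},
    "backend": {"latency": [True, True, True]},
}
degraded = {
    "frontend": {"availability": [True, True, True]},
    "backend": {"latency": [True, False, True]},
}

print(all_slos_met(healthy))
print(all_slos_met(degraded))
```

Treating a missing or empty value as a failure is a design choice of this sketch: it keeps the hypothesis from passing when the probe returned nothing to check.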

Usage:

{
  "steady-state-hypothesis": {
    "title": "...",
    "probes": [
      {
        "type": "probe",
        "tolerance": {
          "name": "last-N-slo-were-met-for-all-services",
          "type": "tolerance",
          "provider": {
            "type": "python",
            "module": "chaosreliably.slo.tolerances",
            "func": "last_N_slo_were_met_for_all_services"
          }
        },
        "...": "..."
      }
    ]
  }
}

steady-state-hypothesis:
  probes:
  - '...': '...'
    tolerance:
      name: last-N-slo-were-met-for-all-services
      provider:
        func: last_N_slo_were_met_for_all_services
        module: chaosreliably.slo.tolerances
        type: python
      type: tolerance
    type: probe
  title: '...'