Extension `chaosdatadog`¶


Version	0.3.1
Repository	https://github.com/chaostoolkit-incubator/chaostoolkit-datadog

This project contains Chaos Toolkit activities and tolerances to work with DataDog.

Install¶

This package requires Python 3.8+

To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.

$ pip install chaostoolkit-datadog

Usage¶

A typical experiment using this extension would look like this:

{
    "version": "1.0.0",
    "title": "Run an experiment using a DataDog SLO to verify our system",
    "description": "n/a",
    "configuration": {
        "datadog_host": "https://datadoghq.eu"
    },
    "steady-state-hypothesis": {
        "title": "n/a",
        "probes": [
            {
                "type": "probe",
                "name": "read-slo",
                "tolerance": {
                    "type": "probe",
                    "name": "check-slo",
                    "provider": {
                        "type": "python",
                        "module": "chaosdatadog.slo.tolerances",
                        "func": "slo_must_be_met",
                        "arguments": {
                            "threshold": "7d"
                        }
                    }
                },
                "provider": {
                    "type": "python",
                    "module": "chaosdatadog.slo.probes",
                    "func": "get_slo",
                    "arguments": {
                        "slo_id": "..."
                    }
                }
            }
        ]
    },
    "method": []
}

That’s it!

Please explore the code to see existing probes and actions.

Configuration¶

In the configuration block you may want to specify the DataDog host you are targeting:

    "configuration": {
        "datadog_host": "https://datadoghq.eu"
    },

Authentication can be set using the standard DataDog environment variables, notably:

DD_API_KEY: the API key
DD_APP_KEY: the application key
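
As a small pre-flight check, the sketch below reports which of the two variables are missing before an experiment is launched. The helper name `missing_datadog_credentials` is hypothetical and not part of the extension; only the two variable names come from the list above.

```python
from typing import List, Mapping

# Variable names taken from the list above.
REQUIRED_VARS = ("DD_API_KEY", "DD_APP_KEY")


def missing_datadog_credentials(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_datadog_credentials(os.environ)` returns an empty list when both keys are available to the process that runs the experiment.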

Test¶

To run the tests for the project execute the following:

$ pdm run test

Formatting and Linting¶

We use ruff to both lint and format this repository's code.

Before raising a Pull Request, we recommend you run formatting against your code with:

$ pdm run format

This will automatically format any code that doesn’t adhere to the formatting standards.

As some things are not picked up by the formatting, we also recommend you run:

$ pdm run lint

This ensures that unused import statements, strings that are too long, and similar issues are also picked up.

Contribute¶

If you wish to contribute more functions to this package, you are more than welcome to do so. Please fork this project, make your changes following the usual PEP 8 code style, add tests, and submit a PR for review.

Exported Activities¶

metrics¶

`get_metrics_state`¶


Type	probe
Module	chaosdatadog.metrics.probes
Name	get_metrics_state
Return	boolean

This probe lets you:

Query metrics from any time period (timeseries and scalar)
Compare the metrics to a threshold over a period of time, e.g. CPU, memory, network
Check whether the sum of data points is over some value, e.g. requests, errors, custom metrics

You can use a comparison to check that all data points in the query satisfy the steady-state condition. For example:

cumsum(sum:istio.mesh.request.count.total{kube_service:test, response_code:500}.as_count())

The query above is a cumulative sum of all requests with a response code of 500 over the queried time window. If your hypothesis deviates once there are more than 30 HTTP 500 errors, the comparison should be < with a threshold of 30, so any value below 30 is a steady state.

The allowed comparison values are [“>”, “<”, “>=”, “<=”, “==”]
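
The comparison semantics described above can be pictured with a small sketch. It only illustrates checking every data point against a threshold and is not the extension's actual implementation; the helper `all_points_satisfy` and the sample values are hypothetical.

```python
import operator
from typing import Callable, Dict, Iterable

# The five allowed comparison strings, mapped to Python operators.
COMPARISONS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def all_points_satisfy(points: Iterable[float],
                       comparison: str,
                       threshold: float) -> bool:
    """Hypothetical helper: True when every point passes the comparison."""
    if comparison not in COMPARISONS:
        raise ValueError(f"unsupported comparison: {comparison!r}")
    compare = COMPARISONS[comparison]
    return all(compare(point, threshold) for point in points)


# Cumulative count of HTTP 500 responses over the window (made-up values).
steady = all_points_satisfy([0, 3, 12, 27], "<", 30)    # every point is below 30
deviated = all_points_satisfy([0, 3, 12, 31], "<", 30)  # the last point reaches 31
```

With `<` and a threshold of 30, the first series counts as steady state while the second does not, because a single point at or above 30 is enough to fail the check.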

Signature:

def get_metrics_state(query: str,
                      comparison: str,
                      threshold: float,
                      minutes_before: int,
                      configuration: Dict[str, Dict[str, str]] = None,
                      secrets: Dict[str, Dict[str, str]] = None) -> bool:
    pass

Arguments:

Name	Type	Required
query	string	Yes
comparison	string	Yes
threshold	number	Yes
minutes_before	integer	Yes

Usage:

JSON, followed by the YAML equivalent:

{
  "name": "get-metrics-state",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosdatadog.metrics.probes",
    "func": "get_metrics_state",
    "arguments": {
      "query": "",
      "comparison": "",
      "threshold": null,
      "minutes_before": 0
    }
  }
}

name: get-metrics-state
provider:
  arguments:
    comparison: ''
    minutes_before: 0
    query: ''
    threshold: null
  func: get_metrics_state
  module: chaosdatadog.metrics.probes
  type: python
type: probe
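
The generated usage above leaves the arguments empty. As a hedged illustration, the sketch below builds the same probe declaration in Python with the HTTP 500 query from the description filled in and serializes it to JSON. The probe name and the 5-minute window are assumed values chosen for the example.

```python
import json

# Probe declaration with the placeholders filled in. The name and the
# 5-minute window are assumptions made for illustration only.
probe = {
    "name": "http-500-count-stays-below-30",
    "type": "probe",
    "provider": {
        "type": "python",
        "module": "chaosdatadog.metrics.probes",
        "func": "get_metrics_state",
        "arguments": {
            "query": (
                "cumsum(sum:istio.mesh.request.count.total"
                "{kube_service:test, response_code:500}.as_count())"
            ),
            "comparison": "<",
            "threshold": 30,
            "minutes_before": 5,
        },
    },
}

# JSON text that could be pasted into an experiment file.
experiment_fragment = json.dumps(probe, indent=2)
```

Because the probe returns a boolean, a tolerance of `true` is a natural fit when this declaration is placed in a steady-state hypothesis.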

slo¶

`get_slo`¶


Type	probe
Module	chaosdatadog.slo.probes
Name	get_slo
Return	mapping

Get a SLO’s history for the given period.

Periods should be given relative to each other. If end_period isn’t provided it will resolve to now (UTC). start_period is always relative to end_period. You can use a format such as: "X minutes ago" for both.

Please visit https://docs.datadoghq.com/api/latest/service-level-objectives/#get-an-slos-history for more information on the response payload, which is returned as a dictionary.
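
The relative periods described above can be sketched as follows. This is a simplified illustration rather than the extension's parser: it only understands the "X minutes ago" form, and it assumes that "relative to end_period" means the start offset is subtracted from the resolved end. The helpers `minutes_ago` and `resolve_window` are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

_MINUTES_AGO = re.compile(r"^\s*(\d+)\s+minutes?\s+ago\s*$", re.IGNORECASE)


def minutes_ago(period: str) -> timedelta:
    """Parse "X minutes ago" into a timedelta (minutes only)."""
    match = _MINUTES_AGO.match(period)
    if match is None:
        raise ValueError(f"unsupported period: {period!r}")
    return timedelta(minutes=int(match.group(1)))


def resolve_window(start_period: str = "2 minutes ago",
                   end_period: Optional[str] = None,
                   now: Optional[datetime] = None) -> Tuple[datetime, datetime]:
    """Return (start, end): end defaults to now (UTC), start is offset from end."""
    now = now or datetime.now(timezone.utc)
    end = now - minutes_ago(end_period) if end_period else now
    start = end - minutes_ago(start_period)
    return start, end
```

Under these assumptions, `resolve_window("10 minutes ago", "5 minutes ago")` yields a window that ends five minutes before now and starts ten minutes before that end.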

Signature:

def get_slo(slo_id: str,
            start_period: str = '2 minutes ago',
            end_period: str = None,
            configuration: Dict[str, Dict[str, str]] = None,
            secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass

Arguments:

Name	Type	Default	Required
slo_id	string		Yes
start_period	string	“2 minutes ago”	No
end_period	string	null	No

Usage:

JSON, followed by the YAML equivalent:

{
  "name": "get-slo",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosdatadog.slo.probes",
    "func": "get_slo",
    "arguments": {
      "slo_id": ""
    }
  }
}

name: get-slo
provider:
  arguments:
    slo_id: ''
  func: get_slo
  module: chaosdatadog.slo.probes
  type: python
type: probe

`get_slo_details`¶


Type	probe
Module	chaosdatadog.slo.probes
Name	get_slo_details
Return	mapping

Get a SLO’s details.

Please visit https://docs.datadoghq.com/api/latest/service-level-objectives/#get-an-slos-details for more information on the response payload, which is returned as a dictionary.

Signature:

def get_slo_details(
        slo_id: str,
        configuration: Dict[str, Dict[str, str]] = None,
        secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass

Arguments:

Name	Type	Default	Required
slo_id	string		Yes

Usage:

JSON, followed by the YAML equivalent:

{
  "name": "get-slo-details",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosdatadog.slo.probes",
    "func": "get_slo_details",
    "arguments": {
      "slo_id": ""
    }
  }
}

name: get-slo-details
provider:
  arguments:
    slo_id: ''
  func: get_slo_details
  module: chaosdatadog.slo.probes
  type: python
type: probe

`slo_must_be_met`¶


Type	tolerance
Module	chaosdatadog.slo.tolerances
Name	slo_must_be_met
Return	boolean

Checks that the current SLI value of a SLO is higher than its target for a given threshold period ("7d", "30d", "90d", "custom").

Signature:

def slo_must_be_met(threshold: str = '7d',
                    value: Dict[str, Any] = None) -> bool:
    pass

Arguments:

Name	Type	Default	Required
threshold	string	“7d”	No
value	mapping	null	No

Tolerances declare the value argument which is automatically injected by Chaos Toolkit as the output of the probe they are evaluating.
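
To make the injected `value` argument concrete, here is a minimal sketch of a tolerance-like function. It is not the extension's code: the function name `sli_above_target` is hypothetical, and the payload layout (`data.overall.sli_value` and `data.thresholds.<period>.target`) is an assumption about the SLO history response that should be checked against the DataDog API documentation.

```python
from typing import Any, Dict, Optional


def sli_above_target(threshold: str = "7d",
                     value: Optional[Dict[str, Any]] = None) -> bool:
    """Hypothetical tolerance: compare the overall SLI with the period target."""
    if not value:
        return False  # no probe output, so nothing can be validated
    data = value["data"]  # assumed payload layout, see the text above
    sli = data["overall"]["sli_value"]
    target = data["thresholds"][threshold]["target"]
    return sli > target


# Made-up probe output standing in for what the probe would return.
probe_output = {
    "data": {
        "overall": {"sli_value": 99.95},
        "thresholds": {"7d": {"target": 99.9}, "30d": {"target": 99.99}},
    }
}

met_7d = sli_above_target(threshold="7d", value=probe_output)
met_30d = sli_above_target(threshold="30d", value=probe_output)
```

In an experiment, only `threshold` would be listed under the tolerance's arguments; the probe output arrives as `value` without being declared.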

Usage:

JSON, followed by the YAML equivalent:

{
  "steady-state-hypothesis": {
    "title": "...",
    "probes": [
      {
        "type": "probe",
        "tolerance": {
          "name": "slo-must-be-met",
          "type": "tolerance",
          "provider": {
            "type": "python",
            "module": "chaosdatadog.slo.tolerances",
            "func": "slo_must_be_met"
          }
        },
        "...": "..."
      }
    ]
  }
}

steady-state-hypothesis:
  probes:
  - '...': '...'
    tolerance:
      name: slo-must-be-met
      provider:
        func: slo_must_be_met
        module: chaosdatadog.slo.tolerances
        type: python
      type: tolerance
    type: probe
  title: '...'
