Extension chaosprometheus
Version | 0.6.0 |
Repository | https://github.com/chaostoolkit-incubator/chaostoolkit-prometheus |
Prometheus support for the Chaos Toolkit.
Install
To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.
$ pip install chaostoolkit-prometheus
Usage
To use this package, you must have HTTP access to a Prometheus instance and be allowed to connect to it.
By default, the Prometheus instance at http://localhost:9090 is queried. To override this, set the prometheus_base_url configuration property to the address of your instance:
"configuration": {
"prometheus_base_url": "http://my.prometheus.server/"
}
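As a rough illustration of how the override relates to the default, the sketch below resolves the base URL from an experiment's configuration mapping and builds the URL of an instant query on Prometheus's standard /api/v1/query HTTP endpoint. The helper names resolve_base_url and instant_query_url are hypothetical and are not part of chaosprometheus; the package's internals may differ.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlencode

DEFAULT_BASE_URL = "http://localhost:9090"  # default stated in this document


def resolve_base_url(configuration: Optional[Dict[str, Any]] = None) -> str:
    """Return the configured Prometheus base URL, or the default (hypothetical helper)."""
    configuration = configuration or {}
    base_url = configuration.get("prometheus_base_url", DEFAULT_BASE_URL)
    # Drop any trailing slash so joining with the API path never yields "//".
    return base_url.rstrip("/")


def instant_query_url(query: str, configuration: Optional[Dict[str, Any]] = None) -> str:
    """Build the URL of an instant query on Prometheus's /api/v1/query endpoint."""
    params = urlencode({"query": query})
    return f"{resolve_base_url(configuration)}/api/v1/query?{params}"


if __name__ == "__main__":
    print(instant_query_url("up"))
    print(instant_query_url("up", {"prometheus_base_url": "http://my.prometheus.server/"}))
```

Stripping the trailing slash is a design choice of this sketch, made so that both "http://my.prometheus.server" and "http://my.prometheus.server/" produce the same request URL.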
The activities exported by this package are all probes: they query aspects of your system as monitored by Prometheus.
Here is an example of querying Prometheus at a given moment:
{
  "type": "probe",
  "name": "fetch-cpu-just-2mn-ago",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "query",
    "arguments": {
      "query": "process_cpu_seconds_total{job='websvc'}",
      "when": "2 minutes ago"
    }
  }
}
You can also ask for an interval as follows:
{
  "type": "probe",
  "name": "fetch-cpu-over-interval",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "query_interval",
    "arguments": {
      "query": "process_cpu_seconds_total{job='websvc'}",
      "start": "2 minutes ago",
      "end": "now",
      "step": 5
    }
  }
}
In both cases, the probe returns the JSON payload from Prometheus as-is, or raises an exception when an error occurs.
The result is not processed further and can be found in the report generated for the experiment run.
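Because the payload is returned unprocessed, anything that consumes it has to unpack the Prometheus response envelope itself. The sketch below shows one way to pull label sets and numeric values out of an instant-vector response, following the response layout of the Prometheus HTTP API. The helper instant_vector_values is hypothetical, and the sample payload contains made-up numbers for illustration only.

```python
from typing import Any, Dict, List, Tuple


def instant_vector_values(payload: Dict[str, Any]) -> List[Tuple[Dict[str, str], float]]:
    """Extract (labels, value) pairs from an instant-vector response (hypothetical helper)."""
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']!r}")
    # Each sample is {"metric": {labels}, "value": [unix_timestamp, "value-as-string"]}.
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]


# Illustrative payload only; the timestamp and value are made up.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "process_cpu_seconds_total", "job": "websvc"},
                "value": [1700000000.0, "12.5"],
            }
        ],
    },
}

if __name__ == "__main__":
    for labels, value in instant_vector_values(SAMPLE):
        print(labels["job"], value)
```

Prometheus encodes sample values as strings, which is why the sketch converts them with float before returning them.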
You can also send metrics to a pushgateway service via a control:
{
  "controls": [
    {
      "name": "prometheus",
      "provider": {
        "type": "python",
        "module": "chaosprometheus.metrics",
        "arguments": {
          "pushgateway_url": "http://someip:9091",
          "job": "chaostoolkit"
        }
      }
    }
  ]
}
You can also set three more arguments:
grouping_key: A mapping of strings used to uniquely aggregate multiple runs in the Prometheus backend.
trace_id: A string that uniquely identifies this run in your metrics. If none is provided, a random string is generated.
experiment_ref: A string that identifies a particular experiment, not just a single run, across many runs. If none is provided, a hash of the experiment is computed and used. This hash is not stable across changes to the experiment.
These are particularly useful when you couple this extension with others like Loki where you want to cross-reference between logs and metrics.
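The following sketch builds such a control declaration in Python and mimics the two fallbacks described above: a random trace_id and a content hash for experiment_ref. How the package actually generates its random string and its hash is not described here, so secrets.token_hex and SHA-256 over a canonical JSON dump are assumptions, as are the helper names experiment_hash and metrics_control and the "env" grouping label.

```python
import hashlib
import json
import secrets
from typing import Any, Dict, Optional


def experiment_hash(experiment: Dict[str, Any]) -> str:
    """Hash a canonical JSON dump of the experiment (SHA-256 is an assumption)."""
    canonical = json.dumps(experiment, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def metrics_control(
    experiment: Dict[str, Any],
    pushgateway_url: str,
    job: str = "chaostoolkit",
    grouping_key: Optional[Dict[str, str]] = None,
    trace_id: Optional[str] = None,
    experiment_ref: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the control declaration with explicit trace_id and experiment_ref."""
    arguments: Dict[str, Any] = {
        "pushgateway_url": pushgateway_url,
        "job": job,
        # Fall back to a random identifier for this run (assumed generator).
        "trace_id": trace_id or secrets.token_hex(16),
        # Fall back to a content hash, which changes whenever the experiment does.
        "experiment_ref": experiment_ref or experiment_hash(experiment),
    }
    if grouping_key:
        arguments["grouping_key"] = grouping_key
    return {
        "name": "prometheus",
        "provider": {
            "type": "python",
            "module": "chaosprometheus.metrics",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    experiment = {"title": "cpu stays low", "method": []}
    control = metrics_control(experiment, "http://someip:9091", grouping_key={"env": "staging"})
    print(json.dumps({"controls": [control]}, indent=2))
```

Setting experiment_ref explicitly is the safer choice when the same experiment is edited over time, since any content-based hash changes as soon as the experiment does.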
Contribute
If you wish to contribute more functions to this package, you are more than welcome to do so. Please fork this project, make your changes following the usual PEP 8 code style, add tests, and submit a PR for review.
Exported Controls
metrics
This module exports controls covering the following phases of the execution of an experiment:
Level | Before | After |
---|---|---|
Experiment Loading | False | False |
Experiment | False | True |
Steady-state Hypothesis | False | False |
Method | False | False |
Rollback | False | False |
Activities | False | False |
In addition, the controls may define the following:
Level | Enabled |
---|---|
Validate Control | False |
Configure Control | True |
Cleanup Control | False |
To use this control module, please add the following section to your experiment:
{
  "controls": [
    {
      "name": "chaosprometheus",
      "provider": {
        "type": "python",
        "module": "chaosprometheus.metrics"
      }
    }
  ]
}

controls:
- name: chaosprometheus
  provider:
    module: chaosprometheus.metrics
    type: python
This block may also be enabled at any other level (steady-state hypothesis or activity) to focus only on that level.
When enabled at the experiment level, all sub-levels are also covered by default, unless you set the automatic property to false.
Exported Activities
metrics
probes
compute_mean
Type | probe |
Module | chaosprometheus.probes |
Name | compute_mean |
Return | number |
Compute the mean of all returned datapoints of the range vector matching the given query. The query must return a range vector.
The default is the arithmetic mean. To compute a geometric or harmonic mean instead, pass mean_type="geometric" or mean_type="harmonic".
Signature:
def compute_mean(query: str,
                 window: str = '1d',
                 mean_type: str = 'arithmetic',
                 configuration: Dict[str, Dict[str, str]] = None,
                 secrets: Dict[str, Dict[str, str]] = None) -> float:
    pass
Arguments:
Name | Type | Default | Required |
---|---|---|---|
query | string | | Yes |
window | string | "1d" | No |
mean_type | string | "arithmetic" | No |
Usage:
{
  "name": "compute-mean",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "compute_mean",
    "arguments": {
      "query": ""
    }
  }
}

name: compute-mean
provider:
  arguments:
    query: ''
  func: compute_mean
  module: chaosprometheus.probes
  type: python
type: probe
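To make the three mean types concrete, the sketch below pools every datapoint of a range-vector ("matrix") response and applies the matching function from Python's statistics module (geometric_mean needs Python 3.8 or later). This is an illustration of the arithmetic only: the helper mean_of_matrix is hypothetical, the sample values are made up, and the package's own computation may differ in detail.

```python
import statistics
from typing import Any, Callable, Dict, List

# Map each mean_type value named in this document to a standard-library function.
MEANS: Dict[str, Callable[[List[float]], float]] = {
    "arithmetic": statistics.mean,
    "geometric": statistics.geometric_mean,
    "harmonic": statistics.harmonic_mean,
}


def mean_of_matrix(payload: Dict[str, Any], mean_type: str = "arithmetic") -> float:
    """Compute a mean over all datapoints of a range-vector response (hypothetical helper)."""
    if mean_type not in MEANS:
        raise ValueError(f"unknown mean_type: {mean_type!r}")
    # A matrix result holds one entry per series, each with [timestamp, "value"] pairs.
    values = [
        float(value)
        for series in payload["data"]["result"]
        for _timestamp, value in series["values"]
    ]
    if not values:
        raise ValueError("the query returned no datapoints")
    return MEANS[mean_type](values)


# Illustrative payload only; the datapoints are made up.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"instance": "a"}, "values": [[1700000000, "1"], [1700000060, "2"]]},
            {"metric": {"instance": "b"}, "values": [[1700000000, "4"]]},
        ],
    },
}

if __name__ == "__main__":
    for kind in MEANS:
        print(kind, mean_of_matrix(SAMPLE, kind))
```

Note that the geometric and harmonic means are only meaningful for positive datapoints; the standard-library functions raise an error on negative input.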
nodes_cpu_usage_mean
Type | probe |
Module | chaosprometheus.probes |
Name | nodes_cpu_usage_mean |
Return | number |
Compute the mean CPU activity of all nodes, per minute, over the given window. The query uses the node_cpu_seconds_total metric.
Signature:
def nodes_cpu_usage_mean(window: str = '1d',
                         configuration: Dict[str, Dict[str, str]] = None,
                         secrets: Dict[str, Dict[str, str]] = None) -> float:
    pass
Arguments:
Name | Type | Default | Required |
---|---|---|---|
window | string | "1d" | No |
Usage:
{
  "name": "nodes-cpu-usage-mean",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "nodes_cpu_usage_mean"
  }
}

name: nodes-cpu-usage-mean
provider:
  func: nodes_cpu_usage_mean
  module: chaosprometheus.probes
  type: python
type: probe
query
Type | probe |
Module | chaosprometheus.probes |
Name | query |
Return | mapping |
Run an instant query against a Prometheus server and return its result as-is.
Signature:
def query(query: str,
          when: str = None,
          timeout: float = None,
          verify_tls: bool = True,
          configuration: Dict[str, Dict[str, str]] = None,
          secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass
Arguments:
Name | Type | Default | Required |
---|---|---|---|
query | string | | Yes |
when | string | null | No |
timeout | number | null | No |
verify_tls | boolean | true | No |
Usage:
{
  "name": "query",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "query",
    "arguments": {
      "query": ""
    }
  }
}

name: query
provider:
  arguments:
    query: ''
  func: query
  module: chaosprometheus.probes
  type: python
type: probe
query_interval
Type | probe |
Module | chaosprometheus.probes |
Name | query_interval |
Return | mapping |
Run a range query against a Prometheus server and return its result as-is.
The start and end arguments can each be an RFC 3339 date or a more colloquial expression such as "5 minutes ago".
Signature:
def query_interval(
        query: str,
        start: str,
        end: str,
        step: int = 1,
        timeout: float = None,
        verify_tls: bool = True,
        configuration: Dict[str, Dict[str, str]] = None,
        secrets: Dict[str, Dict[str, str]] = None) -> Dict[str, Any]:
    pass
Arguments:
Name | Type | Default | Required |
---|---|---|---|
query | string | | Yes |
start | string | | Yes |
end | string | | Yes |
step | integer | 1 | No |
timeout | number | null | No |
verify_tls | boolean | true | No |
Usage:
{
  "name": "query-interval",
  "type": "probe",
  "provider": {
    "type": "python",
    "module": "chaosprometheus.probes",
    "func": "query_interval",
    "arguments": {
      "query": "",
      "start": "",
      "end": ""
    }
  }
}

name: query-interval
provider:
  arguments:
    end: ''
    query: ''
    start: ''
  func: query_interval
  module: chaosprometheus.probes
  type: python
type: probe
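To illustrate how colloquial start and end values can map onto a range query, the sketch below turns "now", "N minutes ago" style expressions, and RFC 3339 strings into timezone-aware datetimes, then assembles the query, start, end and step parameters that Prometheus's /api/v1/query_range endpoint expects. The tiny parser is a hypothetical stand-in that handles only these few forms; the package itself likely accepts a wider range of expressions, and the helper names to_datetime and range_query_params are assumptions.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

_UNITS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
_RELATIVE = re.compile(r"^(\d+)\s+(second|minute|hour|day)s?\s+ago$")


def to_datetime(value: str, now: Optional[datetime] = None) -> datetime:
    """Parse 'now', 'N <unit>s ago' or an RFC 3339 string (hypothetical, limited parser)."""
    now = now or datetime.now(timezone.utc)
    text = value.strip().lower()
    if text == "now":
        return now
    match = _RELATIVE.match(text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(seconds=amount * _UNITS[unit])
    # Otherwise expect RFC 3339; rewrite a trailing "Z" so fromisoformat accepts it.
    return datetime.fromisoformat(value.strip().replace("Z", "+00:00"))


def range_query_params(
    query: str, start: str, end: str, step: int = 1, now: Optional[datetime] = None
) -> Dict[str, str]:
    """Build the parameters of a Prometheus range query from probe-style arguments."""
    # Resolve "now" once so that start and end share the same reference instant.
    now = now or datetime.now(timezone.utc)
    return {
        "query": query,
        "start": to_datetime(start, now).isoformat(),
        "end": to_datetime(end, now).isoformat(),
        "step": str(step),
    }


if __name__ == "__main__":
    print(range_query_params("process_cpu_seconds_total{job='websvc'}", "2 minutes ago", "now", step=5))
```

Resolving the reference instant once, rather than per argument, keeps the interval length exact when both bounds are relative expressions.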