This is documentation for Litmus Docs 2.0.0, which is no longer actively maintained.
For up-to-date documentation, see the latest version (3.0.0-beta8).
ChaosEngine
The ChaosEngine CR is the main user-facing chaos custom resource. It is namespace-scoped and holds information about how the chaos experiments are executed. It connects an application instance with one or more chaos experiments, while allowing users to specify run-level details (overriding experiment defaults, providing new environment variables and volumes, choosing whether to delete or retain experiment pods, etc.). This CR is also updated/patched with the status of the chaos experiments, making it the single source of truth with respect to the chaos.
This section describes the fields in the ChaosEngine spec and the possible values for each.
Field: .spec.engineState
Description: Flag to control the state of the chaosengine
Type: Mandatory
Range: active, stop
Default: active
Notes: The engineState in the spec is a user-defined flag to trigger chaos. Setting it to active triggers execution of chaos; patching it to stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus, which is updated by the controller based on the actual state of the ChaosEngine.
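As a rough illustration of where this flag sits, a minimal manifest might look like the sketch below. The apiVersion and the metadata values (engine-nginx, default) are assumptions for illustration and are not taken from this section.

```yaml
apiVersion: litmuschaos.io/v1alpha1   # assumed API group/version
kind: ChaosEngine
metadata:
  name: engine-nginx                  # hypothetical engine name
  namespace: default                  # hypothetical namespace
spec:
  # active triggers chaos; patching this value to stop aborts ongoing experiments
  engineState: "active"
```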
The applabel in the spec specifies a unique label of the AUT. It is usually provided as a quoted string of the form key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation litmuschaos.io/chaos: "true". If, however, the annotationCheck is disabled, then a random application (pod) sharing the specified label is selected for chaos. This field is optional for infra-level chaos.
Field: .spec.appinfo.appkind
Description: Flag to specify resource kind of application under test
Notes: The appkind in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on deployments, statefulsets and daemonsets. For some experiments, the application health check routines depend on the resource type. This field is optional for infra-level chaos.
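The two appinfo fields described above could be set together as in the following sketch. The label value is hypothetical, and the lower-case singular spelling of the kind is an assumption about how the operator expects it.

```yaml
spec:
  appinfo:
    applabel: "app=nginx"    # hypothetical key=value label of the AUT
    appkind: "deployment"    # assumed spelling; deployments, statefulsets and daemonsets are supported
```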
Field: .spec.auxiliaryAppInfo
Description: Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to the primary application specified in .spec.appinfo. NOTE: If the auxiliary applications are deployed in namespaces other than that of the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods in other namespaces.
Notes: The auxiliaryAppInfo in the spec specifies a (comma-separated) list of namespace-label pairs. For pod-level chaos experiments, these are the downstream (dependent) apps of the primary app specified in .spec.appinfo. For infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary.
Note: Irrespective of the nature of the chaos experiment, i.e., pod-level (single-app impact/smaller blast radius) or infra-level (multi-app impact/larger blast radius), .spec.appinfo must be filled in so that the experiment points to at least one primary app whose health is measured as an indicator of the resiliency / success of the chaos experiment.
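A comma-separated list of namespace-label pairs might be written as in the sketch below. The colon used to join each namespace to its label is an assumption about the expected format, and the namespaces and labels are hypothetical.

```yaml
spec:
  # assumed format: <namespace>:<key=value>,<namespace>:<key=value>
  auxiliaryAppInfo: "db-ns:app=percona,web-ns:app=nginx"
```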
Field: .spec.chaosServiceAccount
Description: Flag to specify serviceaccount used for chaos experiment
Type: Mandatory
Range: user-defined (type: string)
Default: n/a
Notes: The chaosServiceAccount in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR.
Field: .spec.annotationCheck
Description: Flag to control annotationChecks on applications as prerequisites for chaos
Type: Optional
Range: true, false
Default: true
Notes: The annotationCheck in the spec controls whether or not the operator checks for the annotation "litmuschaos.io/chaos" on the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated. Setting it to false suppresses this check and proceeds with chaos injection.
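The serviceaccount and annotation check settings could be combined as in this sketch. The serviceaccount name is hypothetical, and quoting the annotationCheck value as a string is an assumption about the CRD schema.

```yaml
spec:
  chaosServiceAccount: pod-delete-sa   # hypothetical serviceaccount bound to a suitable role/clusterRole
  # with the check enabled, chaos is skipped unless the AUT carries
  # the annotation litmuschaos.io/chaos: "true"
  annotationCheck: "true"              # assumed to be a quoted string
```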
Field: .spec.terminationGracePeriodSeconds
Description: Flag to control terminationGracePeriodSeconds for the chaos pods (abort case)
Type: Optional
Range: integer value
Default: 30
Notes: The terminationGracePeriodSeconds in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain steps that revert the chaos upon abort, and they continuously watch for termination signals. The value should be set so that the chaos pods have enough time to complete the revert before they are terminated.
Field: .spec.jobCleanupPolicy
Description: Flag to control cleanup of chaos experiment job post execution of chaos
Type: Optional
Range: delete, retain
Default: delete
Notes: The jobCleanupPolicy controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms).
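A debugging-oriented run that keeps experiment pods around and allows extra time for revert on abort might be configured as in this sketch. The value 60 is an arbitrary illustrative choice, not a recommendation from this section.

```yaml
spec:
  terminationGracePeriodSeconds: 60   # illustrative value; the documented default is 30
  jobCleanupPolicy: "retain"          # keep experiment pods after execution for inspection
```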
The .components.runner.image allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE.
Field: .spec.components.runner.imagePullPolicy
Description: Flag to specify imagePullPolicy for the ChaosRunner
Type: Optional
Range: Always, IfNotPresent
Default: IfNotPresent
Notes: The .components.runner.imagePullPolicy allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.
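A custom runner image and pull policy could be declared together as in the sketch below. The registry path and tag are hypothetical placeholders for a developer's own debug image.

```yaml
spec:
  components:
    runner:
      image: "example.registry.local/chaos-runner:debug"   # hypothetical debug image
      imagePullPolicy: "Always"                            # pull on every run while debugging
```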
Field: .spec.components.runner.imagePullSecrets
Description: Flag to specify imagePullSecrets for the ChaosRunner
The .spec.components.runner.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments.
Field: .spec.components.runner.nodeSelector
Description: Node selectors for the runner pod
Type: Optional
Range: Labels in the form key=value
Default: n/a
Notes: The .spec.components.runner.nodeSelector contains the labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node-level chaos.
Field: .spec.components.runner.resources
Description: Specify the resource requirements for the ChaosRunner pod
Type: Optional
Range: user-defined (type: corev1.ResourceRequirements)
Default: n/a
Notes: The .spec.components.runner.resources contains the resource requirements for the ChaosRunner pod, i.e., the resource requests and limits for the pod.
Field: .spec.components.runner.tolerations
Description: Toleration for the runner pod
Type: Optional
Range: user-defined (type: []corev1.Toleration)
Default: n/a
Notes: The .spec.components.runner.tolerations provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos.
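The runner scheduling and resource fields could be combined as in the following sketch. It assumes nodeSelector is written as a YAML map of label keys to values, and that resources and tolerations follow the usual Kubernetes pod spec shapes named in the Range rows. The node name, taint key and quantities are hypothetical.

```yaml
spec:
  components:
    runner:
      nodeSelector:
        kubernetes.io/hostname: "node01"   # hypothetical node name
      resources:
        requests:
          cpu: "250m"
          memory: "64Mi"
        limits:
          cpu: "500m"
          memory: "128Mi"
      tolerations:
        - key: "dedicated"                 # hypothetical taint key
          operator: "Equal"
          value: "chaos"                   # hypothetical taint value
          effect: "NoSchedule"
```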
The experiment[].spec.components.env specifies the array of tunables passed to the experiment pods. Though the field is optional from a chaosengine definition viewpoint, it is almost always necessary to provide experiment tunables via this definition. Some of the env variables override the defaults in the experiment CR, while others are mandatory additions that fill in placeholders/empty values in the experiment CR. For a list of "mandatory" & "optional" env for an experiment, refer to the respective experiment documentation.
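A tunable passed through this array might look like the sketch below. The experiment name and the env name are illustrative assumptions; the actual tunables depend on the experiment being run.

```yaml
spec:
  experiments:
    - name: pod-delete                     # assumed experiment name, for illustration
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION   # assumed tunable name
              value: "30"                  # env values are passed as strings
```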
The experiment[].spec.components.configMaps provides a means to insert config information into the experiment. The configmaps definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
The experiment[].spec.components.secrets provides a means to push secrets (typically project ids, access credentials, etc.) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
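Mounting a configmap and a secret into the experiment pod might be declared as in this sketch. The name and mountPath keys are an assumption about the schema, and the resource names and paths are hypothetical. Both objects would need to exist in the cluster/namespace beforehand.

```yaml
spec:
  experiments:
    - name: pod-delete                   # assumed experiment name, for illustration
      spec:
        components:
          configMaps:
            - name: "chaos-config"       # hypothetical configmap
              mountPath: "/mnt/config"   # assumed key; hypothetical path
          secrets:
            - name: "cloud-credentials"  # hypothetical secret
              mountPath: "/tmp/creds"    # assumed key; hypothetical path
```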
The .components.runner.experimentImagePullSecrets allows developers to specify the imagePullSecret name for ChaosExperiment.
Field: .spec.experiments[].spec.components.nodeSelector
Description: Provide the node selector for the experiment pod
Type: Optional
Range: Labels in the form key=value
Default: n/a
Notes: The experiment[].spec.components.nodeSelector contains the labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node-level chaos.
Field: .spec.experiments[].spec.components.statusCheckTimeouts
Description: Provides the timeout and retry values for the status checks. Defaults to a 180s timeout and 90 retries (2s per retry)
Type: Optional
Range: It contains values in the form delay: int, timeout: int
Default: delay: 2s and timeout: 180s
Notes: The experiment[].spec.components.statusCheckTimeouts overrides the status check timeouts defined inside the chaosexperiments. It contains the timeout and delay in seconds.
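An override that doubles the overall timeout while keeping the per-retry delay might be sketched as follows. The experiment name is an illustrative assumption; with a 2s delay and a 360s timeout, the check would be retried up to 180 times.

```yaml
spec:
  experiments:
    - name: pod-delete          # assumed experiment name, for illustration
      spec:
        components:
          statusCheckTimeouts:
            delay: 2            # seconds between retries (documented default)
            timeout: 360        # total seconds; illustrative value, default is 180
```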
Field: .spec.experiments[].spec.components.resources
Description: Specify the resource requirements for the ChaosExperiment pod
Type: Optional
Range: user-defined (type: corev1.ResourceRequirements)
Default: n/a
Notes: The experiment[].spec.components.resources contains the resource requirements for the ChaosExperiment pod, i.e., the resource requests and limits for the pod.
Description: Annotations to be set on the experiment pod that is created
Type: Optional
Range: user-defined (type: label key=value)
Default: n/a
Notes: The .spec.components.experimentAnnotation allows developers to specify custom annotations for the experiment pod.
Field: .spec.experiments[].spec.components.tolerations
Description: Toleration for the experiment pod
Type: Optional
Range: user-defined (type: []corev1.Toleration)
Default: n/a
Notes: The .spec.components.tolerations provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node-level chaos.
Field: .spec.experiments[].spec.probe
Description: Declarative way to define the chaos hypothesis
Type: Optional
Range: user-defined
Default: n/a
Notes: The .probe allows developers to specify the chaos hypothesis. It supports four types: cmdProbe, k8sProbe, httpProbe, promProbe. For more details refer
The ChaosEngine CR is the user-facing CR which helps in binding the application instance with the ChaosExperiment. It defines the Run Policies and also holds the status of your experiment. This CR helps you customize the experiment according to your need since it can override some of the default characteristics/tunables in your experiment CR.