Description
When using the TARGETS environment variable to target standalone pods (specifically Argo Workflow pods that are not managed by a Deployment, StatefulSet, or DaemonSet), the chaos experiment fails with a "no pod found for specified target" error, even though the pod exists and RBAC permissions are correctly configured.
Environment
- Litmus Version: 3.23.0
- Litmus Runner Image: litmuschaos/chaos-runner:3.23.0
- Litmus Go-Runner Image: litmuschaos/go-runner:3.23.0
- Kubernetes Version: AKS
- Experiment: pod-memory-hog
Steps to Reproduce
- Create a ChaosEngine targeting an Argo Workflow pod (standalone pod with no controller):
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: memory-chaos
  namespace: litmus
spec:
  appinfo:
    appns: default
    applabel: workflows.argoproj.io/workflow=wf-yyyy
    appkind: ""
  chaosServiceAccount: litmus
  jobCleanUpPolicy: retain
  engineState: active
  components:
    runner:
      image: litmuschaos/chaos-runner:3.23.0
  experiments:
    - name: pod-memory-hog
      spec:
        components:
          env:
            - name: TOTAL_CHAOS_DURATION
              value: "60"
            - name: MEMORY_CONSUMPTION
              value: "1024"
            - name: TARGETS
              value: "pods:default:wf-yyyy-container-xxxxxxx"
- Verify the target pod exists:
kubectl get pod wf-yyyy-container-xxxxxxx -n default
# Output: Pod exists and is Running
- Verify RBAC permissions:
kubectl auth can-i get pods --as=system:serviceaccount:litmus:litmus -n default
# Output: yes
kubectl auth can-i list pods --as=system:serviceaccount:litmus:litmus -n default
# Output: yes
- Check experiment logs:
kubectl logs pod-memory-hog-xxxxx -n litmus
Expected Behavior
The chaos experiment should successfully target the standalone pod and inject memory chaos.
Actual Behavior
The experiment fails with the following error:
time="2025-12-08T21:14:08Z" level=info msg="The application information is as follows" Targets="[{namespace: default, kind: pods, names: [wf-yyyy-container-xxxxxxx]}]"
time="2025-12-08T21:14:15Z" level=error msg="[Error]: pod memory hog failed, err: could not get target pods\n --- at /litmus-go/chaoslib/litmus/stress-chaos/lib/stress-chaos.go:69 (PrepareAndInjectStressChaos) ---\nCaused by: could not get target pods when TARGET_PODS env not set\n --- at /litmus-go/pkg/utils/common/pods.go:186 (GetPodList) ---\nCaused by: could not get pods from workloads\n --- at /litmus-go/pkg/utils/common/pods.go:305 (GetTargetPodsWhenTargetPodsENVNotSet) ---\nCaused by: {\"errorCode\":\"TARGET_SELECTION_ERROR\",\"reason\":\"no pod found for specified target\",\"target\":\"{namespace: default, kind: pods, name: wf-yyyy-container-xxxxxxx}\"}"
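To make the mapping between the TARGETS value and the fields in the log line above explicit, here is a minimal shell sketch that splits a single `kind:namespace:name` entry. The function name `split_target` is hypothetical, and this is not the Litmus parser; it only handles the single-entry form used in this report.

```shell
# Hypothetical helper (not part of Litmus): split one TARGETS entry of the
# form kind:namespace:name into its three fields.
split_target() (
  entry="$1"
  kind="${entry%%:*}"        # text before the first colon
  rest="${entry#*:}"         # text after the first colon
  namespace="${rest%%:*}"    # text before the second colon
  name="${rest#*:}"          # everything after the second colon
  printf 'kind=%s namespace=%s name=%s\n' "$kind" "$namespace" "$name"
)
```

Calling `split_target "pods:default:wf-yyyy-container-xxxxxxx"` prints `kind=pods namespace=default name=wf-yyyy-container-xxxxxxx`, which matches the kind, namespace, and name that the experiment logged before failing.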
Additional Context
Also tried with label selector approach:
- name: TARGETS
  value: "labels:default:workflows.argoproj.io/workflow=wf-yyyy"
Result: Same error - "no target pods found"
RBAC Configuration:
The litmus service account has proper Role and RoleBinding in the target namespace with all necessary permissions (list, get, create, patch, update, delete on pods).
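For anyone reproducing this, the RBAC claim can be checked for every listed verb in one pass. The sketch below wraps the same `kubectl auth can-i` call used in the reproduction steps; the function name `check_pod_rbac` and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper: report whether a service account may perform each
# verb mentioned in this report on pods in the target namespace.
check_pod_rbac() (
  sa_namespace="$1"   # namespace of the service account, e.g. litmus
  sa_name="$2"        # name of the service account, e.g. litmus
  target_ns="$3"      # namespace of the target pod, e.g. default
  for verb in list get create patch update delete; do
    # kubectl auth can-i prints "yes" or "no" for the impersonated account
    answer="$(kubectl auth can-i "$verb" pods \
      --as="system:serviceaccount:${sa_namespace}:${sa_name}" \
      -n "$target_ns" 2>/dev/null)"
    printf '%s pods: %s\n' "$verb" "${answer:-unknown}"
  done
)
```

For the setup described here, the call would be `check_pod_rbac litmus litmus default`, which prints one line per verb.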
Root Cause:
Looking at the error stack trace, it appears the code is trying to "get pods from workloads" even when the target kind is explicitly set to pods. This suggests the experiment is incorrectly attempting to query workload resources (Deployments, StatefulSets, etc.) instead of directly fetching the pod by name.
Use Case
Our use case involves running chaos experiments on Argo Workflow pods during workflow execution. These pods are standalone (no Deployment/StatefulSet controller) and are labeled with workflows.argoproj.io/workflow=<workflow-name>.
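Since the report describes these pods as having no Deployment/StatefulSet controller, it may help to record which pods the label selector matches and what, if anything, each pod lists under `metadata.ownerReferences`. The following is a sketch only; `list_pod_owners` is a hypothetical name, and whether Litmus consults owner references during target selection is not confirmed here.

```shell
# Hypothetical helper: print each pod matching a label selector together
# with the kinds of its ownerReferences (blank if the pod has none).
list_pod_owners() (
  namespace="$1"
  selector="$2"
  kubectl get pods -n "$namespace" -l "$selector" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.ownerReferences[*].kind}{"\n"}{end}'
)
```

With the values from this report, the call would be `list_pod_owners default "workflows.argoproj.io/workflow=wf-yyyy"`. An empty result would suggest the selector itself matches nothing, while a non-empty one shows the pods and any owner kinds recorded on them.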
Questions
- Are there any workarounds for targeting standalone pods?
- Should standalone pods (without workload controllers) be supported, or is this a known limitation?