Configuring DaemonSets

Headwind fully supports DaemonSets, using the same annotation-based configuration as Deployments and StatefulSets. A DaemonSet ensures that all (or a selected subset of) nodes run a copy of a pod, which makes it well suited to node-level services.

Why DaemonSets?

Use DaemonSets for applications that need to run on every node:

Node monitoring: Prometheus Node Exporter, Datadog agents, New Relic
Log collection: Fluentd, Fluent Bit, Logstash
Network plugins: Calico, Weave, Cilium
Storage daemons: Ceph, GlusterFS
Security agents: Falco, Aqua Security

Supported Annotations

DaemonSets support the exact same annotations as Deployments and StatefulSets:

Annotation	Type	Default	Description
`headwind.sh/policy`	string	`none`	Update policy: `none`, `patch`, `minor`, `major`, `all`, `glob`, `force`
`headwind.sh/pattern`	string	-	Glob pattern (required for `glob` policy)
`headwind.sh/require-approval`	boolean	`true`	Whether updates require manual approval
`headwind.sh/min-update-interval`	integer	`300`	Minimum seconds between updates
`headwind.sh/images`	string	-	Comma-separated list of images to track
`headwind.sh/auto-rollback`	boolean	`false`	Enable automatic rollback on failures
`headwind.sh/rollback-timeout`	integer	`300`	Health check monitoring duration (seconds)
`headwind.sh/health-check-retries`	integer	`3`	Failed health checks before rollback
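
Annotation values are always strings, so booleans and integers from the table have to be parsed by whatever reads them. The sketch below shows one way to turn a DaemonSet's annotations into typed settings with the defaults listed above. It is illustrative only: `UpdateSettings` and `parse_settings` are hypothetical names, not part of Headwind, and Headwind's own parsing and validation may differ.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "headwind.sh/"
VALID_POLICIES = {"none", "patch", "minor", "major", "all", "glob", "force"}


@dataclass(frozen=True)
class UpdateSettings:
    # Defaults mirror the "Default" column of the annotation table.
    policy: str = "none"
    pattern: Optional[str] = None
    require_approval: bool = True
    min_update_interval: int = 300
    images: tuple = ()
    auto_rollback: bool = False
    rollback_timeout: int = 300
    health_check_retries: int = 3


def parse_settings(annotations: Mapping[str, str]) -> UpdateSettings:
    """Build typed settings from annotation strings, falling back to defaults."""
    def get(name, default):
        return annotations.get(PREFIX + name, default)

    policy = get("policy", "none")
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    pattern = get("pattern", None)
    if policy == "glob" and not pattern:
        raise ValueError("the glob policy requires headwind.sh/pattern")
    images = tuple(i.strip() for i in get("images", "").split(",") if i.strip())
    return UpdateSettings(
        policy=policy,
        pattern=pattern,
        require_approval=get("require-approval", "true").lower() == "true",
        min_update_interval=int(get("min-update-interval", "300")),
        images=images,
        auto_rollback=get("auto-rollback", "false").lower() == "true",
        rollback_timeout=int(get("rollback-timeout", "300")),
        health_check_retries=int(get("health-check-retries", "3")),
    )
```

Passing an empty mapping returns the table defaults, which is a quick way to check what an unannotated DaemonSet would get under these assumptions.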

Basic Configuration

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
  annotations:
    # Allow minor version updates
    headwind.sh/policy: "minor"

    # Auto-update without approval (monitoring agent)
    headwind.sh/require-approval: "false"

    # At most one update every 12 hours
    headwind.sh/min-update-interval: "43200"
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: node-exporter
        image: prom/node-exporter:v1.5.0
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        ports:
        - containerPort: 9100
          name: metrics
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
        - name: root
          mountPath: /host/root
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /
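
The `headwind.sh/min-update-interval` value above is given in seconds and, like every annotation value, must be a string. A small helper makes the conversion less error-prone than doing the arithmetic by hand. `interval_annotation` is a hypothetical name used only for this illustration.

```python
from datetime import timedelta


def interval_annotation(**duration) -> str:
    """Return a duration as the whole-second string an annotation value needs."""
    return str(int(timedelta(**duration).total_seconds()))


print(interval_annotation(hours=12))  # 43200
print(interval_annotation(days=7))    # 604800
```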

Update Workflow

When a new image version is detected:

1. Detection: Headwind detects the new image via webhook or polling
2. Policy Check: Validates the new version against the configured policy
3. Interval Check: Ensures the minimum update interval has elapsed
4. UpdateRequest: Creates an UpdateRequest resource (if approval is required)
5. Approval: Waits for approval via the API (if required)
6. Application: Updates the DaemonSet spec
7. Rolling Update: Kubernetes updates pods node-by-node
8. Notification: Sends notifications
9. History: Records the update in the DaemonSet's annotations

Note: DaemonSet updates follow Kubernetes' rolling update strategy: pods are replaced node-by-node, and at most maxUnavailable pods are down at any moment. The remaining nodes stay covered throughout the rollout.
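
The policy check and interval check in the workflow above can be sketched as two small functions. This is a minimal illustration, not Headwind's implementation. It assumes the conventional semver reading of the policies (`patch` stays within the same major.minor, `minor` stays within the same major, `major` accepts any newer version) and does not model `all`, `glob`, or `force`. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_version(tag: str) -> tuple:
    """Parse 'v1.5.0' or '1.5.0' into (1, 5, 0)."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def policy_allows(policy: str, current: str, candidate: str) -> bool:
    """Assumed semver semantics for the patch, minor, and major policies."""
    cur, new = parse_version(current), parse_version(candidate)
    if new <= cur:
        return False  # never move sideways or backwards
    if policy == "patch":
        return new[:2] == cur[:2]
    if policy == "minor":
        return new[0] == cur[0]
    if policy == "major":
        return True
    return False  # "none" and any policy this sketch does not model


def interval_elapsed(last_update: datetime, now: datetime, min_seconds: int) -> bool:
    """True once at least min_seconds have passed since the last update."""
    return now - last_update >= timedelta(seconds=min_seconds)


last = datetime(2025, 11, 6, 10, 30, tzinfo=timezone.utc)
now = last + timedelta(hours=13)
ok = policy_allows("minor", "v1.5.0", "v1.6.0") and interval_elapsed(last, now, 43200)
print(ok)  # True
```

Running both checks before creating an UpdateRequest means a version that the policy rejects never reaches the approval step.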

Log Collection Example

Fluent Bit for cluster-wide log collection:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  annotations:
    # Only patch versions (bug fixes)
    headwind.sh/policy: "patch"

    # Require approval for production logs
    headwind.sh/require-approval: "true"

    # Wait 24 hours between updates
    headwind.sh/min-update-interval: "86400"

    # Enable auto-rollback
    headwind.sh/auto-rollback: "true"
    headwind.sh/rollback-timeout: "600"
spec:
  selector:
    matchLabels:
      app: fluent-bit
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # Update one node at a time
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:2.0.0
        ports:
        - containerPort: 2020
          name: metrics
        readinessProbe:
          httpGet:
            path: /api/v1/health
            port: 2020
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /api/v1/health
            port: 2020
          periodSeconds: 30
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config

Network Plugin Example

Calico CNI plugin with conservative updates:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: calico-node
  namespace: kube-system
  annotations:
    # Only patch versions (critical fixes only)
    headwind.sh/policy: "patch"

    # Always require approval (networking is critical)
    headwind.sh/require-approval: "true"

    # Wait 7 days between updates
    headwind.sh/min-update-interval: "604800"

    # Auto-rollback on failure
    headwind.sh/auto-rollback: "true"
    headwind.sh/rollback-timeout: "300"
    headwind.sh/health-check-retries: "2"
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # Very conservative
  template:
    metadata:
      labels:
        k8s-app: calico-node
    spec:
      hostNetwork: true
      serviceAccountName: calico-node
      containers:
      - name: calico-node
        image: calico/node:v3.25.0
        env:
        - name: DATASTORE_TYPE
          value: kubernetes
        - name: WAIT_FOR_DATASTORE
          value: "true"
        - name: NODENAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CALICO_NETWORKING_BACKEND
          value: bird
        - name: CLUSTER_TYPE
          value: k8s,bgp
        securityContext:
          privileged: true
        readinessProbe:
          exec:
            command:
            - /bin/calico-node
            - -felix-ready
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /liveness
            port: 9099
          periodSeconds: 10
          initialDelaySeconds: 10
        volumeMounts:
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
        - name: var-run-calico
          mountPath: /var/run/calico
        - name: var-lib-calico
          mountPath: /var/lib/calico
      volumes:
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: var-run-calico
        hostPath:
          path: /var/run/calico
      - name: var-lib-calico
        hostPath:
          path: /var/lib/calico

Monitoring Agent Example

Datadog agent with automatic updates:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog-agent
  namespace: monitoring
  annotations:
    # Allow minor versions (new features)
    headwind.sh/policy: "minor"

    # Auto-update (monitoring can self-heal)
    headwind.sh/require-approval: "false"

    # At most one update every 6 hours
    headwind.sh/min-update-interval: "21600"

    # Enable auto-rollback
    headwind.sh/auto-rollback: "true"
spec:
  selector:
    matchLabels:
      app: datadog-agent
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 10%  # Update 10% of nodes at once
  template:
    metadata:
      labels:
        app: datadog-agent
    spec:
      serviceAccountName: datadog-agent
      containers:
      - name: agent
        image: datadog/agent:7.42.0
        env:
        - name: DD_API_KEY
          valueFrom:
            secretKeyRef:
              name: datadog-secret
              key: api-key
        - name: DD_KUBERNETES_KUBELET_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: DD_LOGS_ENABLED
          value: "true"
        - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
          value: "true"
        - name: DD_PROCESS_AGENT_ENABLED
          value: "true"
        readinessProbe:
          httpGet:
            path: /ready
            port: 5555
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /live
            port: 5555
          periodSeconds: 10
        volumeMounts:
        - name: dockersocket
          mountPath: /var/run/docker.sock
          readOnly: true
        - name: procdir
          mountPath: /host/proc
          readOnly: true
        - name: cgroups
          mountPath: /host/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: dockersocket
        hostPath:
          path: /var/run/docker.sock
      - name: procdir
        hostPath:
          path: /proc
      - name: cgroups
        hostPath:
          path: /sys/fs/cgroup

Security Agent Example

Falco security monitoring with glob pattern:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: falco
  namespace: security
  annotations:
    # Only stable releases
    headwind.sh/policy: "glob"
    headwind.sh/pattern: "*-stable"

    # Require approval
    headwind.sh/require-approval: "true"

    # Wait 3 days between updates
    headwind.sh/min-update-interval: "259200"
spec:
  selector:
    matchLabels:
      app: falco
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: falco
    spec:
      hostNetwork: true
      hostPID: true
      serviceAccountName: falco
      containers:
      - name: falco
        image: falcosecurity/falco:0.34.0-stable
        args:
        - /usr/bin/falco
        - --cri
        - /run/containerd/containerd.sock
        - -K
        - /var/run/secrets/kubernetes.io/serviceaccount/token
        - -k
        - https://kubernetes.default
        - -pk
        securityContext:
          privileged: true
        volumeMounts:
        - name: dev
          mountPath: /host/dev
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: boot
          mountPath: /host/boot
          readOnly: true
        - name: lib-modules
          mountPath: /host/lib/modules
          readOnly: true
        - name: usr
          mountPath: /host/usr
          readOnly: true
        - name: etc
          mountPath: /host/etc
          readOnly: true
      volumes:
      - name: dev
        hostPath:
          path: /dev
      - name: proc
        hostPath:
          path: /proc
      - name: boot
        hostPath:
          path: /boot
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: usr
        hostPath:
          path: /usr
      - name: etc
        hostPath:
          path: /etc
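
To get a feel for which tags a pattern such as `*-stable` would select, shell-style matching from Python's standard library is a reasonable approximation. This assumes Headwind's glob matching behaves like ordinary shell globs; its exact rules (for example, character classes or case sensitivity) may differ. `tag_matches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def tag_matches(pattern: str, tag: str) -> bool:
    """Case-sensitive shell-style match of an image tag against a pattern."""
    return fnmatchcase(tag, pattern)


candidates = ["0.34.1-stable", "0.35.0-rc1", "0.35.0", "latest"]
print([t for t in candidates if tag_matches("*-stable", t)])
# ['0.34.1-stable']
```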

Node-Specific DaemonSets

Run DaemonSets only on specific nodes using nodeSelector:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-driver
  namespace: kube-system
  annotations:
    headwind.sh/policy: "patch"
    headwind.sh/require-approval: "true"
spec:
  selector:
    matchLabels:
      app: gpu-driver
  template:
    metadata:
      labels:
        app: gpu-driver
    spec:
      nodeSelector:
        gpu: "true"  # Only run on GPU nodes
      containers:
      - name: nvidia-driver
        image: nvidia/driver:515.48.07
        securityContext:
          privileged: true

Or use affinity for more complex node selection:

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                - c5.large
                - c5.xlarge
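
When a DaemonSet pod does not land on a node you expected, it helps to reason about the selection rules directly. The sketch below mirrors how `nodeSelector` and required node affinity are evaluated against node labels. It is a simplified illustration: it covers the `In`, `NotIn`, `Exists`, and `DoesNotExist` operators only, ignores `matchFields`, and ignores taints. All function names are hypothetical.

```python
def matches_node_selector(labels: dict, node_selector: dict) -> bool:
    """nodeSelector: every key/value pair must be present on the node."""
    return all(labels.get(k) == v for k, v in node_selector.items())


def matches_expression(labels: dict, expr: dict) -> bool:
    """Evaluate a single matchExpressions entry against node labels."""
    key, op, values = expr["key"], expr["operator"], expr.get("values", [])
    if op == "In":
        return labels.get(key) in values
    if op == "NotIn":
        return labels.get(key) not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not covered by this sketch: {op}")


def matches_required_affinity(labels: dict, terms: list) -> bool:
    """Terms are ORed together; expressions inside one term are ANDed."""
    for term in terms:
        exprs = term.get("matchExpressions", [])
        if exprs and all(matches_expression(labels, e) for e in exprs):
            return True
    return False


node = {"gpu": "true", "node.kubernetes.io/instance-type": "c5.large"}
terms = [{"matchExpressions": [{
    "key": "node.kubernetes.io/instance-type",
    "operator": "In",
    "values": ["c5.large", "c5.xlarge"],
}]}]
print(matches_node_selector(node, {"gpu": "true"}), matches_required_affinity(node, terms))
# True True
```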

Update Strategy Considerations

Conservative Rolling Updates

For critical infrastructure (networking, security):

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # One node at a time

Faster Rolling Updates

For monitoring and logging (can tolerate brief gaps):

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%  # Quarter of nodes simultaneously
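
A percentage is resolved against the number of DaemonSet pods at the start of the update, and for DaemonSets Kubernetes rounds the result up. The helper below shows what a given setting means in pods for clusters of different sizes. `resolve_max_unavailable` is a hypothetical name for this illustration.

```python
def resolve_max_unavailable(value, scheduled_pods: int) -> int:
    """Turn a maxUnavailable value (int or 'N%') into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division: DaemonSet percentages round up.
        return (scheduled_pods * percent + 99) // 100
    return int(value)


for nodes in (3, 10, 50):
    print(nodes, resolve_max_unavailable("25%", nodes))
# 3 1
# 10 3
# 50 13
```

Because of the rounding, a percentage never resolves to zero on a small cluster, but it can take out a larger share of nodes than the nominal figure suggests (3 of 10 for 25%).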

OnDelete Strategy

Manual control over pod updates:

spec:
  updateStrategy:
    type: OnDelete  # Pods updated only when manually deleted

Warning: With the OnDelete strategy, Headwind still updates the DaemonSet spec, but Kubernetes does not recreate any pod until you delete it manually. This gives you maximum control over when each node is updated, at the cost of manual intervention.

Private Registry Support

DaemonSets work with private registries using imagePullSecrets:

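apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: private-agent
  annotations:
    headwind.sh/policy: "minor"
spec:
  # selector and matching template labels are required for a valid DaemonSet
  selector:
    matchLabels:
      app: private-agent
  template:
    metadata:
      labels:
        app: private-agent
    spec:
      # Pull secret must exist in the same namespace as the DaemonSet
      imagePullSecrets:
      - name: registry-credentials
      containers:
      - name: agent
        image: myregistry.com/agent:1.0.0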

Monitoring Updates

View Update History

# Get update history from annotations
kubectl get daemonset node-exporter -n monitoring \
  -o jsonpath='{.metadata.annotations.headwind\.sh/update-history}' | jq

# Example output
[
  {
    "container": "node-exporter",
    "image": "prom/node-exporter:v1.5.0",
    "timestamp": "2025-11-06T10:30:00Z",
    "updateRequestName": "node-exporter-update-v1-5-0",
    "approvedBy": "webhook"
  },
  {
    "container": "node-exporter",
    "image": "prom/node-exporter:v1.4.0",
    "timestamp": "2025-10-20T14:15:00Z",
    "updateRequestName": "node-exporter-update-v1-4-0",
    "approvedBy": "admin@example.com"
  }
]
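
Since the history is plain JSON, it is easy to post-process. The sketch below picks the most recent entry for one container. It assumes the field names shown in the example output above and that timestamps are ISO 8601 in UTC. The helper names are hypothetical.

```python
import json
from datetime import datetime

history_json = """
[
  {"container": "node-exporter", "image": "prom/node-exporter:v1.5.0",
   "timestamp": "2025-11-06T10:30:00Z", "approvedBy": "webhook"},
  {"container": "node-exporter", "image": "prom/node-exporter:v1.4.0",
   "timestamp": "2025-10-20T14:15:00Z", "approvedBy": "admin@example.com"}
]
"""


def parse_timestamp(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_update(raw: str, container: str) -> dict:
    """Return the newest history entry for the given container."""
    entries = [e for e in json.loads(raw) if e["container"] == container]
    return max(entries, key=lambda e: parse_timestamp(e["timestamp"]))


print(latest_update(history_json, "node-exporter")["image"])
# prom/node-exporter:v1.5.0
```

In practice `history_json` would be the output of the kubectl command above rather than an inline string.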

Check UpdateRequests

# List UpdateRequests that target DaemonSets
kubectl get updaterequests -A -o json | \
  jq '.items[] | select(.spec.targetRef.kind == "DaemonSet")'

Monitor Pod Updates

Watch DaemonSet rollout progress:

# Check rollout status
kubectl rollout status daemonset/fluent-bit -n logging

# Watch pod updates across nodes
kubectl get pods -n logging -l app=fluent-bit -o wide --watch

# Check how many nodes are running updated pods (UP-TO-DATE column)
kubectl get daemonset fluent-bit -n logging
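
The same information is available in machine-readable form in the DaemonSet's `status` (for example via `kubectl get daemonset fluent-bit -n logging -o json`). The sketch below summarizes rollout progress from that status object using the standard `desiredNumberScheduled`, `updatedNumberScheduled`, and `numberAvailable` fields. It is a simplification: it does not compare `observedGeneration`, which `kubectl rollout status` also checks. `rollout_progress` is a hypothetical name.

```python
def rollout_progress(status: dict) -> dict:
    """Summarize a DaemonSet rollout from its .status fields."""
    desired = status.get("desiredNumberScheduled", 0)
    updated = status.get("updatedNumberScheduled", 0)
    available = status.get("numberAvailable", 0)
    return {
        "desired": desired,
        "updated": updated,
        "available": available,
        # Done when every scheduled pod runs the new spec and is available.
        "complete": updated == desired and available == desired,
    }


in_progress = {"desiredNumberScheduled": 5, "updatedNumberScheduled": 3, "numberAvailable": 4}
print(rollout_progress(in_progress)["complete"])  # False
```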

Metrics

Monitor DaemonSet updates with Prometheus:

# DaemonSets being watched
headwind_daemonsets_watched

# Updates applied to DaemonSets
headwind_updates_applied_total{kind="DaemonSet"}

# Pending updates for DaemonSets
headwind_updates_pending{kind="DaemonSet"}

# Rollback operations for DaemonSets
headwind_rollbacks_total{kind="DaemonSet"}

Best Practices

1. Conservative Policies for Critical Infrastructure

For networking, security, and storage:

annotations:
  headwind.sh/policy: "patch"  # Patch releases only (bug and security fixes)
  headwind.sh/require-approval: "true"  # Always require approval
  headwind.sh/min-update-interval: "604800"  # Wait 1 week

2. More Permissive for Observability

For monitoring and logging:

annotations:
  headwind.sh/policy: "minor"  # Allow feature updates
  headwind.sh/require-approval: "false"  # Auto-update
  headwind.sh/min-update-interval: "21600"  # Wait 6 hours

3. Enable Auto-Rollback

Always enable for production DaemonSets:

annotations:
  headwind.sh/auto-rollback: "true"
  headwind.sh/rollback-timeout: "600"  # 10 minutes
  headwind.sh/health-check-retries: "2"
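
To make the interaction between the two rollback settings concrete, here is one plausible reading expressed as code: health is observed for `rollback-timeout` seconds after an update, and a rollback is triggered once `health-check-retries` checks in a row have failed. This is an assumption for illustration only. Headwind may count failures differently (for example, cumulative rather than consecutive), and `should_roll_back` is a hypothetical function.

```python
def should_roll_back(checks, timeout_seconds: int, max_failures: int) -> bool:
    """checks: iterable of (seconds_since_update, healthy) in time order."""
    consecutive_failures = 0
    for elapsed, healthy in checks:
        if elapsed > timeout_seconds:
            break  # the monitoring window is over
        consecutive_failures = 0 if healthy else consecutive_failures + 1
        if consecutive_failures >= max_failures:
            return True
    return False


observed = [(10, True), (20, False), (30, False)]
print(should_roll_back(observed, timeout_seconds=600, max_failures=2))  # True
```

Under this reading, a lower retry count reacts faster but is more sensitive to a single slow-starting pod.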

4. Configure Proper Update Strategy

Match maxUnavailable to your tolerance:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # Critical: 1 node at a time
      maxUnavailable: 1

      # OR non-critical: 25% of nodes (set only one maxUnavailable key)
      # maxUnavailable: 25%

5. Use Health Checks

Essential for automatic rollback:

readinessProbe:
  httpGet:
    path: /ready
    port: 9090
  periodSeconds: 10

livenessProbe:
  httpGet:
    path: /health
    port: 9090
  periodSeconds: 30

6. Test Node Coverage

After updates, verify all nodes are covered:

# Check number of desired vs current pods
kubectl get daemonset -n monitoring

# Verify pods on each node
kubectl get pods -n monitoring -o wide | grep node-exporter

# Count nodes
kubectl get nodes --no-headers | wc -l
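
Comparing counts by eye gets tedious on larger clusters. The sketch below takes node names and a pod list (as returned in `.items` by `kubectl get pods -o json`) and reports nodes without a Running pod. `uncovered_nodes` is a hypothetical helper, and it deliberately ignores node selectors and taints, so nodes that are intentionally excluded will also be listed.

```python
def uncovered_nodes(node_names, pods) -> list:
    """Return nodes that have no Running pod from the given pod list."""
    covered = {
        p["spec"]["nodeName"]
        for p in pods
        if p.get("status", {}).get("phase") == "Running"
        and "nodeName" in p.get("spec", {})
    }
    return sorted(set(node_names) - covered)


pods = [
    {"spec": {"nodeName": "node-1"}, "status": {"phase": "Running"}},
    {"spec": {"nodeName": "node-2"}, "status": {"phase": "Pending"}},
]
print(uncovered_nodes(["node-1", "node-2", "node-3"], pods))
# ['node-2', 'node-3']
```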

7. Environment-Specific Policies

Different policies per environment:

# Production - very conservative
headwind.sh/policy: "patch"
headwind.sh/require-approval: "true"
headwind.sh/min-update-interval: "604800"  # 1 week

# Development - permissive
headwind.sh/policy: "all"
headwind.sh/require-approval: "false"
headwind.sh/min-update-interval: "3600"  # 1 hour

Troubleshooting

DaemonSet Not Updating

Check if pods are running on all nodes:

# Get DaemonSet status
kubectl get daemonset -n monitoring

# Check for pod scheduling issues
kubectl get pods -n monitoring -o wide

# Look for node taints or constraints
kubectl describe nodes | grep -A 5 Taints

Pod Stuck on Node

A rolling update waits for each replaced pod to become Ready before it moves on, so a single unhealthy pod can stall the rollout:

# Check specific pod
kubectl describe pod fluent-bit-xyz -n logging

# Check pod logs
kubectl logs fluent-bit-xyz -n logging

# Check node conditions
kubectl describe node node-1 | grep Conditions -A 10

Update Too Slow

If maxUnavailable: 1 is too slow, raise it:

spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 3  # Update 3 nodes at once

Or use percentage:

maxUnavailable: 10%  # 10% of nodes
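
A rough way to compare settings is to count how many sequential batches a rollout needs. The estimate below is a lower bound under simplifying assumptions: every pod becomes Ready on the first attempt and the node count does not change. It rounds percentages up, as Kubernetes does for DaemonSets. `rollout_batches` is a hypothetical name.

```python
def rollout_batches(node_count: int, max_unavailable) -> int:
    """Lower bound on sequential batches for a rolling update."""
    if isinstance(max_unavailable, str) and max_unavailable.endswith("%"):
        percent = int(max_unavailable[:-1])
        step = (node_count * percent + 99) // 100  # round up
    else:
        step = int(max_unavailable)
    step = max(step, 1)
    return (node_count + step - 1) // step  # ceiling division


for setting in (1, 3, "10%"):
    print(setting, rollout_batches(60, setting))
# 1 60
# 3 20
# 10% 10
```

Multiplying the batch count by the typical time a pod needs to become Ready gives a first estimate of total rollout duration.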

Node Not Getting Updated

Check node selectors and taints:

# Check DaemonSet node selector
kubectl get daemonset fluent-bit -n logging -o jsonpath='{.spec.template.spec.nodeSelector}'

# Check node labels
kubectl get nodes --show-labels

# Check node taints
kubectl describe nodes | grep Taints

Event Sources

Control how Headwind detects updates for this DaemonSet:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent
  annotations:
    headwind.sh/policy: "patch"
    # Use webhooks (default, fastest)
    headwind.sh/event-source: "webhook"

    # Or use polling
    # headwind.sh/event-source: "polling"
    # headwind.sh/polling-interval: "900"  # Poll every 15 minutes

See Event Sources for detailed configuration options.

Next Steps

Update Policies - Understand semantic versioning policies
Configure Event Sources - Webhooks vs polling
Approval Workflow - Configure approval process
Rollback Configuration - Set up automatic rollback
Notifications - Configure Slack/Teams notifications
