Control plane and worker nodes on vSphere Kubernetes Service that run Ubuntu 22.04 or 24.04 are confirmed to be affected by CVE-2026-31431 (“Copy Fail”), but only under certain circumstances. Nodes running Photon OS are not affected.
vSphere Kubernetes Service
A local privilege escalation (LPE) vulnerability affecting the Linux kernel was publicly disclosed on April 29, 2026. The vulnerability has been assigned CVE-2026-31431 and is referred to as Copy Fail. The affected component is the kernel module algif_aead, which provides hardware-accelerated cryptographic functions.
Possible Impact on Deployed Applications
No software package in vSphere Kubernetes Service uses algif_aead, and Broadcom is not currently aware of any container workloads that use this kernel module. The module is most frequently used for wireless connectivity and strongSwan. Even when Antrea is used with IPsec encryption enabled, AEAD for ESP is handled in userspace via OpenSSL, and the kernel module is not loaded.
Before applying the mitigation below, ensure that deployed applications do not depend on the kernel module algif_aead.
Note: AF_ALG sockets tend to be transient and are open only for a very short time. Nevertheless, an entry may be visible when running the following:
sudo lsof -nP 2>/dev/null | awk 'NR==1 || /protocol: ALG/'
If applications that use the kernel module algif_aead are identified, it may be necessary to contact the vendor for an alternative or to wait for the updated kernel patches.
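Because AF_ALG sockets are usually short-lived, a single lsof run can miss them. The following is a minimal sketch that repeats the lsof check above several times and prints each distinct match once. The function names, the sample count, and the interval are illustrative assumptions and not part of any Broadcom tooling.

```shell
#!/bin/sh
# Keep only lsof lines that mention an AF_ALG socket
# (same pattern as the lsof command above, without the header line).
filter_af_alg() {
  awk '/protocol: ALG/'
}

# Hypothetical helper: run lsof repeatedly and print each distinct match once.
# $1 = number of samples (default 30), $2 = seconds between samples (default 1).
sample_af_alg() {
  samples="${1:-30}"
  interval="${2:-1}"
  i=0
  while [ "$i" -lt "$samples" ]; do
    sudo lsof -nP 2>/dev/null | filter_af_alg
    sleep "$interval"
    i=$((i + 1))
  done | sort -u
}
```

For example, `sample_af_alg 60 1` samples for roughly one minute. An empty result does not prove that the module is unused; it only means that no AF_ALG socket was open at any sampling instant.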
New VKRs will be published in due course with the following Ubuntu package versions:
| Ubuntu Release | Linux Kernel Version | kmod package version |
| Ubuntu 22.04 | To be determined | 29-1ubuntu1.1 |
| Ubuntu 24.04 | To be determined | 31+20240202-2ubuntu7.2 |
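Once the new VKRs are available, the kmod package on a node can be compared against the versions in the table above. The following is a minimal sketch, assuming that dpkg and dpkg-query are available on the Ubuntu node and that the listed version is the minimum to look for. The function names are hypothetical, and the kernel version is not checked because it is still to be determined.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2,
# using Debian version ordering.
kmod_at_least() {
  dpkg --compare-versions "$1" ge "$2"
}

# Hypothetical helper: look up the installed kmod version and compare it
# with the version passed as $1 (taken from the table above).
check_kmod() {
  installed=$(dpkg-query -W -f='${Version}' kmod) || return 2
  if kmod_at_least "$installed" "$1"; then
    echo "kmod $installed: at or above $1"
  else
    echo "kmod $installed: below $1"
  fi
}
```

On an Ubuntu 22.04 node this could be called as `check_kmod 29-1ubuntu1.1`, and on Ubuntu 24.04 as `check_kmod 31+20240202-2ubuntu7.2`.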
NOTE: Broadcom strongly recommends testing this mitigation in a non-production environment first. Carefully review the section "Possible Impact on Deployed Applications" above.
To patch an existing workload cluster with the manual mitigation, first identify the pause image currently used by the cluster to be patched.
chmod +x generate-daemonset-yaml-disable-algif_aead.sh
./generate-daemonset-yaml-disable-algif_aead.sh --namespace <namespace> --cluster <workload-cluster>
kubectl apply -f copyfail-mitigation.yaml
Repeat these steps for every workload cluster that is to be mitigated.
Deleting the DaemonSet from the workload cluster stops the mitigation from being applied to new nodes of that cluster. However, it does not remove the /etc/modprobe.d/manual-disable-algif_aead.conf file already written to each node. This file persists until it is explicitly deleted or until the VKS node is re-provisioned.
kubectl delete daemonset disable-algif-aead -n kube-system
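If the mitigation is to be reverted completely, the blocklist file also has to be deleted on each existing node. The following is a minimal sketch, assuming that kubectl debug is permitted in the environment and reusing the node debug pattern shown elsewhere in this article. The function name is hypothetical. Removing the file allows algif_aead to be loaded again, so this is only appropriate once the mitigation is no longer wanted.

```shell
#!/bin/sh
# Hypothetical helper: delete the blocklist file on every node through a
# node debug pod. The host filesystem is mounted at /host in the debug pod.
remove_blocklist_file() {
  for node in $(kubectl get nodes -o name); do
    kubectl debug "$node" -q --image=busybox -- \
      chroot /host rm -f /etc/modprobe.d/manual-disable-algif_aead.conf
  done
}
```

The function only defines the loop; call `remove_blocklist_file` with the kubeconfig context of the workload cluster in question.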
Under normal operation, the VKS nodes do not need a reboot. A reboot is recommended only if the kernel module is loaded, which indicates that an application has used it or is still using it.
grep -qE '^algif_aead ' /proc/modules && echo "Affected module is loaded" || echo "Affected module is NOT loaded"
While it is possible to attempt unloading the relevant kernel module using rmmod, Broadcom does not recommend this approach if it can be avoided. Force-unloading a kernel module that is actively in use may negatively impact kernel stability and applications relying on the module.
Hence, Broadcom recommends applying the mitigation above and rebooting the node. However, if the kernel module is required by any workload, review the section "Possible Impact on Deployed Applications" above.
If a reboot is not immediately possible, a live unload with rmmod can be attempted on nodes where the mitigation has been applied:
sudo rmmod algif_aead
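Before attempting a live unload, it can help to check whether anything still holds a reference to the module. The following is a minimal sketch that reads the use count (third column) from /proc/modules. The function names are hypothetical, and a use count of zero is only an indication that the unload is likely to succeed, not a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: print the use count of a module.
# $1 = module name, $2 = file to read (default /proc/modules).
# Exit status is 1 if the module is not listed, i.e. not loaded.
module_use_count() {
  awk -v m="$1" '$1 == m { print $3; found = 1 } END { exit !found }' \
    "${2:-/proc/modules}"
}

# Hypothetical helper: succeed only if the module is loaded and unused.
safe_to_unload() {
  count=$(module_use_count "$1" "${2:-/proc/modules}") || return 1
  [ "$count" -eq 0 ]
}
```

A possible use is `safe_to_unload algif_aead && sudo rmmod algif_aead`, which skips the unload when the module is not loaded or still has users.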
Note: If nodes are rebooted, it is highly recommended to cordon and drain them first. This reduces disruption to running workloads.
Before rebooting any node, cordon it with kubectl cordon to stop new pods being scheduled, then drain it with kubectl drain --ignore-daemonsets --delete-emptydir-data to evict running workloads gracefully. Draining honors PodDisruptionBudgets and gives stateful workloads (etcd members, databases, message brokers) the chance to fail over or flush state cleanly; skipping this risks data loss, split-brain, and avoidable downtime.
Confirm the drain has completed and the node reports SchedulingDisabled before issuing the reboot. Reboot one node at a time and wait for it to return to Ready before moving on. This is especially important for control plane nodes, where etcd quorum (a majority of members, that is floor(n/2) + 1 of n) must be preserved throughout. Once the node is healthy, uncordon it with kubectl uncordon so workloads can be scheduled again.
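The sequence above can be sketched as a pair of shell functions around the reboot. This is a minimal sketch: the function names and the timeout values are illustrative assumptions, and the reboot itself is left out because how a node is restarted depends on the environment.

```shell
#!/bin/sh
# Hypothetical helper: number of etcd members that must stay healthy
# out of $1 members (integer division, so 3 -> 2 and 5 -> 3).
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Hypothetical helper: stop scheduling on the node and evict its workloads.
# $1 = node name. The 10 minute drain timeout is an illustrative value.
prepare_node_for_reboot() {
  kubectl cordon "$1" &&
    kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --timeout=10m
}

# Hypothetical helper: wait for the node to report Ready, then allow
# scheduling again. The 15 minute timeout is an illustrative value.
return_node_to_service() {
  kubectl wait --for=condition=Ready "node/$1" --timeout=15m &&
    kubectl uncordon "$1"
}
```

A node would be handled by calling `prepare_node_for_reboot <node>`, rebooting it, and then calling `return_node_to_service <node>` before moving on to the next node. A node can still report Ready for a short time after the reboot has been issued, so confirm that it actually went down and came back before relying on the wait.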
The snippet below iterates over all nodes via kubectl debug node/<name>, reads each node's /proc/modules through the host mount, and prints a single line per node. Nodes reporting REBOOT NEEDED still have algif_aead loaded in the running kernel; the blocklist will prevent future loads but the running instance can only be cleared by rmmod or a reboot as recommended above.
for node in $(kubectl get nodes -o name); do
  name="${node#node/}"
  result=$(kubectl debug "$node" -q --image=busybox -- \
    chroot /host sh -c 'grep -qE "^algif_aead " /proc/modules && echo LOADED || echo OK' \
    2>/dev/null | tail -n1)
  case "${result:-ERROR}" in
    LOADED) printf '%-50s %s\n' "$name" "REBOOT NEEDED (algif_aead loaded)" ;;
    OK)     printf '%-50s %s\n' "$name" "OK" ;;
    *)      printf '%-50s %s\n' "$name" "ERROR: could not determine state" ;;
  esac
done
Any node printed as REBOOT NEEDED should be cordoned, drained, and rebooted following the rebooting guidance above. The debug pods exit as soon as the check completes; completed node debug pods can remain listed in the namespace, so delete them afterwards if they are not cleaned up in the environment. If the environment restricts kubectl debug, an equivalent check can be run via SSH or any internal automation tools.
Once new VKR releases containing an appropriate fix are available, the VKS nodes should be updated. Until then, the snippet below can be used to confirm that every node in a cluster has the blocklist file in place. It uses kubectl debug node/<name> to spawn a privileged debug pod on each node and check for the blocklist configuration file:
for node in $(kubectl get nodes -o name); do
  name="${node#node/}"
  result=$(kubectl debug "$node" -q --image=busybox -- \
    chroot /host sh -c 'test -f /etc/modprobe.d/manual-disable-algif_aead.conf && echo OK || echo MISSING' \
    2>/dev/null | tail -n1)
  printf '%-50s %s\n' "$name" "${result:-ERROR}"
done
Nodes reporting MISSING have not had the mitigation applied; re-apply the DaemonSet or apply the manual blocklist directly. The debug pods exit as soon as the check completes; completed node debug pods can remain listed in the namespace, so delete them afterwards if they are not cleaned up in the environment. If the environment restricts kubectl debug, an equivalent check can be run via SSH or any internal automation tools.
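Applying the manual blocklist directly on a node could look like the following minimal sketch. The exact content that the DaemonSet writes is not stated in this article, so the `install algif_aead /bin/false` rule used here is an assumption based on the common modprobe.d technique of making every load attempt fail. The function name and the optional root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe rule that makes any attempt to load
# algif_aead fail. Assumed content; the DaemonSet may write something else.
# $1 = filesystem root prefix (default empty, i.e. the real /etc);
#      a non-empty prefix is only useful for trying the function out.
write_blocklist() {
  root="${1:-}"
  conf="$root/etc/modprobe.d/manual-disable-algif_aead.conf"
  mkdir -p "$root/etc/modprobe.d" &&
    printf 'install algif_aead /bin/false\n' > "$conf"
}
```

On a node this would be run as root with no argument, for example through SSH. The rule only prevents future loads; a module that is already loaded stays loaded until it is unloaded or the node is rebooted.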
More information about the security vulnerability and the impacted VCF products is provided in the KB article Impact Evaluation of CVE‑2026‑31431 ("Copy Fail") of VMware by Broadcom product portfolio.
Should you require further information or support, contact Broadcom Support.
To be notified of any changes, subscribe to this knowledge base article.