2025-10-01, Main
Going for minimal containers with restricted system calls and unprivileged users is the usual Kubernetes approach these days, and it works great for most web apps. However, these restrictions frequently get in the way when developing more complex infrastructure extensions.
While looking for a way to deploy virtiofsd in an unprivileged container for KubeVirt, we stumbled on seccomp notifiers. Seccomp notifiers are a kernel feature that monitors syscalls and sends a notification to a userspace application when a monitored syscall is executed.
Alternative options involved either the use of a custom protocol using UNIX sockets or the deployment of virtiofs as a privileged component alongside the unprivileged VM.
After our evaluation, the seccomp notifier turned out to be the simplest of all the options. Unfortunately, its main constraint is the monitor's resilience across a restart, such as after a crash or an upgrade. This limitation forced us to fall back on one of the less elegant approaches. But there is hope that this can be solved!
The session will explain why seccomp notifiers are a lean solution that avoids extra userspace communication and synchronization, as well as their current limitations and possible future solutions to overcome today’s challenges.
Our experience will teach audiences several methods for dividing their privileged infrastructure, using virtiofsd as a real-world example and as a target application for KubeVirt integration and deployment. In this session, we will discuss the difficulties of using rootless containers, as well as the design patterns, technologies, and tactics we considered and ultimately chose to adopt or reject.
Alice is a Principal Software Engineer with expertise in virtualization, containers, and Kubernetes, and a KubeVirt maintainer. She recently joined the CoreOS team.
I'm on Red Hat's Virtualization and Storage team. I maintain virtiofsd and created podman-bootc, constantly exploring and contributing to projects I find fascinating.