Skip to content

fix(seccomp): add Linux AIO syscalls for UFFD+hugepages snapshot restore#11

Closed
tomassrnka wants to merge 1 commit intomainfrom
fix/seccomp-aio-syscalls
Closed

fix(seccomp): add Linux AIO syscalls for UFFD+hugepages snapshot restore#11
tomassrnka wants to merge 1 commit intomainfrom
fix/seccomp-aio-syscalls

Conversation

@tomassrnka
Copy link
Copy Markdown
Member

Summary

  • Add io_setup, io_destroy, io_submit, io_getevents to the VMM thread seccomp filter on both aarch64 and x86_64

Problem

During UFFD+hugepages snapshot restore on aarch64, the kernel's async I/O subsystem calls io_setup (syscall 0 on aarch64) which is not in the seccomp allowlist. This causes:

  • SIGSYS (signal 31) delivered to the fc_vmm thread
  • Firecracker exits with code 148 (BadSyscall)
  • The issue is intermittent — only occurs during template builds that use UFFD with hugepages

Root Cause

Confirmed via bpftrace on an ARM64 host:

bpftrace -e 'tracepoint:signal:signal_generate
  /args->sig == 31 && comm == "fc_vmm"/ {
    printf("SIGSYS for fc_vmm pid=%d, syscall=%d\n",
      pid, args->errno);
  }'

Output: SIGSYS for fc_vmm pid=XXXX, syscall=0 — syscall 0 on aarch64 is io_setup.

The modern io_uring_* syscalls are already in the filter, but the legacy Linux AIO syscalls (io_setup/io_destroy/io_submit/io_getevents) are missing. Something in the UFFD hugepages code path triggers the legacy AIO API on aarch64.

Changes

Added to vmm thread filter in both aarch64-unknown-linux-musl.json and x86_64-unknown-linux-musl.json:

  • io_setup — initialize AIO context
  • io_destroy — cleanup AIO context
  • io_submit — submit AIO requests
  • io_getevents — wait for AIO completion

Test plan

  • Build Firecracker for aarch64 with updated seccomp filter
  • Run UFFD+hugepages template build on ARM64 — should no longer hit BadSyscall
  • Verify x86_64 builds still work (syscalls added for consistency)

🤖 Generated with Claude Code

Add io_setup, io_destroy, io_submit, and io_getevents to the VMM
thread seccomp filter on both aarch64 and x86_64.

During UFFD+hugepages snapshot restore on aarch64, the kernel's async
I/O subsystem calls io_setup (syscall 0 on aarch64) which is not in
the seccomp allowlist, causing SIGSYS and Firecracker exit code 148
(BadSyscall). This was confirmed via bpftrace:

  bpftrace -e 'tracepoint:signal:signal_generate
    /args->sig == 31 && comm == "fc_vmm"/ {
      printf("SIGSYS for fc_vmm pid=%d, syscall=%d\n",
        pid, args->errno);
    }'

The issue is intermittent and only manifests during template builds
that use UFFD with hugepages for snapshot restore.

Adding the complete set of Linux AIO syscalls (io_setup, io_destroy,
io_submit, io_getevents) to both architectures for consistency, even
though the issue has only been observed on aarch64.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cursor
Copy link
Copy Markdown

cursor Bot commented Apr 16, 2026

PR Summary

Medium Risk
This relaxes the VMM seccomp sandbox by allowing additional kernel AIO syscalls, which slightly increases the attack surface even though the change is narrow and scoped to the VMM filter.

Overview
Prevents intermittent BadSyscall/SIGSYS during UFFD+hugepages snapshot restore by expanding the VMM-thread seccomp allowlist.

Both resources/seccomp/aarch64-unknown-linux-musl.json and resources/seccomp/x86_64-unknown-linux-musl.json now allow the legacy Linux AIO syscalls io_setup, io_destroy, io_submit, and io_getevents (with comments clarifying their snapshot-restore usage).

Reviewed by Cursor Bugbot for commit 7a6837c. Bugbot is set up for automated code reviews on this repo. Configure here.

@tomassrnka tomassrnka closed this Apr 16, 2026
@ValentaTomas ValentaTomas deleted the fix/seccomp-aio-syscalls branch April 30, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant