Appendix A: Command-Line And Inspection Tools

For every tool here the concern is the same: which flags matter for microVM work, what the tool actually measures, and where the numbers come from in the kernel or the process. If you have ever run a KVM_RUN loop and wondered why kvm_stat was showing more exits than you expected, or typed lscpu on a guest and wondered why it reported a different hypervisor than you knew was there, read those entries first.

flowchart TD
    A["/dev/kvm"] --> B["KVM API\n(VM fd, vCPU fd)"]
    B --> C["firecracker / jailer\nqemu-system-x86_64"]
    D["firectl\n(optional wrapper)"] --> C
    B --> E["kvm_stat\n(debugfs or tracepoints)"]
    B --> F["perf kvm\n(perf_event_open)"]
    G["/proc/cpuinfo\n/sys/module/kvm*/parameters"] --> H["lscpu / lsmod\ncpuid"]

firecracker

When you type firecracker on the command line you are starting the microVM monitor. The process runs three thread types: an API thread that runs an HTTP/1.1 server over the Unix-domain socket; a VMM thread that handles device emulation and the microVM metadata service (MMDS); and one vCPU thread per vcpu_count, each executing KVM_RUN in a tight loop. A 1-vCPU VM therefore has three threads, which is the minimum; a VM with N vCPUs has N+2 threads in total. Seccomp filters are applied per thread before the first guest instruction executes: the filter on a vCPU thread allows KVM_RUN but not the broader set of syscalls the API thread requires.

On a production host the jailer wraps firecracker and constructs the chroot, cgroups, and privilege drop before execing into the binary. Running firecracker directly, without the jailer, is appropriate for lab sessions and is the pattern used throughout this book's examples.

The flag that matters most in day-to-day use is --api-sock. The getting-started guide uses /tmp/firecracker.socket as a conventional override; the binary defaults to /run/firecracker.socket. Every subsequent curl --unix-socket call targets the path you give here, so agree on it before the first API call.

--config-file accepts a JSON file containing the complete microVM configuration — boot source, drives, network interfaces, machine config — and causes Firecracker to configure itself before exposing the socket. Combined with --no-api, which disables the HTTP server entirely, these two flags form the "batch mode" path used in production when a higher-level orchestrator owns the VM lifecycle. The two flags are interdependent: --no-api without --config-file will not start.
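A minimal sketch of generating such a configuration file. It assumes the top-level keys mirror the REST endpoint names (boot-source, drives, machine-config), as in Firecracker's sample configuration; the helper name build_vm_config, the drive_id value rootfs, and the file paths are illustrative and not part of Firecracker.

```python
import json


def build_vm_config(kernel, rootfs, vcpus=1, mem_mib=128):
    """Return a dict shaped like a Firecracker --config-file document."""
    return {
        "boot-source": {
            "kernel_image_path": kernel,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off nomodules",
        },
        "drives": [
            {
                "drive_id": "rootfs",  # illustrative identifier
                "path_on_host": rootfs,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


# Serialize the configuration; write this text to the file passed to --config-file.
config_text = json.dumps(
    build_vm_config("./vmlinux", "./rootfs.ext4", vcpus=2, mem_mib=512), indent=2
)
print(config_text)
```

Saving config_text to a file and starting firecracker with --no-api --config-file pointing at it corresponds to the batch-mode path described above.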

--id sets the microVM identifier. It appears in every log line and defaults to anonymous-instance, which is acceptable in a lab but collides if you run multiple VMs in parallel on the same host.

Logging is split across two flags: --log-path names a FIFO or file, and --level selects the verbosity. Valid values for --level are Error, Warning, Info, Debug, and Trace. --show-level and --show-log-origin prepend the level and the Rust source location (module + line number) to each log line. For debugging device emulation failures, --show-log-origin is often the fastest way to find the responsible code path without reading the full source.

The --enable-pci flag switches the virtio transport from the default virtio-mmio to virtio-pci. This was added in v1.13.0 and is not needed for the standard microVM device set (virtio-net, virtio-block, vsock), but matters when a guest driver only speaks PCI.

Two flags appear in process trees managed by the jailer — --start-time-us and --start-time-cpu-us — and should not be set by hand. The jailer injects them to carry wall-clock and CPU-clock timestamps from before the exec, which Firecracker uses to measure the true end-to-end boot latency.

The REST API

Communication with a running Firecracker process is plain HTTP/1.1 over the Unix-domain socket, addressed with curl --unix-socket <path>. Pre-boot configuration endpoints use PUT and return HTTP 204 on success. The two required calls before InstanceStart are /boot-source and /machine-config:

# The firecracker process must have been started with /dev/kvm access.
# The curl calls themselves require no privilege.

curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT "http://localhost/boot-source" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{
    "kernel_image_path": "./vmlinux",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off nomodules"
  }'

curl --unix-socket /tmp/firecracker.socket -i \
  -X PUT "http://localhost/machine-config" \
  -H "Content-Type: application/json" \
  -d '{"vcpu_count": 2, "mem_size_mib": 512}'

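On aarch64, prepend keep_bootcon to boot_args. The boot arguments shown (console=ttyS0 reboot=k panic=1 pci=off nomodules) come from the getting-started guide and reflect the minimal kernel command line for the default virtio-mmio microVM configuration: pci=off skips PCI bus probing, nomodules disables module loading, reboot=k makes the guest reboot through the keyboard-controller reset (which Firecracker treats as a request to shut down), and panic=1 reboots one second after a kernel panic instead of hanging.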
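The same two calls can be issued without curl. The sketch below uses only the Python standard library; UnixHTTPConnection, preboot_requests, and send_all are hypothetical helper names, and the socket path is whatever you passed to --api-sock. It treats any status other than 204 as an error, following the convention stated above.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that dials a Unix-domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def preboot_requests(kernel, boot_args, vcpus, mem_mib):
    """Build the (method, path, body) tuples for the two pre-boot calls."""
    return [
        ("PUT", "/boot-source",
         {"kernel_image_path": kernel, "boot_args": boot_args}),
        ("PUT", "/machine-config",
         {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
    ]


def send_all(socket_path, requests):
    """Send each request in order; raise on anything other than HTTP 204."""
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    for method, path, body in requests:
        conn = UnixHTTPConnection(socket_path)
        try:
            conn.request(method, path, body=json.dumps(body), headers=headers)
            response = conn.getresponse()
            payload = response.read().decode("utf-8", "replace")
            if response.status != 204:
                raise RuntimeError(f"{method} {path} -> {response.status}: {payload}")
        finally:
            conn.close()
```

With a Firecracker process listening, send_all("/tmp/firecracker.socket", preboot_requests("./vmlinux", "console=ttyS0 reboot=k panic=1 pci=off nomodules", 2, 512)) performs the same configuration as the curl commands.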

The full set of pre-boot configuration endpoints:

| Endpoint | Required fields |
| --- | --- |
| PUT /boot-source | kernel_image_path; optional boot_args, initrd_path |
| PUT /drives/{drive_id} | drive_id, is_root_device |
| PUT /machine-config | vcpu_count, mem_size_mib |
| PUT /network-interfaces/{iface_id} | iface_id, host_dev_name |
| PUT /logger | log_path; optional level, show_level, show_log_origin |
| PUT /metrics | metrics_path |
| PUT /mmds/config | MMDS configuration; imds_compat field added in v1.13.0 |
| PUT /actions | action_type: InstanceStart, FlushMetrics, or SendCtrlAltDel |

Post-boot, the most operationally useful endpoint is PATCH /vm with body {"state": "Paused"} or {"state": "Resumed"}, which suspends and resumes the VM without teardown. Snapshots go through PUT /snapshot/create (requiring snapshot_path and mem_file_path, with an optional snapshot_type that defaults to a full snapshot) and PUT /snapshot/load (which accepts backend_type: "File" or "Uffd" in the mem_backend object for userfaultfd-backed restoration). GET /vm/config exports the complete current configuration as JSON.

Performance Guarantees

Firecracker's SPECIFICATION.md commits to two numbers that appear throughout this book. Boot time from InstanceStart to guest /sbin/init is at most 125 ms, measured on M5D.metal and M6G.metal instances with serial console disabled, a minimal kernel, and a minimal root filesystem. Memory overhead per microVM (1 vCPU, 128 MiB RAM) is at most 5 MiB. Firecracker was open-sourced in November 2018.

Supported Kernel Versions

Firecracker's docs/kernel-policy.md guarantees at least two supported host kernels and two supported guest kernels simultaneously. The current table:

| Kernel | Role | Min Firecracker version | Support end |
| --- | --- | --- | --- |
| 5.10 | Host + Guest | v1.0.0 | 2024-01-31 |
| 6.1 | Host | v1.5.0 | 2025-10-12 |
| 6.1 | Guest | v1.9.0 | 2026-09-02 |
| 6.18 | Host | v1.16.0 | 2028-05-28 |

Versions not in this table may work but are not validated in CI.


jailer

The jailer ships alongside firecracker and is the required wrapper for production deployments. Its job is to construct an isolation envelope around the VMM process before any guest code executes: a chroot that hides the host filesystem, cgroups that bound CPU and memory, a uid/gid drop that removes root privilege, and optional network and PID namespace isolation. It then execs into the firecracker binary. By the time the first KVM_RUN executes, the VMM process has no path back to the host filesystem and no privilege it did not enter the chroot with.

Safety note. The jailer modifies the cgroup hierarchy, creates device nodes inside the chroot, and drops privileges with setuid/setgid. Run it as root. The --uid and --gid targets must be a non-root account; using uid 0 defeats the purpose.

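Four flags are required on every invocation: --id, --exec-file, --uid, and --gid. The example below adds optional cgroup, network namespace, and daemonize flags: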

/usr/bin/jailer \
  --id 551e7604-e35c-42b3-b825-416853441234 \
  --exec-file /usr/bin/firecracker \
  --uid 123 --gid 100 \
  --cgroup cpuset.cpus=0-3 \
  --cgroup cpuset.mems=0 \
  --netns /var/run/netns/my_netns \
  --daemonize

--id must be alphanumeric plus hyphens, maximum 64 characters. It becomes the cgroup name and the subdirectory under --chroot-base-dir. The --exec-file path must point to a statically linked Firecracker binary; the musl toolchain produces the expected artifact. --uid and --gid are numeric and become the drop targets.

The chroot lands at <chroot-base-dir>/<exec_file_name>/<id>/root. The default --chroot-base-dir is /srv/jailer. The jailer copies the binary into the chroot, creates /dev/net/tun and /dev/kvm device nodes inside it, and changes ownership of all resources to uid:gid before executing.
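To make the path rule concrete, here is a small sketch that computes the jail root from the same inputs the jailer receives and applies the --id constraint described above. The function name jail_root is hypothetical; the jailer performs its own validation and this is only an illustration of the layout.

```python
import os
import re

# --id: alphanumeric characters and hyphens, at most 64 characters.
ID_PATTERN = re.compile(r"[A-Za-z0-9-]{1,64}")


def jail_root(exec_file, vm_id, chroot_base_dir="/srv/jailer"):
    """Return <chroot-base-dir>/<exec_file_name>/<id>/root for a jailed VM."""
    if not ID_PATTERN.fullmatch(vm_id):
        raise ValueError(f"invalid jailer id: {vm_id!r}")
    exec_name = os.path.basename(exec_file)
    return os.path.join(chroot_base_dir, exec_name, vm_id, "root")


print(jail_root("/usr/bin/firecracker", "551e7604-e35c-42b3-b825-416853441234"))
# /srv/jailer/firecracker/551e7604-e35c-42b3-b825-416853441234/root
```

Knowing this path in advance is what lets an orchestrator place the kernel image and root filesystem inside the jail before the VM is configured.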

--cgroup is repeatable and writes a key=value pair into the appropriate cgroup file. --cgroup-version selects between the v1 and v2 hierarchy (default v1). --parent-cgroup overrides the parent cgroup path (default: the exec file name).

--netns takes the path to an existing network namespace handle (for example a file under /var/run/netns) and causes the jailer to join that namespace before exec. --new-pid-ns wraps the Firecracker process in a new PID namespace via CLONE_NEWPID and writes the child PID into a file named <exec_file_name>.pid in the chroot. These two flags are independent and can be combined.

--daemonize calls setsid() and redirects stdin/stdout/stderr to /dev/null.

--resource-limit sets setrlimit(2) values. The default is no-file=2048; fsize (maximum file size in bytes) is also supported. Pass additional invocations of the flag to set multiple limits.

Any flags placed after -- on the jailer command line are appended verbatim to the firecracker argv. This is how you pass --level Debug or a custom --seccomp-filter without the jailer interpreting them.

The Fifteen-Step Execution Sequence

The jailer's execution sequence matters because it determines what is visible at each isolation boundary. In order:

  1. Validate paths and VM ID.
  2. Close all inherited file descriptors except stdin, stdout, and stderr.
  3. Clear environment variables.
  4. Create the chroot directory at <base>/<exec_file_name>/<id>/root.
  5. Copy the Firecracker binary into the chroot.
  6. Apply setrlimit(2) resource limits.
  7. Create cgroup subdirectories and write parameter values.
  8. Enter a new mount namespace; pivot_root(2) to the chroot.
  9. Create /dev/net/tun and /dev/kvm device nodes inside the chroot.
  10. Change ownership of all resources to uid:gid.
  11. Join the network namespace if --netns was given.
  12. Daemonize if --daemonize was given.
  13. Clone into a new PID namespace if --new-pid-ns was given.
  14. Drop privileges to uid:gid.
  15. exec(2) into the Firecracker binary.

The sequence is not arbitrary. Device nodes are created at step 9, after pivot_root has locked the process into the chroot but before privileges are dropped at step 14, because creating device nodes requires elevated privilege and must target the new root. The network namespace join happens at step 11, after device nodes, so that tun is visible in the namespace the process actually runs in.


firectl

firectl is a Go tool built on firecracker-go-sdk that wraps both firecracker and, optionally, jailer. Its flag parsing uses github.com/jessevdk/go-flags. It translates a single command line into the sequence of REST API calls that configure and start a VM, making it useful for quick lab sessions where you want to avoid writing the curl sequence by hand. Production deployments typically drive the REST API directly or through a dedicated orchestrator; firectl is an inspection and experimentation tool.

A minimal invocation:

firectl \
  --kernel=./vmlinux \
  --root-drive=./rootfs.ext4 \
  --ncpus=2 --memory=512 \
  --tap-device=tap0/AA:FC:00:00:00:01

The flags that determine the VM configuration map directly onto the REST API fields above. --kernel corresponds to kernel_image_path in /boot-source, defaulting to ./vmlinux. --kernel-opts sets boot_args; the default is ro console=ttyS0 noapic reboot=k panic=1 pci=off nomodules. --root-drive is required; the optional :ro or :rw suffix controls the drive's is_read_only field. Additional drives use --add-drive, which is repeatable.

Networking uses --tap-device in the form <device>/<mac>; the flag is repeatable for multiple interfaces. Vsock uses --vsock-device <path>:<cid>.

-c / --ncpus sets vcpu_count (default 1). -m / --memory sets mem_size_mib (default 512). --cpu-template accepts C3 or T2, which are x86-only (Intel) static CPU templates that Firecracker applies through the /machine-config endpoint to produce a consistent CPUID view for snapshot portability. From v1.13.0 onward, custom CPU templates are also available via the REST API's cpu_config key.

Log handling: --vmm-log-fifo names a FIFO for VMM log output, --log-level sets the level (default Debug), and -l / --firecracker-log redirects the FIFO's contents to a file. --metrics-fifo names a FIFO for the Firecracker metrics JSON.

--metadata accepts a JSON blob passed to MMDS, equivalent to configuring and populating the metadata service via the REST API.

Jailer integration is opt-in: --jailer specifies the jailer binary path, and --uid, --gid, --id, --node (NUMA node), --chroot-base-dir, and --daemonize pass through to the jailer. Without --jailer, firectl invokes firecracker directly.


qemu-system-x86_64 with -M microvm

The microvm machine type was introduced in QEMU 4.2.0, released 2019-12-13. Its design statement in the QEMU documentation is precise: "microvm is a machine type inspired by Firecracker and constructed after its machine model." Like Firecracker, it has no PCI bus by default, no ACPI, and no device hotplug. It does not support live migration across QEMU versions.

The book reaches for QEMU when comparing Firecracker design choices or running inspection experiments that need a standard Linux virtio stack — for example, reading info virtio-status through the QEMU monitor to observe what Firecracker hides behind its device emulation layer.

Machine Type And Transport

The microvm machine type uses virtio-mmio exclusively. The source file hw/i386/microvm.c configures a maximum of eight virtio-mmio transports by default. An ISA bus is present; the legacy devices on it (the i8259 PIC, i8254 PIT, MC146818 RTC, and ISA serial port) are each conditionally instantiated through machine properties. kvmclock and fw_cfg are supported. The default BIOS is qboot, chosen for reduced boot time; SeaBIOS is also compatible. No firmware currently supports booting from a virtio-mmio block device, so the guest kernel must always be supplied as an image file on the host via -kernel.

Safety note. -enable-kvm requires access to /dev/kvm. Run as a user in the kvm group or as root. On hosts where the KVM module is not loaded, omit -enable-kvm and replace -cpu host with an emulated model such as -cpu max (-cpu host needs a hardware accelerator); expect roughly a ten- to hundred-fold increase in boot time due to software emulation.

Machine-specific options are passed as comma-separated key=value pairs after -M microvm:

| Option | Effect |
| --- | --- |
| x-option-roms=off | Disable option ROM loading |
| pit=off | Disable i8254 PIT |
| pic=off | Disable i8259 PIC |
| rtc=off | Disable MC146818 RTC |
| isa-serial=off | Disable ISA serial port |
| auto-kernel-cmdline=on | Auto-append virtio-mmio entries to kernel cmdline |

Canonical Invocations

The standard configuration uses the ISA serial port for the console:

# Requires /dev/kvm access.
qemu-system-x86_64 -M microvm \
  -enable-kvm -cpu host -m 512m -smp 2 \
  -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/vda" \
  -nodefaults -no-user-config -nographic \
  -serial stdio \
  -drive id=test,file=test.img,format=raw,if=none \
  -device virtio-blk-device,drive=test \
  -netdev tap,id=tap0,script=no,downscript=no \
  -device virtio-net-device,netdev=tap0

The minimal-footprint configuration disables all legacy ISA devices and uses a virtio-serial console instead:

# Requires /dev/kvm access.
qemu-system-x86_64 \
  -M microvm,x-option-roms=off,pit=off,pic=off,isa-serial=off,rtc=off \
  -enable-kvm -cpu host -m 512m -smp 2 \
  -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
  -nodefaults -no-user-config -nographic \
  -chardev stdio,id=virtiocon0 \
  -device virtio-serial-device \
  -device virtconsole,chardev=virtiocon0 \
  -drive id=test,file=test.img,format=raw,if=none \
  -device virtio-blk-device,drive=test \
  -netdev tap,id=tap0,script=no,downscript=no \
  -device virtio-net-device,netdev=tap0

-nodefaults suppresses the default set of devices QEMU would otherwise create (VGA, sound, USB) — essential for a microVM profile. -no-user-config prevents QEMU from reading per-user configuration files. -cpu host passes through the host's CPU feature flags, enabling the guest to see the same virtualization feature bits that lscpu and cpuid show on the host.

HMP: The Human Monitor Protocol

QEMU can multiplex the serial console and an interactive monitor over the same terminal using the mux chardev form. The canonical invocation above uses -serial stdio, which does not enable mux mode; to get Ctrl-a switching between the monitor and console, use -serial mon:stdio instead. The escape key sequences:

| Sequence | Effect |
| --- | --- |
| Ctrl-a c | Switch between monitor and console (requires -serial mon:stdio) |
| Ctrl-a h | Show help |
| Ctrl-a x | Kill the emulator |
| Ctrl-a s | Sync disk (snapshot mode) |
| Ctrl-a b | Send break / magic SysRq |
| Ctrl-a Ctrl-a | Send literal Ctrl-a to guest |

You can also attach the monitor to a separate channel: -monitor unix:/path,server,nowait or -monitor telnet::4444,server,nowait.

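The HMP info subcommands most useful for KVM and virtio inspection include info kvm, info cpus, info registers, info qtree, info mtree, and (on QEMU builds that provide it) info virtio-status.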

System control: quit, stop, cont, system_reset, system_powerdown.

QMP: The QEMU Machine Protocol

QMP uses a JSON wire format: UTF-8 in, ASCII out, objects terminated by CRLF. Enable it with -qmp unix:/path/to/sock,server,nowait or -qmp tcp:127.0.0.1:4444,server,nowait. On connect, QEMU sends a greeting:

{"QMP": {"version": {"qemu": {"micro": 0, "minor": 0, "major": 3}, "package": "v3.0.0"}, "capabilities": ["oob"]}}

The client must issue {"execute": "qmp_capabilities"} before any other command; until it does, every other command returns CommandNotFound. The oob capability in the greeting advertises that the server supports out-of-band execution, which bypasses the normal command queue; the client opts in by listing oob in the enable argument of qmp_capabilities. QMP is the interface used by libvirt and other automation layers; for one-off inspection in a lab session, HMP is faster.
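The handshake is short enough to sketch with the standard library. The helper names below (qmp_command, read_reply, negotiate, query_status) are hypothetical; the sketch assumes QEMU was started with a -qmp unix:... socket, sends query-status as an example command after negotiation, and skips asynchronous event messages while waiting for a reply.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Encode one QMP command as a CRLF-terminated JSON object."""
    message = {"execute": name}
    if arguments is not None:
        message["arguments"] = arguments
    return (json.dumps(message) + "\r\n").encode("utf-8")


def read_reply(stream):
    """Return the next message that is not an asynchronous event."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("QMP peer closed the connection")
        message = json.loads(line)
        if "event" not in message:
            return message


def negotiate(stream):
    """Read the greeting, leave capabilities negotiation mode, return greeting data."""
    greeting = read_reply(stream)
    if "QMP" not in greeting:
        raise ValueError("expected a QMP greeting")
    stream.write(qmp_command("qmp_capabilities"))
    stream.flush()
    reply = read_reply(stream)
    if "return" not in reply:
        raise RuntimeError(f"capability negotiation failed: {reply}")
    return greeting["QMP"]


def query_status(socket_path):
    """Connect to a QMP Unix socket and return the result of query-status."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        with sock.makefile("rwb") as stream:
            negotiate(stream)
            stream.write(qmp_command("query-status"))
            stream.flush()
            return read_reply(stream)["return"]
```

Calling query_status with the path given to -qmp returns the VM run state as a dict; the same pattern extends to any other command once negotiate has succeeded.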


kvm_stat

When a VM exit occurs, the hardware stops the guest and hands control to KVM. KVM classifies the exit by reason — CPUID, HLT, IO_INSTRUCTION, EPT_VIOLATION, and so on — and increments a per-reason counter. kvm_stat reads those counters and displays a rolling table of exit counts, one row per exit reason, refreshed at an adjustable interval. It is, as its name suggests, vmstat for VM exits.

The tool is a Python 3 script, shipped at tools/kvm/kvm_stat/kvm_stat in the Linux kernel source tree. Distributions package it separately; on Ubuntu, the package is linux-tools-host (present since at least kernel version 4.15.0-19.20). Do not look for it in the QEMU tree — it moved into the kernel tree.

Safety note. Both of kvm_stat's data source modes require elevated privilege. Debugfs mode requires mounting debugfs and reading files under /sys/kernel/debug/kvm/, which requires root or CAP_SYS_ADMIN. Tracepoint mode calls perf_event_open(2) with PERF_TYPE_TRACEPOINT, which also requires elevated privilege on most kernel configurations.

Data Source Modes

kvm_stat has two modes for reading exit counters, selectable by flag:

Debugfs mode (-d / --debugfs) reads pseudo-files under /sys/kernel/debug/kvm/. Each running VM gets a subdirectory named {pid}-{fd}, where pid is the VMM process ID and fd is the file descriptor number of the KVM VM fd returned by KVM_CREATE_VM. Each counter (for example exits, halt_exits, io_exits, and mmio_exits) is a file in that directory; reading it returns a cumulative count. Debugfs must be mounted:

# Requires root.
mount -t debugfs none /sys/kernel/debug

Tracepoint mode (-t / --tracepoints) reads from kernel tracepoints under /sys/kernel/debug/tracing/events/kvm/ using perf_event_open(2) with PERF_TYPE_TRACEPOINT and PERF_FORMAT_GROUP. This mode sees exits from all VMs on the host simultaneously and does not require per-VM fd knowledge, but it cannot filter to a single VM by fd — only by guest name (-g) or PID (-p).
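For debugfs mode, the per-VM layout can be read directly. The sketch below walks the {pid}-{fd} directories and parses each counter file as an integer; read_vm_counters is a hypothetical helper, not part of kvm_stat, and it needs the same privilege as kvm_stat -d when pointed at the real debugfs mount.

```python
import os
import re

# Per-VM directories are named {pid}-{fd}.
VM_DIR = re.compile(r"(\d+)-(\d+)")


def read_vm_counters(root="/sys/kernel/debug/kvm"):
    """Map (pid, fd) to a dict of counter name -> cumulative count."""
    result = {}
    for entry in sorted(os.listdir(root)):
        match = VM_DIR.fullmatch(entry)
        vm_path = os.path.join(root, entry)
        if not match or not os.path.isdir(vm_path):
            continue  # host-wide counter files live next to the VM directories
        counters = {}
        for name in os.listdir(vm_path):
            file_path = os.path.join(vm_path, name)
            if not os.path.isfile(file_path):
                continue  # skip any subdirectories inside the VM directory
            try:
                with open(file_path) as handle:
                    counters[name] = int(handle.read().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-numeric file
        result[(int(match[1]), int(match[2]))] = counters
    return result
```

Sampling read_vm_counters() twice and subtracting the values gives per-interval rates, which is the arithmetic kvm_stat performs for its rolling display.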

Key Flags

-1 / --once / --batch produces a single-pass non-interactive output, suitable for piping into scripts. -f <regex> / --fields filters the displayed counters by regex — useful when you want to watch only EXIT_REASON_EPT_VIOLATION rows without the noise of CPUID and HLT. -p <pid> restricts to one VM by PID. -s / --set-delay adjusts the refresh interval (0.1–25.5 seconds). -z / --skip-zero-records suppresses all-zero rows in log mode.

Exit Reason Dictionaries

kvm_stat ships exit reason tables for VMX (Intel), SVM (AMD), and AArch64 inside the Python source. Selected entries from each:

VMX (Intel) — from VMX_EXIT_REASONS in the script:

| Code | Name |
| --- | --- |
| 0 | EXCEPTION_NMI |
| 1 | EXTERNAL_INTERRUPT |
| 2 | TRIPLE_FAULT |
| 10 | CPUID |
| 12 | HLT |
| 75 | NOTIFY |

The NOTIFY exit (code 75) is the newest entry and signals a notify VM exit, which the CPU raises when the guest has run for a configured window without reaching a point where it can be interrupted. It lets the host regain control from a guest that would otherwise never cause a conventional exit.

SVM (AMD) — from SVM_EXIT_REASONS in the script:

| Code (hex) | Name |
| --- | --- |
| 0x000 | READ_CR0 |
| 0x010 | WRITE_CR0 |
| 0x400 | NPF (Nested Page Fault) |
| 0x401 | AVIC_INCOMPLETE_IPI |
| 0x402 | AVIC_UNACCELERATED_ACCESS |
| 0x403 | VMGEXIT |

The NPF exit (0x400) is the AMD SVM equivalent of Intel's EPT_VIOLATION. VMGEXIT (0x403) appears when SEV-ES guests execute the VMGEXIT instruction to communicate with the hypervisor — an exit reason you will not see on a non-confidential microVM workload.


perf kvm

perf kvm is a perf subcommand for KVM-specific profiling. Where kvm_stat shows aggregate exit counts, perf kvm stat shows exit latency: how long each exit reason keeps the vCPU off the guest, broken down by min, max, and mean across all samples.

The subcommand has six sub-subcommands: top, record, report, diff, buildid-list, and stat. The stat variant further divides into stat record, stat report, and stat live. In practice, the workflow is either two steps (stat record to a file, then stat report from it) or a single stat live for an immediate rolling view.

Safety note. perf kvm calls perf_event_open(2) and reads from kernel tracepoints. It requires root or CAP_PERFMON (Linux 5.8+) / CAP_SYS_ADMIN on older kernels. The --guest flag additionally requires debugfs access for guest symbol resolution.

perf kvm stat

On x86, the --event= flag selects the exit type to analyze. The supported values are vmexit (the default, supported on all architectures), mmio (x86 only), and ioport (x86 only). For a microVM workload on an x86 host, vmexit is the starting point; ioport narrows to the I/O port exits that Firecracker sees during early boot.

# Requires root or CAP_PERFMON.
perf kvm stat record -a -- sleep 30
perf kvm stat report --event=vmexit

The report output columns are sample, percent_sample, time, percent_time, max_t, min_t, and mean_t. The --key flag sorts by any of those column names (sample is the default). --vcpu=<n> restricts analysis to a specific vCPU index. --duration=<value>, which the perf-kvm man page lists among the live-mode options, shows only exits that took longer than the given threshold in microseconds; it is useful for isolating pathological exits from the background noise of CPUID and HLT.

Live mode skips the file:

# Requires root or CAP_PERFMON.
perf kvm stat live --event=vmexit

Live mode supports only two sort keys: sample (default) and time.

Recording With Guest Symbols

When a VM exit lands in a guest kernel function, perf kvm can resolve the exit's originating address to a symbol name if you supply the guest's kallsyms and module table. Copy /proc/kallsyms and /proc/modules out of the guest to the host (the paths below are examples), then record on the host with:

# Requires root on the host; guest files must be available on the host.
perf kvm --host --guest \
  --guestkallsyms=/tmp/guest.kallsyms \
  --guestmodules=/tmp/guest.modules \
  record -a

Terminate the recording with SIGINT only — other signals corrupt the data file.

Underlying Tracepoints

perf kvm stat consumes kernel tracepoints in the kvm: subsystem, visible under /sys/kernel/debug/tracing/events/kvm/. The set includes kvm:kvm_entry, kvm:kvm_exit, kvm:kvm_hypercall, kvm:kvm_pio, kvm:kvm_cpuid, kvm:kvm_apic, kvm:kvm_inj_virq, kvm:kvm_page_fault, kvm:kvm_msr, kvm:kvm_cr, and kvm:kvm_mmio. These are the same tracepoints kvm_stat -t reads. To count all of them system-wide for a fixed interval without filtering:

# Requires root or CAP_PERFMON.
sudo perf stat -e 'kvm:*' -a sleep 1h

cpuid

The cpuid instruction is how software asks the CPU what it is and what it can do. A guest running under KVM gets a virtualized view: KVM intercepts cpuid exits (exit reason 10 in the VMX table above) and synthesizes responses according to what Firecracker or QEMU has configured. The cpuid userspace utility (available as the cpuid package on most distributions) issues the instruction directly from ring 3 and prints the decoded result, or the raw register values with -r. It is useful for two things in microVM work: confirming that hardware virtualization extensions are present on the host, and reading the KVM hypervisor vendor leaf to confirm what the guest sees.

The most useful invocation:

cpuid -l 0x40000000

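This reads leaf 0x40000000, the first leaf of the hypervisor range. Physical CPUs define nothing in this range, and what they return for it is vendor-dependent, so trust the result only when the hypervisor-present bit (CPUID.01H:ECX[31]) is set; all major hypervisors respond here with their vendor string.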

Feature Bits On The Host

| Leaf | Register | Bit | Meaning |
| --- | --- | --- | --- |
| 0x1 | ECX | 5 | CPUID.01H:ECX[5]: Intel VMX (VT-x) supported |
| 0x1 | ECX | 31 | Hypervisor present; physical CPUs always return 0; all major hypervisors set it to 1 |
| 0x80000001 | ECX | 2 | CPUID.80000001H:ECX[2]: AMD SVM; edk2 constant name SVM |

The KVM Hypervisor CPUID Leaves

The range 0x40000000–0x4FFFFFFF is reserved for hypervisors by convention. KVM occupies two leaves:

Leaf 0x40000000 — KVM_CPUID_SIGNATURE: EAX returns the maximum hypervisor leaf (typically 0x40000001; old hosts may return 0x0, which should be treated as 0x40000001). EBX, ECX, and EDX together encode the 12-character ASCII vendor string. For KVM:

| Register | Value | Meaning |
| --- | --- | --- |
| EBX | 0x4b4d564b | Bytes 0–3: KVMK |
| ECX | 0x564b4d56 | Bytes 4–7: VMKV |
| EDX | 0x4d | Bytes 8–11: M\0\0\0 |

Concatenated: "KVMKVMKVM\0\0\0". For comparison, Hyper-V returns "Microsoft Hv", VMware returns "VMwareVMware", and Xen returns "XenVMMXenVMM". If you run cpuid -l 0x40000000 inside a Firecracker guest and see this string, KVM is confirmed as the underlying hypervisor regardless of what Firecracker presents as the paravirt interface.
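The byte order is easy to get wrong by hand, so here is a short sketch of the decoding: each register contributes four bytes in little-endian order, and trailing NUL padding is stripped. The function name hypervisor_vendor is hypothetical; the register values are the ones from the table above.

```python
import struct


def hypervisor_vendor(ebx, ecx, edx):
    """Decode the 12-byte vendor string from leaf 0x40000000 registers."""
    raw = struct.pack("<III", ebx, ecx, edx)  # little-endian, EBX first
    return raw.rstrip(b"\x00").decode("ascii")


print(hypervisor_vendor(0x4B4D564B, 0x564B4D56, 0x0000004D))  # KVMKVMKVM
```

The same function decodes the other vendors' strings unchanged, since they use the same three-register layout.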

Leaf 0x40000001 — KVM_CPUID_FEATURES: EAX bits advertise KVM paravirt features. The ones that appear in microVM work:

| Bit | Name | Meaning |
| --- | --- | --- |
| 0 | KVM_FEATURE_CLOCKSOURCE | kvmclock via MSRs 0x11, 0x12 |
| 3 | KVM_FEATURE_CLOCKSOURCE2 | kvmclock via MSRs 0x4b564d00, 0x4b564d01 |
| 5 | KVM_FEATURE_STEAL_TIME | steal time accounting via MSR 0x4b564d03 |
| 24 | KVM_FEATURE_CLOCKSOURCE_STABLE_BIT | clocksource is stable across CPUs |

EDX bit 0 is KVM_HINTS_REALTIME, which signals that vCPUs will not be preempted indefinitely — the hint a paravirt scheduler would act on to avoid unnecessary yield calls.
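A sketch of turning the raw EAX and EDX values from leaf 0x40000001 into names. It covers only the bits listed in this section, so any other set bit is silently ignored; decode_kvm_features is a hypothetical helper.

```python
# Only the EAX feature bits discussed in this section.
KVM_FEATURE_BITS = {
    0: "KVM_FEATURE_CLOCKSOURCE",
    3: "KVM_FEATURE_CLOCKSOURCE2",
    5: "KVM_FEATURE_STEAL_TIME",
    24: "KVM_FEATURE_CLOCKSOURCE_STABLE_BIT",
}
KVM_HINTS_REALTIME = 1 << 0  # EDX bit 0


def decode_kvm_features(eax, edx=0):
    """Return the names of the known feature and hint bits that are set."""
    names = [
        name
        for bit, name in sorted(KVM_FEATURE_BITS.items())
        if eax & (1 << bit)
    ]
    if edx & KVM_HINTS_REALTIME:
        names.append("KVM_HINTS_REALTIME")
    return names


# 0x01000029 has bits 0, 3, 5, and 24 set.
print(decode_kvm_features(0x01000029, edx=0x1))
```

Feeding it the EAX and EDX values printed by cpuid -l 0x40000001 -r inside a guest shows which paravirt clock and steal-time features the VMM exposed.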


lscpu

lscpu is part of util-linux. It reads /proc/cpuinfo — not the CPUID instruction directly — to populate most of its output. Two fields are relevant for microVM work.

The Virtualization field shows VT-x or AMD-V. The implementation in sys-utils/lscpu-virt.c, in the function lscpu_read_virtualization(), scans the flags: line in /proc/cpuinfo for the substrings " vmx " and " svm ". Finding vmx prints VT-x; finding svm prints AMD-V. This means lscpu is reporting what the kernel advertises in /proc/cpuinfo, not what the CPUID instruction returned at boot — a distinction that matters inside a guest, where the kernel may have been given a modified CPUID view.
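The scan described above is simple to reproduce. This is a Python paraphrase of the logic, not the util-linux source: it looks at the first flags: line and matches whole tokens, which is equivalent to searching for the space-delimited substrings. The function names are hypothetical.

```python
def virtualization_from_cpuinfo(cpuinfo_text):
    """Return "VT-x", "AMD-V", or None based on the first flags: line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or key.strip() != "flags":
            continue
        flags = value.split()  # whole-token match, like " vmx " / " svm "
        if "vmx" in flags:
            return "VT-x"
        if "svm" in flags:
            return "AMD-V"
        return None
    return None


def host_virtualization(path="/proc/cpuinfo"):
    """Apply the same scan to the running kernel's /proc/cpuinfo."""
    with open(path) as handle:
        return virtualization_from_cpuinfo(handle.read())
```

Running host_virtualization() inside a guest that was not given the vmx or svm flag returns None, which matches the empty Virtualization field lscpu shows there.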

The Hypervisor vendor field is populated by read_hypervisor_cpuid() in the same file. That function queries CPUID leaf 0x40000000 and matches the 12-byte vendor string against the table described above: "KVMKVMKVM" maps to VIRT_VENDOR_KVM, "XenVMMXenVMM" to VIRT_VENDOR_XEN, "Microsoft Hv" to VIRT_VENDOR_MSHV, and "VMwareVMware" to VIRT_VENDOR_VMWARE. DMI tables are checked via read_hypervisor_dmi() as a fallback. Running lscpu inside a Firecracker guest will therefore show KVM as the hypervisor vendor, which is accurate — Firecracker is a VMM on top of KVM, not a standalone hypervisor.


lsmod And KVM Module Parameters

Three kernel modules implement KVM on x86: kvm (the common base, always required), kvm_intel (Intel VMX), and kvm_amd (AMD SVM). On a bare-metal host, exactly one of kvm_intel or kvm_amd is loaded alongside kvm. Inside a VM, both kvm and the vendor module may be absent if the guest kernel was not compiled with KVM support, or both may be present if nested virtualization is enabled.

lsmod | grep kvm

On an Intel host running Firecracker, the expected output is two lines: kvm_intel and kvm, with kvm_intel named in the Used by column of the kvm line because it depends on the base module. The use count on the kvm_intel line rises while VMM processes hold KVM file descriptors open, so on a typical host a nonzero count there indicates running VMs or processes that still have a KVM fd open.

Module Parameters

Module parameters are exposed at /sys/module/<module>/parameters/ and can be enumerated with modinfo -p kvm_intel or modinfo -p kvm_amd.

Safety note. Changing module parameters at runtime or via /etc/modprobe.d/ affects all VMs on the host. On a production host, treat these as host-wide configuration, not per-VM settings.

The parameters that affect microVM behavior:

/sys/module/kvm_intel/parameters/:

| Parameter | Description |
| --- | --- |
| nested | Enable nested VMX. Reads Y when enabled. |
| ept | Extended Page Tables (Intel SLAT); enabled by default. |
| enable_apicv | APIC virtualization acceleration. |
| enable_shadow_vmcs | Shadow VMCS support. |

/sys/module/kvm_amd/parameters/:

| Parameter | Description |
| --- | --- |
| nested | Enable nested SVM. Reads 1 when enabled. |
| npt | Nested Page Tables (AMD SLAT); default 1 (enabled) on 64-bit and 32-bit PAE. |

To persist a change across reboots, write to /etc/modprobe.d/:

# Intel nested VMX (use a separate file per module to avoid overwriting):
echo "options kvm_intel nested=1" > /etc/modprobe.d/kvm-intel.conf
# AMD nested SVM:
echo "options kvm_amd nested=1"   > /etc/modprobe.d/kvm-amd.conf

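Not every parameter can be changed after the module is loaded. Check the file mode with ls -l /sys/module/<module>/parameters/: a parameter whose file is read-only, such as nested, takes effect only when the module is reloaded.

# Requires root. Stop all VMs first; the module cannot be unloaded while in use.
modprobe -r kvm_intel
modprobe kvm_intel nested=1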

Confirm nested VMX is active after loading: cat /sys/module/kvm_intel/parameters/nested should print Y. The AMD equivalent path is /sys/module/kvm_amd/parameters/nested and prints 1 when enabled.
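The same check can be scripted. The sketch below reads every file under a module's parameters directory and interprets nested as enabled when it reads Y or 1, covering both spellings mentioned above. The helper names are hypothetical, and the sysfs_root argument exists only so the function can be pointed at a test directory.

```python
import os


def read_module_parameters(module, sysfs_root="/sys/module"):
    """Return {parameter: value} for a module, or None if it is not present."""
    params_dir = os.path.join(sysfs_root, module, "parameters")
    try:
        names = sorted(os.listdir(params_dir))
    except FileNotFoundError:
        return None  # module not loaded, or it exposes no parameters
    values = {}
    for name in names:
        try:
            with open(os.path.join(params_dir, name)) as handle:
                values[name] = handle.read().strip()
        except OSError:
            values[name] = None  # some parameters are not readable
    return values


def nested_enabled(module, sysfs_root="/sys/module"):
    """True/False if the module is present, None if it is not."""
    params = read_module_parameters(module, sysfs_root)
    if params is None:
        return None
    return params.get("nested") in ("Y", "1")
```

On an Intel host, nested_enabled("kvm_intel") returning True and nested_enabled("kvm_amd") returning None is the expected combination.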


/proc And /sys Files That Expose KVM State

These pseudo-files are the lowest-level inspection surface. No additional tooling is required; all of them are readable (or writable, with appropriate privilege) from a shell.

| Path | What it shows |
| --- | --- |
| /proc/cpuinfo flag vmx | Intel VT-x capable CPU |
| /proc/cpuinfo flag svm | AMD-V capable CPU |
| /proc/cpuinfo flag hypervisor | Running inside a hypervisor; corresponds to CPUID.01H:ECX[31] |
| /dev/kvm | Character device opened by the VMM to create VMs; presence confirms the KVM module is loaded |
| /sys/kernel/debug/kvm/ | Per-VM debugfs directory; each VM appears as {pid}-{fd} |
| /sys/kernel/debug/tracing/events/kvm/ | KVM tracepoint event definitions; backing store for kvm_stat -t and perf kvm |
| /sys/module/kvm/parameters/ | Core KVM module parameters |
| /sys/module/kvm_intel/parameters/ | Intel VMX module parameters |
| /sys/module/kvm_amd/parameters/ | AMD SVM module parameters |

The hypervisor flag in /proc/cpuinfo corresponds directly to CPUID.01H:ECX[31]. Physical CPUs always return 0 at that bit. Every major hypervisor — KVM, Hyper-V, VMware, Xen — sets it to 1 so that guest software can detect the environment without knowing the specific vendor. A guest kernel uses this bit to decide whether to activate paravirt clock sources, balloon drivers, and other host-cooperative features.

/dev/kvm is a character device, typically mode 0660 and owned root:kvm (the exact mode and group come from the distribution's udev rules). It is registered as a misc device, so its major number is 10 (listed as misc in /proc/devices) and its minor number is 232 (listed as kvm in /proc/misc). Any process in the kvm group can open it. The file's presence confirms the KVM module is loaded; its absence means either the module was not loaded or the CPU lacks hardware virtualization support, and dmesg | grep kvm will distinguish the two.
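A sketch of inspecting the device node from a script, using only os.stat. It reports the node type, permission bits, ownership, device numbers, and whether the calling process could open it read-write; describe_device is a hypothetical helper, and it returns None when the node does not exist.

```python
import os
import stat


def describe_device(path="/dev/kvm"):
    """Return basic facts about a device node, or None if it is missing."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return None
    return {
        "is_char_device": stat.S_ISCHR(info.st_mode),
        "mode": oct(stat.S_IMODE(info.st_mode)),
        "uid": info.st_uid,
        "gid": info.st_gid,
        "major": os.major(info.st_rdev),
        "minor": os.minor(info.st_rdev),
        # True only if this process could open the node read-write.
        "accessible": os.access(path, os.R_OK | os.W_OK),
    }
```

A result of None points at the module or CPU support; a result with accessible set to False points at group membership or permissions instead.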

Safety note. /sys/kernel/debug/ requires either root or CAP_SYS_ADMIN, and requires debugfs to be mounted (mount -t debugfs none /sys/kernel/debug). On hosts where debugfs is mounted with restrictive uid, gid, or mode options, per-VM directories under /sys/kernel/debug/kvm/ may not be visible to non-root users even if /dev/kvm is accessible. All commands reading these paths in the book's lab examples should be run inside the lab VM or as root on the host, not as an unprivileged user on a development workstation.


Sources And Further Reading